Table 3.

Cutoff-Varying Performance Metrics: Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value, Likelihood Ratios, and Diagnostic Odds Ratio

Model	Threshold	Performance Metric Estimate, % (95% CI^a)				Performance Metric Estimate (95% CI^a)
Model	Threshold	Sensitivity	Specificity	PPV	NPV	LR+	LR−	DOR
External validation: primary care data set
NoMicro/XGB	Best	72.7 (64.8-80.5)	82.8 (78.8-86.9)	61.2 (55.3-67.7)	89.1 (86.2-92.0)	4.24 (3.32-5.62)	0.33 (0.24-0.43)	12.8 (8.1-21.5)
NoMicro/RF	Best	78.9 (71.9-85.2)	81.4 (77.6-85.5)	61.2 (56.0-67.3)	91.2 (88.4-93.8)	4.24 (3.42-5.53)	0.26 (0.18-0.35)	16.4 (10.2-28.4)
NoMicro/ANN	Best	78.1 (71.1-85.2)	78.2 (73.5-82.6)	57.1 (51.8-62.7)	90.6 (87.8-93.3)	3.58 (2.89-4.52)	0.28 (0.19-0.37)	12.8 (8.3-21.5)
NoMicro/XGB	Sen85	85.2 (78.9-90.6)	62.8 (57.6-68.0)	46.0 (42.5-50.0)	91.9 (88.9-95.0)	2.29 (1.99-2.69)	0.24 (0.14-0.34)	9.7 (6.1-17.9)
NoMicro/RF	Sen85	85.2 (78.9-90.6)	66.0 (60.8-70.9)	48.2 (44.1-52.6)	92.3 (89.1-95.1)	2.50 (2.12-2.98)	0.23 (0.14-0.33)	11.1 (6.6-20.0)
NoMicro/ANN	Sen85	85.2 (78.9-90.6)	59.6 (54.1-64.5)	44.0 (40.3-47.7)	91.5 (88.1-94.7)	2.11 (1.82-2.45)	0.25 (0.15-0.36)	8.5 (5.1-15.5)
Internal validation: emergency department data set
NoMicro/XGB	Best	80.0 (78.7-81.3)	76.3 (75.6-77.1)	49.1 (48.2-50.0)	93.0 (92.6-93.5)	3.38 (3.27-3.50)	0.26 (0.25-0.28)	12.9 (11.7-14.2)
NoMicro/RF	Best	70.6 (69.1-72.0)	83.1 (82.4-83.8)	54.4 (53.2-55.5)	90.8 (90.4-91.3)	4.18 (3.99-4.38)	0.35 (0.34-0.37)	11.8 (10.8-12.9)
NoMicro/ANN	Best	78.6 (77.2-79.9)	77.3 (76.6-78.1)	49.7 (48.8-50.6)	92.7 (92.2-93.1)	3.47 (3.35-3.59)	0.28 (0.26-0.30)	12.5 (11.5-13.7)
NeedMicro/XGB	Best	76.1 (74.6-77.5)	83.7 (83.0-84.3)	57.1 (56.0-58.1)	92.5 (92.0-92.9)	4.66 (4.47-4.87)	0.29 (0.27-0.30)	16.3 (14.9-17.8)
NoMicro/XGB	Sen85	85.0 (83.8-86.1)	70.5 (69.7-71.3)	45.1 (44.3-45.8)	94.3 (93.9-94.7)	2.88 (2.79-2.97)	0.21 (0.20-0.23)	13.6 (12.3-15.0)
NoMicro/RF	Sen85	85.1 (83.9-86.2)	64.4 (63.6-65.3)	40.6 (39.9-41.2)	93.8 (93.3-94.3)	2.39 (2.33-2.46)	0.23 (0.21-0.25)	10.3 (9.4-11.4)
NoMicro/ANN	Sen85	85.0 (83.8-86.2)	69.5 (68.7-70.3)	44.3 (43.5-45.0)	94.2 (93.8-94.7)	2.79 (2.71-2.87)	0.22 (0.20-0.23)	12.9 (11.7-14.3)
NeedMicro/XGB	Sen85	85.0 (83.8-86.2)	73.1 (72.4-73.9)	47.4 (46.6-48.2)	94.5 (94.1-94.9)	3.17 (3.07-3.27)	0.21 (0.19-0.22)	15.5 (14.1-17.1)

ANN = artificial neural networks; Best = threshold maximizing the Youden index (sensitivity + specificity − 1); DOR = diagnostic odds ratio (ratio of LR+ to LR−); LR− = negative likelihood ratio;
LR+ = positive likelihood ratio; NPV = negative predictive value; PPV = positive predictive value; RF = random forests; Sen85 = threshold obtained by requiring the greatest specificity such that sensitivity is >85% (ie, false negative rate is <15%); XGB = extreme gradient boosting (XGBoost).
^a Estimates and 95% CIs were computed across 2,000 stratified bootstrap replicates using the percentile method.
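A bootstrap interval of the kind described in this footnote is commonly obtained by taking empirical percentiles of the replicate estimates. The sketch below is a minimal illustration under stated assumptions: "stratified" is taken to mean resampling with replacement separately within the outcome-positive and outcome-negative groups (the table does not specify the strata), and all function names and the toy data are hypothetical.

```python
import random


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of an already sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def stratified_bootstrap_ci(pos_preds, neg_preds, metric,
                            n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for metric(pos_preds, neg_preds).

    pos_preds / neg_preds hold the 0/1 predictions for truly positive and
    truly negative cases. Each replicate resamples within each group, so
    the prevalence is the same in every replicate (assumed stratification).
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        boot_pos = rng.choices(pos_preds, k=len(pos_preds))
        boot_neg = rng.choices(neg_preds, k=len(neg_preds))
        replicates.append(metric(boot_pos, boot_neg))
    replicates.sort()
    return (percentile(replicates, 100 * alpha / 2),
            percentile(replicates, 100 * (1 - alpha / 2)))


def sensitivity(pos_preds, neg_preds):
    # Share of truly positive cases predicted positive.
    return sum(pos_preds) / len(pos_preds)


def specificity(pos_preds, neg_preds):
    # Share of truly negative cases predicted negative.
    return 1 - sum(neg_preds) / len(neg_preds)


# Toy data (hypothetical): 100 positives with 80 detected, 200 negatives with 160 cleared.
pos_preds = [1] * 80 + [0] * 20
neg_preds = [1] * 40 + [0] * 160

sens_ci = stratified_bootstrap_ci(pos_preds, neg_preds, sensitivity)
spec_ci = stratified_bootstrap_ci(pos_preds, neg_preds, specificity)
print("Sensitivity 95% CI:", sens_ci)
print("Specificity 95% CI:", spec_ci)
```

The same function accepts any metric that can be computed from the two groups of predictions, so likelihood ratios or the diagnostic odds ratio could be passed in place of `sensitivity`, provided that no replicate produces a zero denominator.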