Table 2.

Discriminative Performance (ROC-AUC), Calibration, and Brier Scores for the NoMicro and NeedMicro Predictive Models Under Internal (Emergency Department) and External (Primary Care) Validation

| Model | ROC-AUC (95% CIᵃ), Primary Careᵇ | ROC-AUC (95% CIᵃ), Emergency Departmentᶜ | Calibration Decile Linear Fit R² (95% CIᵃ), Primary Careᵇ | Calibration Decile Linear Fit R² (95% CIᵃ), Emergency Departmentᶜ | Scaled Brier Score (95% CIᵃ), Primary Careᵇ | Scaled Brier Score (95% CIᵃ), Emergency Departmentᶜ |
|---|---|---|---|---|---|---|
| NoMicro/XGB | 0.84 (0.8-0.88) | 0.86 (0.86-0.87) | 0.98 (0.83-0.98) | >0.99 (0.99-1.0) | 0.34 (0.25-0.42) | 0.34 (0.33-0.36) |
| NoMicro/RF | 0.85 (0.81-0.89) | 0.85 (0.84-0.85) | 0.94 (0.77-0.97) | >0.99 (0.98-1.0) | 0.37 (0.27-0.46) | 0.3 (0.28-0.32) |
| NoMicro/ANN | 0.85 (0.81-0.89) | 0.86 (0.85-0.86) | 0.97 (0.86-0.98) | >0.99 (0.99-1.0) | 0.35 (0.26-0.43) | 0.33 (0.32-0.35) |
| NeedMicro/XGB | NAᵈ | 0.88 (0.87-0.88) | NAᵈ | >0.99 (0.99-1.0) | NAᵈ | 0.4 (0.38-0.42) |

  • ANN = artificial neural networks; AUC = area under the curve; NA = not applicable; R² = coefficient of determination; RF = random forests; ROC = receiver operating characteristic; XGB = extreme gradient boosting (XGBoost).

  • a Estimates and 95% CIs were computed across 2,000 bootstrap replicates, stratified by pathogenicity, using the percentile method.

  • b External validation on the primary care data set.

  • c Internal validation on the emergency department data set.

  • d The NeedMicro classifier cannot be validated on the primary care data set because urine microscopy data are not available for almost all records.
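
The stratified bootstrap described in footnote a can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code. The function names (`roc_auc`, `scaled_brier`, `stratified_bootstrap_ci`), the synthetic data, and the simple order-statistic percentile rule are all assumptions made for illustration. The scaled Brier score is assumed to follow the common definition, 1 minus the ratio of the Brier score to that of a constant prediction at the outcome prevalence; the source does not spell out its formula.

```python
import random

def roc_auc(y_true, y_prob):
    """ROC-AUC as the Mann-Whitney probability; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def scaled_brier(y_true, y_prob):
    """Assumed definition: 1 - Brier / Brier of a prevalence-only prediction."""
    n = len(y_true)
    brier = sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / n
    prevalence = sum(y_true) / n
    return 1 - brier / (prevalence * (1 - prevalence))

def stratified_bootstrap_ci(y_true, y_prob, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI; positives and negatives are resampled separately,
    so every replicate keeps the original class balance."""
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(y_true) if y == 1]
    neg_idx = [i for i, y in enumerate(y_true) if y == 0]
    stats = []
    for _ in range(n_boot):
        idx = rng.choices(pos_idx, k=len(pos_idx)) + rng.choices(neg_idx, k=len(neg_idx))
        stats.append(metric([y_true[i] for i in idx], [y_prob[i] for i in idx]))
    stats.sort()
    # Simple order-statistic approximation of the alpha/2 and 1 - alpha/2 percentiles.
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Synthetic labels and predicted probabilities (hypothetical, not study data).
demo_rng = random.Random(42)
y = [1] * 30 + [0] * 50
p = [min(1.0, max(0.0, demo_rng.gauss(0.7 if label else 0.3, 0.2))) for label in y]

print("ROC-AUC:", roc_auc(y, p), stratified_bootstrap_ci(y, p, roc_auc))
print("Scaled Brier:", scaled_brier(y, p), stratified_bootstrap_ci(y, p, scaled_brier))
```

Resampling within each outcome class keeps the prevalence fixed across replicates, which avoids degenerate replicates with no positive cases and keeps the reference term of the scaled Brier score constant.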