Abstract
PURPOSE Accurate diagnosis of urinary tract infection in children is essential because children left untreated can experience permanent renal injury. We aimed to assess the diagnostic value of clinical features of pediatric urinary tract infection.
METHODS We performed a systematic review and meta-analysis of diagnostic test accuracy studies in ambulatory care. We searched the PubMed, Embase, Web of Science, Cumulative Index to Nursing and Allied Health Literature, Cochrane Central Register of Controlled Trials, Health Technology Assessment, and Database of Abstracts of Reviews of Effects databases from inception to January 27, 2020 for studies reporting 2 × 2 diagnostic accuracy data for clinical features compared with urine culture in children aged <18 years. For each clinical feature, we calculated likelihood ratios and posttest probabilities of urinary tract infection. To estimate summary parameters, we conducted a bivariate random effects meta-analysis and hierarchical summary receiver operating characteristic analysis.
RESULTS A total of 35 studies (N = 78,427 patients) of moderate to high quality were included, providing information on 58 clinical features and 6 prediction rules. Only circumcision (negative likelihood ratio [LR–] 0.24; 95% CI, 0.08-0.72; n = 8), stridor (LR– 0.20; 95% CI, 0.05-0.81; n = 1), and diaper rash (LR– 0.13; 95% CI, 0.02-0.92; n = 1) were useful for ruling out urinary tract infection. Body temperature or fever duration showed limited diagnostic value (area under the receiver operating characteristic curve 0.61; 95% CI, 0.47-0.73; n = 16). The Diagnosis of Urinary Tract Infection in Young Children score, Gorelick Scale score, and UTIcalc (https://uticalc.pitt.edu) might be useful to identify children eligible for urine sampling.
CONCLUSIONS Few clinical signs and symptoms are useful for diagnosing or ruling out urinary tract infection in children. Clinical prediction rules might be more accurate; however, they should be validated externally. Physicians should not restrict urine sampling to children with unexplained fever or other features suggestive of urinary tract infection.
- primary care issues
- urinary tract problems
- special population: children/infants
- special population: adolescents
- quantitative methods: meta-analysis
- diagnostic testing
INTRODUCTION
Urinary tract infections (UTIs) are common, especially in very young children. The prevalence of UTI in acutely ill children aged <5 years and presenting to the family physician is almost 6%.1 It remains unclear which children should undergo testing for UTI.2 In ambulatory care, more than one-half of UTIs in children can be missed at first contact.3,4 However, early diagnosis is essential because missed episodes can progress to more serious infections, cause kidney scarring, and might suggest underlying urinary tract malformations. Up to 15% of children will have permanent renal injury after a first febrile UTI.5 This can cause impaired renal growth, recurrent pyelonephritis, renal hypertension, or end-stage renal disease, which can be prevented by prompt antibiotic treatment.6-8
Urinary tract infections often remain undetected in children, especially in infants, given their inability to verbally describe symptoms and the difficulty of obtaining a clean urine sample. Neonates with UTI are at high risk of developing bacteremia.9 For these patients, parents might report fever, irritability, lethargy, vomiting, or poor feeding.10 These symptoms also occur with other conditions such as gastroenteritis, tonsillitis, or otitis. For older children, signs such as dysuria or frequency are more indicative of a urinary cause. Current guidelines recommend urine sampling for all young children presenting with an unexplained fever of >24 hours or for older children with urinary symptoms.7,8,11
Two systematic reviews, including 1 meta-analysis with searches up to 2007, have been published.2,12 The aim of the present review was to collate the most recent evidence on the diagnostic value of signs and symptoms for pediatric UTI, to assess the probability of UTI before urine sampling.
METHODS
The study protocol was registered a priori with the PROSPERO registry (ID CRD42019122174). We report this study according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (Supplemental Appendix 1, https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2684/-/DC1).
Information Sources and Search Strategy
Seven electronic databases (PubMed, Embase, Web of Science, Cumulative Index to Nursing and Allied Health Literature, Cochrane Central Register of Controlled Trials, Health Technology Assessment, and Database of Abstracts of Reviews of Effects) were searched from inception for articles on the diagnosis of UTIs in children in ambulatory care (Supplemental Appendix 2, https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2684/-/DC1). We conducted the first search on January 16, 2019, and updated it on January 27, 2020. We also checked the references of primary studies and reviews. Five reviewers (H.B., T.S., J.V., A.VdB., A.G.) independently selected studies in pairs, and 2 reviewers (J.V., A.VdB.) resolved conflicts. We deduplicated studies in EndNote X8.2 (Clarivate) and used Covidence (Veritas Health Innovation) for study selection.
Eligibility Criteria
We included all studies that compared the diagnostic accuracy of clinical features in children <18 years of age, with urine culture as the reference standard. Eligible study designs included prospective cross-sectional diagnostic accuracy studies, diagnostic nested case-control studies, and retrospective cohort studies. We only selected studies in the ambulatory care setting, which was defined as outpatient medical care and included family practices, emergency departments, walk-in clinics, health centers, and outpatient hospital departments.
We excluded case-control studies with a differential sampling scheme for cases and controls, reviews, letters, comments, and conference abstracts. We also excluded studies with a total sample size <50 children because those studies are prone to selection bias,13,14 and we excluded studies with children from high-risk groups (malnourished, premature). We did not apply any language, country, or time restrictions.
Data Collection
We extracted data in duplicate (H.B., A.G.) and imported the data to Excel (Microsoft Corp). In the case of incomplete or missing data, we contacted the authors for additional information (n = 34, of whom 3 authors provided unpublished data).15-17 For cells with a zero value, we applied a 0.5 continuity correction.
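As an illustration of the calculation behind each 2 × 2 table, the following Python sketch derives sensitivity, specificity, and likelihood ratios with log-method 95% CIs. It is not the authors' code. The function name `likelihood_ratios`, the dictionary keys, and the choice to add 0.5 to all 4 cells whenever any cell is zero are assumptions, because the text does not specify how the correction was applied.

```python
import math


def likelihood_ratios(tp, fp, fn, tn, correction=0.5, z=1.96):
    """Accuracy measures for one 2 x 2 table (clinical feature vs urine culture)."""
    # Assumption: if any cell is zero, add the correction to all four cells.
    if min(tp, fp, fn, tn) == 0:
        tp, fp, fn, tn = (cell + correction for cell in (tp, fp, fn, tn))

    with_uti = tp + fn      # culture-positive children
    without_uti = fp + tn   # culture-negative children

    sens = tp / with_uti
    spec = tn / without_uti
    lr_pos = sens / (fp / without_uti)   # sensitivity / (1 - specificity)
    lr_neg = (fn / with_uti) / spec      # (1 - sensitivity) / specificity

    # Standard errors of the log likelihood ratios (log method).
    se_log_pos = math.sqrt(1 / tp - 1 / with_uti + 1 / fp - 1 / without_uti)
    se_log_neg = math.sqrt(1 / fn - 1 / with_uti + 1 / tn - 1 / without_uti)

    def interval(lr, se):
        # Confidence limits are symmetric on the log scale.
        return (lr * math.exp(-z * se), lr * math.exp(z * se))

    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": lr_pos,
        "lr_pos_ci": interval(lr_pos, se_log_pos),
        "lr_neg": lr_neg,
        "lr_neg_ci": interval(lr_neg, se_log_neg),
    }


# Hypothetical counts, not taken from any included study.
example = likelihood_ratios(tp=30, fp=20, fn=10, tn=140)
print(example["lr_pos"], example["lr_neg"])
```

The correction matters mainly for features reported by a single small study, where an empty cell would otherwise give an undefined or infinite likelihood ratio.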
Risk of Bias and Applicability Assessment
We assessed the risk of bias with the Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) using RevMan version 5.3 (Cochrane). H.B. assessed the risk of bias and applicability, A.G. checked it independently, and disagreements were resolved during a consensus meeting (H.B., T.S., A.VdB., J.V.). We referred to the urine culture criteria used in the European Association of Urology guideline to assess the reference standard bias.7 Studies in which children did not systematically undergo urine sampling were considered at high risk of bias for flow and timing.
Data Analysis
We used R statistical software version 3.5.1 (R Foundation, mada package version 0.8.5) to calculate the likelihood ratios and posttest probabilities (positive and negative predictive values) of UTI, graphically displaying the change in probability using dumbbell plots (GitHub; Susannah Fleming [https://github.com/susannahf]).18 We considered tests to be useful for ruling out UTI if the negative likelihood ratio (LR–) was ≤0.25 (ie, substantially decreasing the likelihood of UTI). Tests were considered useful as a warning sign or red flag (ie, substantially increasing the likelihood of UTI) if the positive likelihood ratio (LR+) was ≥4.19,20 Signs with LR+ from 2 to 4 or LR– from 0.25 to 0.5 were considered amber signs (ie, moderately increasing or decreasing the risk of UTI).
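The conversion from pretest to posttest probability and the thresholds described above can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation. The function names, the label strings, and the handling of boundary values (an LR+ of exactly 2 or an LR– of exactly 0.5 counted as amber) are assumptions.

```python
def posttest_probability(pretest, lr):
    """Convert a pretest probability to a posttest probability via odds."""
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = pretest_odds * lr
    return posttest_odds / (1 + posttest_odds)


def classify_sign(lr_pos, lr_neg):
    """Label a clinical feature using the likelihood ratio thresholds in the text."""
    labels = []
    if lr_pos >= 4:
        labels.append("red flag")            # substantially increases likelihood
    elif lr_pos >= 2:
        labels.append("amber (rule in)")     # moderately increases likelihood
    if lr_neg <= 0.25:
        labels.append("rule out")            # substantially decreases likelihood
    elif lr_neg <= 0.5:
        labels.append("amber (rule out)")    # moderately decreases likelihood
    # Hypothetical label for features that meet none of the thresholds.
    return labels or ["uninformative"]


# Hypothetical inputs: a pretest probability of 10% and an LR+ of 4.
print(round(posttest_probability(0.10, 4.0), 3))
print(classify_sign(lr_pos=4.0, lr_neg=0.9))
```

Working on the odds scale keeps the update a single multiplication, which is why likelihood ratios, unlike predictive values, can be carried across settings with different prevalence.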
We estimated summary parameters using a bivariate random effects meta-analysis whenever ≥3 primary studies were available.18,21 For continuous test results, we conducted a meta-analysis allowing for multiple thresholds per study to be included, while displaying results in a hierarchical summary receiver operating characteristic curve (diagmeta package in R version 0.4).22
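For orientation, a bivariate random effects model of this kind is commonly written in the general form below. The notation is illustrative and not taken from the study, and software implementations differ in detail (for example, some model the false-positive rate rather than specificity).

```latex
\[
\begin{pmatrix}
\operatorname{logit}(\widehat{Se}_i) \\
\operatorname{logit}(\widehat{Sp}_i)
\end{pmatrix}
\sim
N\!\left(
\begin{pmatrix} \mu_{Se} \\ \mu_{Sp} \end{pmatrix},
\; \Sigma + S_i
\right),
\qquad
\Sigma =
\begin{pmatrix}
\sigma_{Se}^2 & \rho\,\sigma_{Se}\sigma_{Sp} \\
\rho\,\sigma_{Se}\sigma_{Sp} & \sigma_{Sp}^2
\end{pmatrix},
\qquad
S_i = \operatorname{diag}\!\left(s_{Se,i}^2,\; s_{Sp,i}^2\right)
\]
\[
\overline{Se} = \operatorname{logit}^{-1}(\mu_{Se}), \qquad
\overline{Sp} = \operatorname{logit}^{-1}(\mu_{Sp}), \qquad
LR^{+} = \frac{\overline{Se}}{1 - \overline{Sp}}, \qquad
LR^{-} = \frac{1 - \overline{Se}}{\overline{Sp}}
\]
```

Here i indexes the primary studies, Σ holds the between-study variances and the correlation ρ between sensitivity and specificity, and S_i holds the within-study sampling variances. Modeling the pair jointly accounts for the trade-off between sensitivity and specificity that arises when studies use different implicit thresholds.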
To assess statistical heterogeneity, we examined forest plots and performed subgroup analyses via metaregression if ≥10 studies were available for this analysis. We performed subgroup analyses for design, population, age, setting, and urine collection method. We also conducted sensitivity analyses to check the robustness of our results whenever we suspected clinical heterogeneity.
RESULTS
Study Selection
We screened 10,764 studies by title and abstract, of which we evaluated 331 in full text. Ultimately, we included 35 studies on the accuracy of 58 clinical features4,15-17,23-49 and 6 prediction rules (Supplemental Appendix 3, https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2684/-/DC1).15,16,41,42,50-53 None of these studies reported on pyelonephritis or bacteremia separately but rather reported on UTI as a composite outcome.
Study Characteristics
Study characteristics are shown in Supplemental Appendix 4 (https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2684/-/DC1). The total number of included patients was 78,427, with study sizes ranging from 75 to 15,801 patients. Twenty-four studies were conducted at the emergency department.* Other settings included health centers (n = 3),26,28,35 hospital outpatient departments (n = 2),34,38 family practices and emergency departments (n = 4),4,15,41,48 and pediatricians’ offices (n = 1).37
Authors used different inclusion criteria, and UTI prevalence ranged from 1.3% to 63.5%,15,38 with a median of 10%. Most studies included children aged <5 years (n = 24).† Two studies included only acutely ill children,4,41 whereas 12 studies included only febrile children.‡ Four studies included children with unexplained fever,17,39,50,52 8 studies included children with features of UTI,15,26,31,32,34,38,46,48 and 9 studies included children for whom urine samples were obtained at the physician’s discretion.23-25,30,33,45,47,49,51
Most studies used catheterization (n = 23), suprapubic aspiration (n = 17), or midstream catch (n = 14) to sample urine, and fewer studies used clean catch (n = 7), bag specimens (n = 5), or diaper pads (n = 2). All studies used urine culture as the reference standard.
Risk of Bias and Applicability
The risk-of-bias assessment is shown in Supplemental Appendix 5 (https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2684/-/DC1). The overall risk of bias was moderate to high. The high risk of selection bias was caused by retrospective sampling (n = 9),16,24,25,33,44,45,47,51,52 recruiting a convenience or non-consecutive sample (n = 5),23,27,30,40,42 and including a narrow spectrum of patients (n = 4).26,31,46,48 Studies were considered at high risk of bias for reference standard when the positivity threshold was not adapted to the sampling method (n = 5)17,27,32,36,52 or was lower than the recommended threshold (n = 3).26,31,47 A high risk of bias for flow and timing was assumed when a urine sample was obtained from a small proportion of included children (n = 3)41,43,53 or for inappropriate exclusions from the analyses (n = 2).26,40
Diagnostic Accuracy of Signs and Symptoms
Likelihood ratios and posttest disease probabilities are shown in Figure 1 and Supplemental Appendices 4 and 5 (https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2684/-/DC1). Summary estimates are shown in Table 1. Because we suspected low applicability to our research question by the study of Pylkkänen et al,38 we excluded that study from the meta-analysis.
Likelihood ratios and posttest disease probabilities for urinary symptoms (dumbbell plots).
Likelihood ratios and posttest disease probabilities for urine appearance (dumbbell plots).
Likelihood ratios and posttest disease probabilities for clinical examination features (dumbbell plots).
CC = clean-catch urine samples; D = diaper urine samples; LR– = negative likelihood ratio; LR+ = positive likelihood ratio; UTI = urinary tract infection.
a Data from Pylkkänen et al were not included in the meta-analysis.
Diagnostic Accuracy of Clinical Features for Urinary Tract Infection (Summary Estimates)
(1) Ruling out UTI
Being circumcised (LR– 0.24; 95% CI, 0.08-0.72; n = 8),15,16,30,34,37,39,42,46 the presence of stridor (LR– 0.20; 95% CI, 0.05-0.81),43 and the presence of diaper rash (LR– 0.13; 95% CI, 0.02-0.92)15 substantially decreased the likelihood of UTI.
In febrile children, finding an apparent source of infection (LR– 0.35; 95% CI, 0.22-0.55) decreased the probability of UTI; however, this was not useful for ruling out UTI by itself (ie, the LR– was not ≤0.25).16,29,39,43
(2) Red flags for UTI
Cloudy urine (LR+ 4.55; 95% CI, 3.73-5.56; n = 4)15,23,31,32 and malodorous urine (LR+ 4.13; 95% CI, 2.27-7.49; n = 4)15,16,26,46 were red flags for UTI. Suprapubic tenderness, loin tenderness, capillary refill time >3 seconds, and no fluid intake were useful for ruling in UTI on the basis of 1 study each (LR+ 7.94, 95% CI, 3.18-19.86; LR+ 16.63, 95% CI, 3.30-83.86; LR+ 4.80, 95% CI, 2.16-10.60; and LR+ 4.39, 95% CI, 1.72-11.20, respectively).15,43
With regard to hematuria, it was not possible to draw firm conclusions because of heterogeneity, although specificity appeared to be high (Table 1). In 1 low-prevalence study, the LR+ was 6.27 (95% CI, 1.47-26.71), suggesting that hematuria might be considered a red flag.15
Dysuria (LR+ 3.28; 95% CI, 2.22-4.86; n = 7)4,15,26,28,33,35,46 and frequency (LR+ 2.21; 95% CI, 1.78-2.75; n = 4)4,15,26,33 moderately increased the probability of UTI (ie, LR+ 2-4), as did darker urine, bed wetting when previously dry, previous UTI, and genitourinary abnormalities.
The following signs did not change the posttest probability and therefore might have no diagnostic value for UTI: diarrhea, vomiting, abdominal pain, poor feeding, poor weight gain, irritability, abnormal appearance, and shivering.
(3) Body temperature and fever duration
Body temperature was not associated with UTI (area under the receiver operating characteristic curve 0.61; 95% CI, 0.47-0.73) on the basis of 16 studies* (Figure 2). In addition, fever duration >24, 48, or 72 hours or 5 days was not useful on the basis of 1 study each (Supplemental Appendix 6).
HSROC curve analysis of body temperature for urinary tract infection.
AUC = area under the ROC curve; HSROC = hierarchical summary receiver operating characteristic; ROC = receiver operating characteristic.
Note: HSROC curve analysis of body temperature for urinary tract infection in children, showing sensitivity vs 1-specificity at each threshold. The thresholds provided in primary studies are indicated on the graph. The CIs of the estimates are indicated as dashed lines. (Sample size = 43,570, including data from 16 primary studies).
Diagnostic Accuracy of Prediction Rules
An overview of the prediction rules is provided in Figure 3 and Supplemental Appendix 7 (https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2684/-/DC1). Seven studies identified with the initial search strategy reported on the diagnostic accuracy of a combination of signs and symptoms for UTI, of which 3 studies were included in the meta-analyses for clinical features.15,16,42 A Diagnosis of Urinary Tract Infection in Young Children (DUTY) clean-catch score <1 point was useful for ruling out UTI (LR– 0.05; 95% CI, 0-0.82) in children aged <5 years.15 In girls aged <2 years with unexplained fever, the Gorelick Scale score was useful for ruling out UTI when <2 of 5 variables were present (LR– 0.11; 95% CI, 0.01-0.81).51 Using the UTIcalc score (https://uticalc.pitt.edu), the probability of UTI decreased to <2% in all circumcised boys except non–African American infants with unexplained fever. For girls and circumcised boys, the probability of UTI decreased to <2% if none of the following variables were present: temperature ≥39°C, no source of fever, non–African American (LR– 0.05; 95% CI, 0-0.79).16 In addition, a DUTY clean-catch or diaper pad score ≥5 points was useful as a red flag (LR+ 9.55, 95% CI, 7.14-12.78 and LR+ 4.13, 95% CI, 2.91-5.87, respectively).15 The Yale Observation Scale and the National Institute for Health and Care Excellence traffic light system were not useful for ruling in or out UTI in children aged <3 months or <6 years, respectively.42,52
Original ROC curve analysis of clinical prediction rules for urinary tract infections.
CC = clean-catch urine samples; D = diaper urine samples; DUTY = Diagnosis of Urinary Tract Infection in Young Children; NICE = National Institute for Health and Care Excellence; p = points; ROC = receiver operating characteristic; UTI = urinary tract infection; UTIcalc = UTI Calculator; var = variable.
Note: ROC curve analysis showing sensitivity vs 1-specificity at each threshold. The cutoff for a positive rule is shown next to each point on the graph. Each symbol represents the diagnostic test accuracy of 1 prediction rule for urinary tract infection in children.
a Derivation studies.
b Score ≥6 on Clinical Global Impression – Severity scale (0-10).
Additional Analyses
Subgroup analyses via metaregression were only possible for ethnicity and sex. For other signs and symptoms with variable LR values (Supplemental Appendix 8, https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2684/-/DC1), we performed sensitivity analyses to explore the effect of age, setting, inclusion criteria, and reference standard (Supplemental Appendix 9, https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2684/-/DC1). Exclusion of studies of children aged >5 years, those with a suboptimal reference standard (positivity threshold lower than recommended or urine collection method unclear), or those in emergency department or non–emergency department settings did not affect LR values. With regard to malodorous urine, exclusion of 1 retrospective study16 gave an LR+ of 2.90 (95% CI, 1.61-5.22; n = 3) in children aged <5 years, suggesting that this feature might be considered an amber sign instead of a red flag on the basis of higher-quality studies.
DISCUSSION
Summary
Only 3 features (circumcision, stridor, and diaper rash) appeared to decrease the probability of UTI sufficiently, whereas cloudy or malodorous urine, hematuria, no fluid intake, suprapubic tenderness, and loin tenderness could be useful as red flags. Urgency, frequency, dysuria, bed wetting, and history of UTI moderately increased the probability of UTI in children.
Guidelines recommend obtaining a urine sample from children with unexplained fever or other symptoms suggestive of UTI. The present study suggests that this sampling strategy might be inadequate because only few clinical features increased or decreased the likelihood of UTI in children, and the absence of unexplained fever did not rule out UTI (LR– 0.35).
Combining signs and symptoms in a clinical prediction rule, such as with the UTIcalc, DUTY clean-catch, or Gorelick Scale score, might be more accurate to rule out UTI in ambulatory care; however, they require a greater proportion of children to be tested with urine sampling compared to current guidelines.6-8 Furthermore, these decision rules should be validated externally.
Strengths and Limitations
The main strengths of the present study were the comprehensive search strategy and the meta-analysis taking into account heterogeneity and multiple thresholds.
We observed between-study heterogeneity, which could be seen as a limitation of the study. Most studies reporting on cloudy and malodorous urine are from high-prevalence settings (>20%). Only 1 study from a low-prevalence setting (2%) reported an LR+ of 2.19 (95% CI, 1.01-4.74) and an LR+ of 3.70 (95% CI, 3.03-4.51) for cloudy and malodorous urine, respectively.15 With regard to dysuria, abdominal pain, and no source of fever, the LR values varied between studies, and sensitivity analyses revealed no effect of age, setting, or inclusion criteria.
Comparison With Existing Knowledge
A systematic review by Whiting et al2 identified 6 studies describing the accuracy of 5 clinical features and the Gorelick Scale score. Hay et al15,54 provided updates in 2011 and 2016, including 13 ambulatory care studies. We further explored the summary estimates for 58 clinical features in total and found 5 additional prediction rules, of which 3 might be useful for ruling out UTI.
A systematic review and meta-analysis by Shaikh et al12 found 12 studies. They suggested that body temperature ≥40°C might be useful as a red flag (LR+ range 3.2-3.3; n = 2). We found 12 additional studies, which we included in a hierarchical summary receiver operating characteristic analysis, providing results at variable thresholds.
Narrative reviews described abdominal pain, irritability, and vomiting as important features; however, our results showed that these symptoms might have no value in children (Table 1, Supplemental Appendix 6).9,10,55
Implications for Practice
The present study suggests that broader sampling strategies might be more appropriate to identify pediatric UTI at an early stage. Novel urine collection methods and reliable tests are urgently needed for infants and children to allow for the ruling out of UTI in ambulatory care. In countries where the circumcision rate is high among boys, the presence of circumcision can aid in ruling out UTI if no other UTI features are present. The Yale Observation Scale and National Institute for Health and Care Excellence traffic light system should not be used for UTI.
New studies in which urine is collected systematically from acutely ill children in ambulatory care should also focus on validating existing prediction rules to rule out UTI and to define which children require urine sampling.
In conclusion, the present meta-analyses confirm that few clinical features are useful for diagnosing or ruling out UTI without further urine analysis. Signs and symptoms combined in a clinical prediction rule, such as with the DUTY or UTIcalc score, might increase accuracy for ruling out UTI; however, these should be validated externally. Urine sampling should not be restricted to children with unexplained fever or UTI features and should be applied more broadly in ambulatory care, given that appropriate sampling techniques are available.
Acknowledgments
The authors wish to thank Thomas Vandendriessche, Eline Vancoppenolle and Krizia Tuand, the biomedical reference librarians of the KU Leuven Libraries – 2Bergen – Learning Centre Désiré Collen (Leuven, Belgium), for their help in conducting the systematic literature search. Additionally, we would like to thank Dr Schwarzer for providing statistical advice and authors for sharing nonpublished data: Professor Dr Hay, Professor Dr Shaikh, Professor Dr Waterfield, Dr Velasco, Professor Dr Oostenbrink, Professor Dr Andreola, Dr Bressan, and Dr Hildenwall.
Footnotes
Conflicts of interest: authors report none.
To read or post commentaries in response to this article, go to https://www.AnnFamMed.org/content/19/5/437/tab-e-letters.
Author contributions: H.B., J.V., T.S., D.B., and A.VdB. designed the review. J.V. coordinated the review. H.B., T.S., and J.V. designed the search strategy. H.B. undertook the searches, and H.B., J.V., T.S., A.VdB., and A.G. screened the search results against eligibility criteria. J.V. and A.VdB. resolved conflicts. H.B. and A.G. appraised quality. H.B. and A.G. extracted the data, and H.B. analyzed the data. H.B., T.S., A.VdB., J.V., and D.B. interpreted the data. D.B. provided general advice on the review. J.V. secured funding for the review. J.V. is the guarantor of this manuscript. All authors reviewed the final manuscript. H.B. attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.
Funding support: This study did not receive any specific funding. H.B. is funded by a Katholieke Universiteit Leuven Starting Grant (No. ERX-D5331-STG/18/008). D.B. is a recipient of a Senior Clinical Investigator Fellowship from the Research Foundation Flanders (FWO). The financial sponsor played no role in the design, execution, analysis, or interpretation of the data or in the writing of the study.
Disclaimer: J.V. affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.
Data sharing: Full data sets can be obtained from the corresponding author at hanne.boon@kuleuven.be.
Previous presentation: General Practice Research on Infections Network meeting 2019; September 25-26, 2019; Leuven, Belgium.
* References 16, 17, 23-25, 27, 29, 30, 32, 33, 36, 39, 40, 42-47, 49-53.
† References 4, 15-17, 24, 25, 27-30, 35-37, 39, 40, 42-46, 50-53.
‡ References 16, 27-29, 35-37, 40, 42-44, 53.
* References 4, 16, 27-29, 33, 35-37, 39, 42-44, 46, 47, 49.
Supplemental materials: Available at https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2684/-/DC1.
- Received for publication July 24, 2020.
- Revision received November 23, 2020.
- Accepted for publication December 3, 2020.
- © 2021 Annals of Family Medicine, Inc.