Abstract
PURPOSE Whereas a diagnosis of acute uncomplicated urinary tract infection (UTI) in clinical practice comprises a battery of several diagnostic tests, these tests are often studied separately (in isolation from other test results). We wanted to determine the value of history and urine tests for diagnosis of uncomplicated UTIs, taking into account their mutual dependencies and information from preceding tests.
METHODS Women with painful and/or frequent micturition answered questions about their signs and symptoms (history) of UTIs and underwent urine tests. A culture was the reference standard (≥10³ colony-forming units per milliliter). A diagnostic index was derived using logistic regression with bootstrapped backward selection and parameter-wise shrinkage. Risk thresholds for UTI of 30% and 70% were used to analyze discriminative properties. Six models were compared: (1) history only, (2) history + urine dipstick, (3) history + urine dipstick + urinary sediment, (4) history + urine dipstick + dipslide, and (5) history + urine dipstick + urinary sediment + dipslide; we then added (6) a model in which a test was performed only for patients with an intermediate risk (between 30% and 70%) after the preceding test.
RESULTS One hundred ninety-six women were included (UTI prevalence 61%). Seven variables were selected from history (3), dipstick (2), sediment (1), and dipslide (1). History correctly classified 56% of patients as having a UTI risk of either <30% or >70%. History and urine dipstick raised this to 73%. The 3 models with the addition of urinary sediment and dipslide, separately and in combination, performed hardly better. The sixth model, in which only patients at intermediate risk after the preceding step received an additional test, correctly classified 83%. The patient’s suspicion of a UTI and a positive nitrite test were the strongest indicators of a UTI.
CONCLUSIONS Most women with painful and/or frequent micturition can be correctly classified as having either a low or a high risk of UTI by asking 3 questions: Does the patient think she has a UTI? Is there at least considerable pain on micturition? Is there vaginal irritation? Other women require additional urine dipstick investigation. Sediment and dipslide have little added value. External validation of these recommendations is required before they are implemented in practice.
INTRODUCTION
Acute uncomplicated urinary tract infections (UTIs) are infections of the lower urinary tract in healthy, nonpregnant, adult women. The diagnosis is made by the presence of urinary symptoms in combination with bacteriuria.1–4 Sixty percent of all women experience at least 1 UTI during their life.5 The symptoms are bothersome and have a negative impact on quality of life.6–9 Although empiric antibiotic treatment of all women with urinary symptoms has been reported to be cost-effective,10,11 bacterial resistance is rising,12–15 and an accurate diagnosis of UTI is needed to facilitate a well-targeted use of antibiotics.
Various medical history questions and urine investigations can be used for UTI diagnosis, of which nitrite, blood, and leukocyte esterase urine dipstick tests, microscopic examination of the urinary sediment, and dipslide are the ones most widely applied. Most of these diagnostic indicators have been studied in single-test evaluations, implying that a test is compared with the urine culture without taking into account the results of preceding tests, including clinical history.16–19 In clinical practice, however, the diagnostic work-up is multivariable, and test results are mutually dependent.20–24 For example, the diagnostic value of the dipslide used in isolation may be considerable, but it might not add much once the clinical history and nitrite test result are known. As a result, performance of expensive and time-consuming tests, such as urinary sediment and dipslide, may not always be needed for an accurate diagnosis.
Whereas most previous research focused on single-test evaluations, the aim of our study was to determine the added diagnostic value of indicators from patient history and urinalysis, taking into account their mutual dependencies and information from prior tests. Using our results, we present an efficient, easy-to-use diagnostic rule, consisting of a limited number of tests while preserving diagnostic accuracy.
METHODS
Participants
We recruited patients from April 18, 2006, until October 8, 2008, into a cross-sectional study of 20 general practices in and around Amsterdam, The Netherlands. Female patients older than 12 years who contacted their family physician with painful and/or frequent micturition were eligible. Their symptoms had to be present for no longer than 7 days. Exclusion criteria were pregnancy, lactation, signs of pyelonephritis, having used antibiotics or having undergone a urological procedure in the past 2 weeks, known anatomic or functional abnormalities of the urogenital tract, and being immunocompromised (with the exception of diabetes mellitus).
Assessments
Included patients completed a detailed questionnaire to record presence and severity of signs and symptoms (history) on a 4-point scale, and they collected a urine sample at the physician visit. In line with the national guideline of the Dutch College of General Practitioners,3 no instructions for the collection method were given, because the method of collection has been reported to have no effect on the extent of contamination.25–27
The family physician or medical assistant performed a urine dipstick test (Multistix 5, Siemens Medical Solutions Diagnostics) using a Clinitek Status analyzer (Siemens Medical Solutions Diagnostics), as well as a dipslide test (Uricult classic, Orion diagnostica) according to the manufacturer’s instructions. The results of these 2 tests were recorded on a standardized registration form.
Immediately after the urine dipstick and dipslide investigations had been performed, urine samples were stored in a refrigerator. Within 8 hours, a specialized courier service (Ruwiel Labexpress, Kockengen, The Netherlands) transported the samples at 4°C to the Academic Medical Center in Amsterdam. A urinary sediment investigation was performed at the Laboratory for Clinical Chemistry, and the urine was cultured at the Laboratory for Medical Microbiology by trained laboratory technicians who had no knowledge of previous test results.
All recorded test results were entered into a structured database by a qualified data entry service (Service Point Nederland BV).
Statistical Analysis
We determined that if the 5 events per variable rule was applied to a number of 20 possible diagnostic indicators, we would need about 100 women with a UTI.28
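The sample size reasoning above is simple arithmetic and can be sketched as follows. The function names are illustrative and are not part of the study's analysis code; the default of 5 events per variable is the rule cited in the text.

```python
def required_events(candidate_predictors: int, events_per_variable: int = 5) -> int:
    """Outcome events needed to screen a given number of candidate predictors."""
    return candidate_predictors * events_per_variable


def max_candidate_predictors(events: int, events_per_variable: int = 5) -> int:
    """Largest number of candidate predictors that a given number of events supports."""
    return events // events_per_variable


# 20 candidate indicators at 5 events per variable
print(required_events(20))            # 100
# 120 positive cultures, the number eventually observed in this study
print(max_candidate_predictors(120))  # 24
```

Read in the other direction, the 120 positive cultures observed would support up to 24 candidate indicators under the same rule.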
Missing values were imputed using multiple imputation by chained equations.29,30 Logistic regression with bootstrapped backward elimination was used to select a parsimonious set of variables (P remove .05, bootstrap inclusion fraction ≥66.67%; see the Supplemental Appendix, available at http://annfammed.org/content/11/5/442/suppl/DC1, for details). The result of the urine culture was the binary dependent variable, with at least 10³ colony-forming units (CFUs) of a single uropathogen per milliliter (mL) being defined as a positive culture according to international guidelines.31 Different indicators from the same diagnostic medium (eg, nitrite and leukocyte esterase from urine dipstick, or bacteria and leukocytes in sediment) were analyzed as separate variables. Because we hypothesized that women who thought they had a UTI might have a higher UTI probability if they had experienced a (proven) UTI in the past, we entered 2 interaction variables into the backward elimination procedure: the combination of a woman thinking she had a UTI and the number of UTIs she reported to have experienced in the past year, as well as the combination of a woman thinking she had a UTI and reporting to have had at least 1 UTI ever diagnosed by a physician.
Using the selected variables, that is, those retained after backward elimination, we composed 5 different models: (1) history only, (2) history + dipstick, (3) history + dipstick + sediment, (4) history + dipstick + dipslide, and (5) history + dipstick + sediment + dipslide.
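The bootstrap inclusion criterion can be illustrated with a minimal sketch. It assumes that backward elimination has already been run on each bootstrap sample and that the retained variable names are available as one set per replicate. The function names and the toy data are hypothetical; the study's actual procedure is described in the Supplemental Appendix and was carried out in Stata.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def inclusion_fractions(selections: Iterable[Set[str]]) -> Dict[str, float]:
    """Fraction of bootstrap replicates in which each variable was retained."""
    replicates = list(selections)
    if not replicates:
        raise ValueError("at least one bootstrap replicate is required")
    counts = Counter(name for selected in replicates for name in selected)
    return {name: count / len(replicates) for name, count in counts.items()}


def retained_variables(selections: Iterable[Set[str]],
                       threshold: float = 0.6667) -> List[str]:
    """Variables whose inclusion fraction meets the threshold (66.67% in the text)."""
    fractions = inclusion_fractions(selections)
    return sorted(name for name, frac in fractions.items() if frac >= threshold)


# Toy input (hypothetical): 10 replicates, not study data
toy = [{"nitrite", "blood"}] * 7 + [{"nitrite", "leukocytes"}] * 3
print(retained_variables(toy))  # ['blood', 'nitrite']
```

In the toy input, nitrite is retained in all 10 replicates and blood in 7, so both pass; leukocytes appear in only 3 and are dropped.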
We performed parameter-wise shrinkage of the obtained regression coefficients to correct for possible overoptimism.32 Using the shrunk regression coefficients, predicted risks were calculated for the 5 different models. Based on 2 independent polls of 150 Dutch family physicians, we considered predicted risks below 30% and above 70% to be clinically relevant. These 2 risk thresholds were used to compare the diagnostic performances of the different models. In addition, we repeated the analysis for risk thresholds of 20% and 80%.
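As a rough illustration of how shrunk coefficients translate into a predicted risk and a risk category, consider the sketch below. All numeric values are hypothetical placeholders, not the study's estimates (those are reported in Table 3), and the intercept is simply taken as given rather than re-estimated after shrinkage.

```python
import math
from typing import Dict


def predicted_risk(intercept: float,
                   coefficients: Dict[str, float],
                   shrinkage: Dict[str, float],
                   findings: Dict[str, int]) -> float:
    """Logistic predicted risk, with each coefficient multiplied by its own shrinkage factor."""
    linear_predictor = intercept
    for name, beta in coefficients.items():
        linear_predictor += shrinkage[name] * beta * findings.get(name, 0)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def risk_category(risk: float, low: float = 0.30, high: float = 0.70) -> str:
    """Classify a predicted risk using the 30% and 70% thresholds from the text."""
    if risk < low:
        return "low"
    if risk > high:
        return "high"
    return "intermediate"


# Hypothetical coefficients and shrinkage factors, for illustration only
coefs = {"suspects_uti": 2.5, "vaginal_irritation": -1.0}
shrink = {"suspects_uti": 0.9, "vaginal_irritation": 0.8}

risk = predicted_risk(-1.0, coefs, shrink, {"suspects_uti": 1})
print(risk_category(risk))  # high
```

The thresholds are parameters, so the sensitivity analysis at 20% and 80% corresponds to calling risk_category(risk, low=0.20, high=0.80).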
We also composed a sixth model in which a diagnostic step (eg, history, dipstick, sediment, and dipslide) was performed only for those patients who remained in the intermediate risk category (between 30% and 70%) after the preceding step.
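The logic of this sixth model can be sketched as a loop over diagnostic steps that stops as soon as the predicted risk leaves the intermediate band. The two risk models below are hypothetical stand-ins that return fixed values; they are not the fitted models from this study.

```python
from typing import Callable, Dict, List, Tuple

Findings = Dict[str, int]
Step = Tuple[str, Callable[[Findings], float]]


def stepwise_workup(findings: Findings, steps: List[Step],
                    low: float = 0.30, high: float = 0.70) -> Tuple[str, float]:
    """Run diagnostic steps in order, stopping once risk is below `low` or above `high`.

    Returns the name of the last step performed and the predicted risk after it.
    """
    if not steps:
        raise ValueError("at least one diagnostic step is required")
    for name, model in steps:
        risk = model(findings)
        if risk < low or risk > high:
            break
    return name, risk


# Hypothetical risk models, for illustration only
def history_model(f: Findings) -> float:
    return 0.80 if f.get("suspects_uti") else 0.50


def history_dipstick_model(f: Findings) -> float:
    return 0.90 if f.get("nitrite") else 0.20


steps = [("history", history_model), ("history + dipstick", history_dipstick_model)]
print(stepwise_workup({"suspects_uti": 1, "nitrite": 0}, steps))  # ('history', 0.8)
print(stepwise_workup({"suspects_uti": 0, "nitrite": 1}, steps))  # ('history + dipstick', 0.9)
```

In the first call the dipstick is never consulted, because history alone already places the patient above 70%.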
For the most informative models, the shrunk regression coefficients were used to compose clinical scores.33
Because urinalysis results might be influenced by bladder incubation time,34 we asked participating patients whether they had urinated within the 4 hours before urine collection and explored whether the diagnostic performance of our model was affected by correcting for this possibility.
Analyses were performed in Stata/SE 10.1 (StataCorp LP).
The study procedure was approved by the Medical Research Ethics Committee of the Academic Medical Center in Amsterdam. Participating women received a letter with information about the study and provided written informed consent. For patients younger than 18 years, written parental authorization was obtained.
RESULTS
A total of 205 women were included in the study. Because 92 (45%) had data missing for at least 1 variable, we created 45 imputed data sets.30,35 No urine culture was available for 9 patients; these patients were dropped from the analysis after multiple imputation,36 and the resulting 196 patients were used for the analysis.
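The number of imputed data sets matches a common rule of thumb: create at least as many imputations as the percentage of incomplete cases. Whether the authors applied exactly this rule is an assumption; the sketch below only shows that the arithmetic is consistent with the reported figures. The function names are illustrative.

```python
def percent_incomplete(incomplete_cases: int, total_cases: int) -> float:
    """Percentage of cases with at least one missing value."""
    return 100.0 * incomplete_cases / total_cases


def imputations_rule_of_thumb(incomplete_cases: int, total_cases: int) -> int:
    """Number of imputed data sets, taken as the rounded percentage of incomplete cases."""
    return max(1, round(percent_incomplete(incomplete_cases, total_cases)))


# 92 of 205 women had data missing for at least 1 variable
print(imputations_rule_of_thumb(92, 205))  # 45
```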
The patients’ main history characteristics are displayed in Table 1. Their mean age was 43 years (range = 16 to 89 years); the mean age did not differ between patients with and those without a positive culture. The characteristics of their urinalyses are displayed in Table 2. The prevalence of a positive culture (≥10³ CFU/mL of a uropathogen) was 61% (120/196). Of the 120 positive cultures, 115 had a single uropathogen, and 5 had mixed flora, all of which contained ≥10³ CFU/mL of the primary uropathogen Escherichia coli. Of the 115 single uropathogens, 4 were secondary pathogens (1 had between 10³ and 10⁴ CFU/mL and 3 had ≥10⁴ CFU/mL).
Seven variables were retained after bootstrapped backward elimination: 3 history variables (having at least considerable pain during micturition, having any vaginal irritation, suspecting a UTI); 2 dipstick variables (nitrite positive, blood ≥1+); 1 sediment variable (>20 leukocytes per high-power field [HPF]); and 1 dipslide variable (cystine lactose electrolyte deficient [CLED] medium ≥10⁵ CFU/mL). These 7 variables were used to compose the 6 predefined models.
The 2 interaction variables—the combination of suspecting a UTI and reported number of UTIs in the past year and the combination of suspecting a UTI and reporting at least 1 UTI ever diagnosed by a physician—were not retained (bootstrap inclusion fractions 37.49% and 21.01%, respectively).
Figure 1 shows the discriminative performances of models that were based on analyses for all 196 included patients. After applying the history-only model, 28 patients (14%) had a less than 30% predicted risk and 81 patients (41%) had a greater than 70% predicted risk of UTI. Observed risks were 0.14 and 0.81, respectively. After applying the history and dipstick model, 68 patients (35%) had a less than 30% predicted risk, and 76 patients (39%) had a greater than 70% predicted risk of UTI. Observed risks were 0.21 and 0.95, respectively. Of the 81 patients who had a greater than 70% predicted risk after applying the history-only model, 35 were classified below that threshold by the history + dipstick model because of a negative nitrite test result, 24 of whom had a positive culture (observed risk 0.68). These 24 patients were correctly classified with the history + dipstick + sediment model, the history + dipstick + dipslide model, or the history + dipstick + sediment + dipslide model.
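The percentages of patients assigned to either the low- or the high-risk category follow directly from the counts above, as this short check shows. The function name is illustrative.

```python
N_PATIENTS = 196


def percent_classified(low_count: int, high_count: int, total: int = N_PATIENTS) -> float:
    """Percentage of patients with a predicted risk below 30% or above 70%."""
    return 100.0 * (low_count + high_count) / total


# History only: 28 patients below 30%, 81 above 70%
print(round(percent_classified(28, 81)))  # 56
# History + dipstick: 68 patients below 30%, 76 above 70%
print(round(percent_classified(68, 76)))  # 73
```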
Figure 2 shows the discriminative performances of the model in which a diagnostic step (eg, history + dipstick + sediment or dipslide) was performed only for those patients who remained in the intermediate risk category (between 30% and 70%) after the preceding step. After performance of history and a urine dipstick, 62 patients (32%) had a predicted risk of less than 30%, 100 patients (51%) had a predicted risk of greater than 70%, and 34 patients (17%) remained in the intermediate risk category because of the combination of a negative nitrite test and a positive blood test. Subsequent performance of either a sediment or a dipslide test reclassified 5% or 7% of patients, respectively, from the intermediate category into the high-risk category (greater than 70%) because of a positive test result. All predicted risks were close to the observed risks, indicating good calibration of all models. Models that included history plus dipstick, sediment, and dipslide, individually and in combination, however, calibrated better than the model that tested only patients at intermediate risk, implying a lower rate of false positives and false negatives in the highest and lowest risk category, respectively.
Because we considered the history-only and history + dipstick models to be the most clinically applicable models for all patients, we used their regression coefficients to compose clinical scores. Table 3 displays the odds ratios, coefficients, and clinical scores for the variables in these 2 models, and Table 4 displays their predicted risks within each predefined risk category (less than 30%, 30% to 70%, and greater than 70%). Reporting any vaginal irritation reduced the probability of a UTI, whereas presence of the other 4 history and dipstick indicators increased this probability. Suspecting a UTI and a positive nitrite test were the strongest indicators of a positive culture. In contrast with the history-only model, having at least considerable pain during micturition did not add any value to the history + dipstick model.
We repeated the analysis for predicted risk thresholds of 20% and 80%, which assigned 31% of patients (60 of 196) to either the high- or the low-risk category after history only (compared with 56% [109 of 196] for thresholds of 30% and 70%), without yielding better calibration (proportion of women with UTI in the high-risk category and women without UTI in the low-risk category).
Bladder incubation time was 4 or more hours in 26 patients, of whom 20 had a positive culture. Omitting these patients from the overall analysis did not affect the performance of our models.
DISCUSSION
Asking 3 questions (Does the patient think she has a UTI? Is there at least considerable pain on micturition? Is there vaginal irritation?) may be sufficient to correctly classify more than one-half of women with painful and/or frequent micturition as having a UTI risk of either less than 30% or greater than 70%. Subsequent performance of 2 urine dipstick tests (nitrite and blood) raises this proportion to 73% (Figure 1). This percentage rises to 83% if a urine dipstick is performed only for patients with a UTI risk between 30% and 70% after history (Figure 2). This stepwise approach also avoids the possibility of a false-negative nitrite test in patients with a high UTI risk (greater than 70%) after history. In our sample, the proportion of false-negative dipstick results among women with a high UTI risk after history was 69% (24/35).
More than 10% of patients will not be classified as having a probability of UTI of less than 30% or greater than 70%, even if all available tests are performed (Figure 1), mainly because of a combined negative nitrite test and a positive blood test on urine dipstick investigation. These women may require a urine culture and, in the case of a negative culture and positive blood test, follow-up of their hematuria.37
Suspecting a UTI and a positive nitrite test are the strongest indicators of a positive culture. We found that a combination of suspecting a UTI and reported number of UTIs in the past year or a combination of suspecting a UTI and reporting at least 1 UTI ever diagnosed by a physician did not contain any additional information compared with the cumulative yield of the separate indicators (that is, no statistical interaction). In other words, the likelihood of a UTI in women who think they have one does not depend on their previous experiences with UTIs.
The diagnostic performances of UTI indicators were previously described.16,38,39 In contrast with Bent et al, who did not take into account the mutual dependencies of the diagnostic indicators16 and therefore may have overestimated their predictive values, Little et al and McIsaac et al described multivariable analyses similar to ours.38,39 Nevertheless, there are some methodological differences between these studies and our study. First, they included women in whom family physicians suspected a UTI, based on their personal judgment, whereas we used clearly formulated eligibility criteria to better ensure generalizability. Second, they transported urine samples before cultures were made, whereas we refrigerated our samples until cultures were made to assure reliability of the reference standard. Third, they did not report how missing values were handled, whereas we gave an extensive description of our multiple imputation method. Fourth, like Bent et al, they did not assess diagnostic values of sediment and dipslide investigations, whereas we considered evaluation of these investigations an essential part of our analysis and showed their limited added value empirically. Finally, we investigated the scenario of performing a test only for women with an intermediate UTI risk after the preceding test (that is, women with a predicted risk between 30% and 70%). Although we consider this scenario of substantial clinical relevance, we did not present it as our main result because of small patient numbers and the resulting risk of unstable estimates.
To avoid overfitting by using too many candidate predictors, we could not include all available variables in our analysis; we therefore selected the 22 variables we considered the most relevant based on literature and clinical usefulness (Supplemental Appendix). This number was in accordance with the 5 events per variable rule,28 as we included 120 patients with a positive culture.
Limitations
As with any prediction model, our results may be overoptimistic for the population in which the model was developed. As a result, it may perform less well in a different population. Although we restricted the number of candidate indicators to 22 and performed bootstrapped selection and parameter-wise shrinkage, external validation of the model in a different population is indicated.
The diagnostic indicator of having at least considerable pain during micturition might have a cultural dimension that is due to differences in pain experiences between different subcultures. This possibility should be considered when our model is applied to individual patients in clinical practice.
To compare the diagnostic performances of the different models in a way that is attractive to clinicians, we used classification into predicted risk categories instead of traditional performance measures, such as receiver operating characteristic areas, which may be hard to apply in daily practice.40 Because there are no widely accepted methods to define these risk categories, we performed 2 independent polls of 150 Dutch family physicians and chose to use predicted risk thresholds of less than 30% and greater than 70% based on the results of those polls. Nevertheless, we repeated the analysis for predicted risks of less than 20% and greater than 80%. Although using these more extreme risk thresholds may seem more accurate to either detect or rule out UTI, it did not change the observed risks in either the high- or the low-risk category. Moreover, it resulted in worse discrimination: whereas application of the 30% and 70% thresholds assigned 56% of patients to either the high- or the low-risk category after history only, application of the 20% and 80% thresholds did so for only 31%. Because of the better discriminative properties and the results of the 2 polls, we decided to present the results of the 30%–70% instead of the 20%–80% thresholds.
An uncommon but severe complication of UTIs is pyelonephritis, which may be a reason to treat women at a low probability of UTI. Placebo arms of randomized controlled trials suggest, however, that cystitis seldom progresses to pyelonephritis.41–43 Consistent with this, no patients in our own study population developed pyelonephritis during the week after diagnosis (95% CI, 0%–2%).
An important issue in UTI diagnosis is the cutoff value to be used for defining a culture as being positive. In practice, many clinicians and medical microbiologists still use the traditional cutoff value of 10⁵ CFU/mL as described by Kass in a study on asymptomatic women and women with pyelonephritis.44 For symptomatic women, however, the most recent international guidelines recommend a cutoff value of 10³ CFU/mL.31 Because we think that a reference standard in diagnostic research should be based on scientific evidence rather than on practical arguments, we chose to use a cutoff value of 10³ CFU/mL to define the dependent variable in our analysis; however, we performed a separate analysis using a cutoff value of 10⁵ CFU/mL to define the dependent variable, which yielded the same diagnostic indicators and values, with the exception of vaginal irritation, which was not selected.
According to the previously mentioned guideline,31 the cutoff value of ≥10³ CFU/mL applies only to primary pathogens. For secondary and doubtful pathogens, different cutoff values are recommended (≥10⁴ and ≥10⁵ CFU/mL, respectively). We found only 4 secondary pathogens (1 between 10³ and 10⁴ CFU/mL and 3 at ≥10⁴ CFU/mL) and no doubtful pathogens. As a result, choosing a different cutoff value for these pathogens would not affect our findings.
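A confidence interval of this kind for zero observed events can be reproduced with the exact binomial (Clopper-Pearson) upper limit. The denominator of 196 analyzed patients is an assumption made here for illustration, since the text does not state how many patients were followed up, and the interval method is likewise assumed.

```python
def zero_event_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Upper limit of the exact two-sided binomial interval when 0 of n patients have the event."""
    if n <= 0:
        raise ValueError("n must be positive")
    alpha = 1.0 - confidence
    # With zero events the upper limit p solves (1 - p) ** n = alpha / 2
    return 1.0 - (alpha / 2.0) ** (1.0 / n)


# Assumed denominator: the 196 patients included in the analysis
print(f"{100 * zero_event_upper_bound(196):.1f}%")  # 1.9%
```

The result rounds to the 2% upper limit quoted in the text.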
A factor that might influence urinalysis results is bladder incubation time, which should be preferably 4 or more hours.34 Because we did not consider incubation time to be a potential diagnostic indicator and because we chose to stay close to common practice (in which bladder incubation is generally hard to determine), we did not include it as a variable in our analysis. We did explore, however, whether the diagnostic performance of our model was affected by correcting for bladder incubation time, which was not the case.
Our findings imply that UTI diagnosis may be simplified by considerably reducing the number of questions and urine investigations needed. In more than one-half of women complaining of painful and/or frequent micturition, the diagnostic procedure may be limited to asking 3 simple questions (Does the patient think she has a UTI? Is there at least considerable pain on micturition? Is there vaginal irritation?), meaning that it might be completed by telephone in these cases. Additional performance of a nitrite and blood dipstick seems sufficient to make an accurate diagnosis for most patients. Urinary sediment and dipslide appear to add little information to what is already known from history and dipstick results, implying that performance of these expensive, time-consuming tests might be abandoned.
Acknowledgments
We thank Lucas Bachmann (Horten Centre, University of Zurich), Patrick Bindels (Erasmus MC Rotterdam, Department of General Practice), Thierry Christiaens (Ghent University, Department of General Practice and Primary Health Care), Bart van Pinxteren (Dutch College of General Practitioners), Ellen Stobberingh (University Hospital Maastricht, Department of Medical Microbiology) and Koos Zwinderman (Academic Medical Center, University of Amsterdam, Department of Clinical Epidemiology and Biostatistics) for their advice on the study design.
We also thank the following family physicians and health centers for their participation in patient recruitment: Mr RH Dijkstra, Gezondheidscentrum Diemen-Noord, Gezondheidscentrum Diemen-Zuid, Gezondheidscentrum Gein, Gezondheidscentrum Holendrecht-Noord, Gezondheidscentrum Holendrecht-Zuid, Gezondheidscentrum Klein Gooioord, Gezondheidscentrum Nellestein, Gezondheidscentrum Reigersbos, Huisartsen Monnickendam, Huisartsenpraktijk Badhoevedorp, Huisartsenpraktijk Bouwman, Huisartsenpraktijk HB Burggraaff, Huisartsenpraktijk De Wagenmaker, Huisartsenpraktijk Loenermark 162–164, Huisartsenpraktijk Oude Turfmarkt, Huisartsenpraktijk Purmerend, Mr HC Völke, Mr N Wieringa, and Mrs M Wieringa-de Waard.
Footnotes
- Conflicts of interest: authors report none.
- Received for publication July 27, 2012.
- Revision received November 6, 2012.
- Accepted for publication December 13, 2012.
- © 2013 Annals of Family Medicine, Inc.