1 Department of Family Medicine, University of Wisconsin, Madison, Wisc
2 School of Medicine, University of Wisconsin, Madison Wisc
3 Provincial Health Services Authority, and Department of Family Practice, University of British Columbia, Vancouver, British Columbia, Canada
CORRESPONDING AUTHOR: Bruce Barrett, MD, PhD, Department of Family Medicine, University of Wisconsin, 777 South Mills, WI 53715, bruce.barrett{at}fammed.wisc.edu
| ABSTRACT |
|---|
METHODS Benefit-harm tradeoff interviews (in-person and telephone) assessed SID in terms of overall severity reduction using evidence-based simple-language scenarios for 4 common cold treatments: vitamin C, the herbal medicine echinacea, zinc lozenges, and the unlicensed antiviral pleconaril.
RESULTS Response patterns to the 4 scenarios in the telephone and in-person samples were not statistically distinguishable and were merged for most analyses. The scenario based on vitamin C led to a mean SID of 25% (95% confidence interval [CI], 0.23–0.27). For the echinacea-based scenario, mean SID was 32% (95% CI, 0.30–0.34). For the zinc-based scenario, mean SID was 47% (95% CI, 0.43–0.51). The scenario based on preliminary antiviral trials provided a mean SID of 57% (95% CI, 0.53–0.61). Multivariate analyses suggested that (1) between-scenario differences were substantive and reproducible in the 2 samples, (2) presence or severity of illness did not predict SID, and (3) SID was not influenced by age, sex, tobacco use, ethnicity, income, or education. Despite consistencies supporting the model and methods, response patterns were diverse, with wide spreads of individual SID values within and among treatment scenarios.
CONCLUSIONS Depending on treatment specifics, people want an on-average 25% to 57% reduction in overall illness severity to justify costs and risks of popular cold treatments. Randomized trial evidence does not support benefits this large. This model and these methods should be further developed for use in other disease entities.
Key Words: Clinical significance; common cold; effect size; important difference; outcomes; quality of life; questionnaires; respiratory tract infections; evidence-based medicine; health policy research; quantitative methods; randomized clinical trial
| INTRODUCTION |
|---|
Randomized trials are powered to detect specific effect sizes. The larger the number of participants, the smaller the effect size that can be detected. A trial that is too small may miss an effect size that might be important, whereas a larger trial might demonstrate an effect that is too small to be clinically significant. Following this rationale, many experts now agree that trials should be powered to detect a minimal important difference.1–3 This conceptual entity was first defined in 1989 to be "the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troubling side effects and excessive cost, a change in the patient's management."4 Although an important addition to the theory and practice of evidence-based medicine, minimal important difference is limited in that it does not account for negatively valued consequences of medical interventions (harms), such as costs, side effects, and risks of adverse effects.
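The relationship between sample size and detectable effect size noted above can be illustrated with a short calculation. The sketch below is not taken from this study; it assumes a two-arm comparison of means, a standardized effect size (difference divided by standard deviation), and the usual normal approximation. The function name `n_per_group` and the default values for `alpha` and `power` are illustrative choices.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per arm to detect a standardized
    difference in means (normal approximation, two-sided test).

    Assumption: equal group sizes and equal variances in both arms.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile corresponding to desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    # Smaller effects require many more participants per arm.
    for d in (0.8, 0.5, 0.2):
        print(f"effect size {d}: about {n_per_group(d)} per group")
```

Under these assumptions, halving the target effect size roughly quadruples the required sample, which is why the choice of the difference a trial is powered to detect matters so much.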
In 2005 we defined "sufficiently important difference" (SID) as "the smallest amount of patient-valued benefit that an intervention would require in order to justify associated costs, risks, and other harms."5 We consider SID to be very similar to the "smallest worthwhile effect" concept described elsewhere.6–9 The advantage of SID is that it extends the notion of minimal important difference, has an operational definition, and can be estimated using benefit-harm tradeoff methods. Using reduction of illness duration in the common cold as the desired benefit (outcome), we previously showed how benefit-harm tradeoffs could serve as a method of SID estimation.10 In this article, we report on a second sample of respondents and assess SID as overall severity reduction benefit.
| METHODS |
|---|
To be eligible for either arm of this study, prospective adult participants had to answer "yes" to the question, "Do you think that you have a cold or are coming down with a cold?" They also had to report at least 1 of 4 cold symptoms (sneezing, runny nose, nasal obstruction, or sore throat), and to have a total Jackson score of at least 2 points. Jackson scores13–15 are simple sums of severity ratings (1 = mild, 2 = moderate, 3 = severe) for 8 symptoms: those noted above plus cough, headache, chilliness, and malaise. For prospective participants with eye or nose itching, sneezing, or history of allergy, the interviewer, as well as the participant, had to be convinced that the participant's symptoms indicated a cold, not an allergic illness. Interviewers were also given permission to exclude potential participants who they suspected were dishonest when reporting symptoms or who they believed would not be competent to follow study protocol. These instances occurred rarely. Interviewers questioned prospective participants by telephone and then again in person using semistructured interview techniques aimed at stimulating recall to enhance accuracy of symptom reporting. All interviewers were trained and supervised by the senior author (B.B.), a family physician and anthropologist who had substantial experience with interview methods and patients with acute respiratory tract infection. The protocol was approved by the Institutional Review Board of the University of Wisconsin School of Medicine and Public Health.
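The symptom-based part of these eligibility criteria can be expressed as a minimal sketch. The function and variable names are hypothetical, and coding an absent symptom as 0 is an assumption (the text gives only the 1 to 3 ratings). The interviewer's judgment about allergy, honesty, and competence is not modeled.

```python
from typing import Mapping

# The 8 Jackson symptoms; the first 4 are the qualifying cold symptoms.
SYMPTOMS = (
    "sneezing", "runny nose", "nasal obstruction", "sore throat",
    "cough", "headache", "chilliness", "malaise",
)
QUALIFYING = SYMPTOMS[:4]


def jackson_score(ratings: Mapping[str, int]) -> int:
    """Sum severity ratings (1 = mild, 2 = moderate, 3 = severe).

    Assumption: a symptom that is missing or rated 0 is absent.
    """
    total = 0
    for symptom in SYMPTOMS:
        rating = ratings.get(symptom, 0)
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"invalid rating for {symptom}: {rating}")
        total += rating
    return total


def meets_symptom_criteria(thinks_has_cold: bool, ratings: Mapping[str, int]) -> bool:
    """Self-reported cold, at least 1 qualifying symptom, Jackson score >= 2."""
    has_qualifying = any(ratings.get(s, 0) > 0 for s in QUALIFYING)
    return thinks_has_cold and has_qualifying and jackson_score(ratings) >= 2
```

For example, a respondent reporting a mild sore throat and a mild cough would score 2 and pass this check, whereas a severe cough alone would not, because cough is not one of the 4 qualifying symptoms.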
Interviews began with the following statement:
We are trying to understand what people think about when making decisions about treating their colds. Specifically, we're interested in how much benefit you would want to expect in exchange for the costs and possible side effects of a given treatment. Benefit can come in the form of reduced severity and/or decreased duration of illness. I would like to describe 4 different cold treatments, then have you tell me whether or not you would want to take these medicines, and why or why not.
Next, the participant was presented with 1 of the following scenarios:
A 10-cent vitamin pill must be taken 3 times daily for the first 3 days of your cold. There are no significant risks or side effects to this treatment. It is unlikely that the length of your cold would be reduced significantly. Severity of symptoms might be reduced by as much as 30%.
A 20-cent lozenge must be dissolved in the mouth every 2 to 3 hours while awake for the first 3 days of your cold. Side effects may include bad taste, and, very occasionally, nausea. It is possible that the length of the cold could be reduced slightly. Severity of symptoms might be reduced by as much as 30%.
A 50-cent dropperful of an herbal extract must be taken 3 times each day for the first 3 days of your cold. Side effects are limited to bad taste. It is possible that the length of the cold could be reduced slightly. Severity of symptoms might be reduced by as much as 30%.
A $2 prescription-only pill must be taken 3 times daily for the first 3 days of the cold. Side effects are unknown. Preliminary data suggests an average 24-hour reduction in the length of your cold. Severity of symptoms might be reduced by as much as 30%.
The scenarios were presented in varied order, so that each scenario had an approximately equal chance of being considered first, second, third, or last. After each scenario was presented, participants were asked, "Would you take this treatment?" and then, "Why?" or "Why not?" Brief notes were taken regarding the answers to these qualitative questions. Next, participants who had answered "yes" to the original question were asked: "Would you take this [treatment] if it were able to reduce severity by 20%?" If the answer was still "yes," the hypothetical severity reduction was lowered to "10%," then if still "yes," it was lowered to "5%," and, finally, "any?" If the original answer was "no," severity reduction benefit was increased to "40%," then if still "no," it was increased to "50%," then "75%." Severity reduction SID was defined as the smallest severity reduction that justified the treatment scenario for that participant.
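The stepwise questioning described above can be sketched as a small procedure. This is an illustration of the sequence as reported, not the interviewers' actual instrument. The names `elicit_sid` and `threshold_respondent` are hypothetical, and a respondent is modeled as a function that answers yes or no to each offered severity reduction.

```python
from typing import Callable, Union

Level = Union[int, str]  # a percentage, or the label "any" / "never"

START = 30                       # every scenario opens at a 30% reduction
DESCENDING = (20, 10, 5, "any")  # asked in turn while the answer stays "yes"
ASCENDING = (40, 50, 75)         # asked in turn while the answer stays "no"


def elicit_sid(accepts: Callable[[Level], bool]) -> Level:
    """Return the smallest severity reduction the respondent accepts."""
    if accepts(START):
        smallest = START
        for level in DESCENDING:
            if not accepts(level):
                break
            smallest = level
        return smallest
    for level in ASCENDING:
        if accepts(level):
            return level
    return "never"  # declined even at a 75% reduction


def threshold_respondent(minimum: int) -> Callable[[Level], bool]:
    """Hypothetical respondent who accepts any offer of at least `minimum` percent."""
    return lambda level: level != "any" and level >= minimum
```

A simulated respondent with a 10% threshold yields an SID of 10, and one with a 45% threshold yields 50, because the procedure can only resolve SID to the levels actually offered. That is the reason the resulting data are ordinal even if the underlying preference is continuous.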
The scenarios represent our interpretation of best evidence available when the study began. Although potential severity reduction benefits were varied beyond what many experts would consider reasonable (ie, a 75% overall severity reduction is unlikely), the initial scenarios were designed to be realistic. Toward this end, we reviewed trial reports and systematic reviews16–19 and aimed for a brief scenario that was both easy to understand and evidence-based.
After benefit-harm trade-off interviews were completed, we gathered descriptive data, including age, sex, tobacco use, ethnicity, income, and educational achievement. Responses were scored on paper forms by the interviewer for telephone interviews, and directly on paper by the participant for the in-person interviews. Data were entered twice, then cross-checked, with discrepancies resolved by comparison with paper sheets. Statistical analysis began with tabular and graphical inspection, followed by assessment of missing data and outliers. We then proceeded to correlation matrices, analysis of variance, and multivariate regression models. We assessed potential relationships of demographic variables (age, sex, smoking, ethnicity, education, income) and Jackson severity scores with the primary SID severity reduction outcome variable. Whereas data representing SID were ordinal in nature, the underlying SID domain was considered to be continuous. We did not assume that SID distributions would be normal and used nonparametric as well as parametric methods in our analysis strategy. Multivariate models were developed using PROC MIXED in SAS 9.1.3 (SAS Institute, Cary, NC, 2001). These models assessed within-person effects assuming a compound symmetry variance matrix.
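The double-entry check mentioned above can be illustrated with a minimal sketch. The data layout (a mapping from participant identifier to a record of field values) and the name `find_discrepancies` are assumptions for illustration; the study's actual data-entry software is not described.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def find_discrepancies(
    entry_a: Dict[int, Record], entry_b: Dict[int, Record]
) -> List[Tuple[int, str, Any, Any]]:
    """List every (participant_id, field, value_a, value_b) where two
    independent data entries disagree, including values missing from one."""
    mismatches = []
    for pid in sorted(set(entry_a) | set(entry_b)):
        rec_a = entry_a.get(pid, {})
        rec_b = entry_b.get(pid, {})
        for field in sorted(set(rec_a) | set(rec_b)):
            value_a, value_b = rec_a.get(field), rec_b.get(field)
            if value_a != value_b:
                mismatches.append((pid, field, value_a, value_b))
    return mismatches
```

Each flagged item would then be resolved by hand against the original paper sheet, as described in the text.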
| RESULTS |
|---|
Of the 253 participants enrolled in this study, 162 completed a single benefit-harm trade-off interview by telephone. The remaining 91 were enrolled in person within 48 hours of first cold symptom, were followed up with daily reports until their cold had resolved, and then were interviewed again in person using the same questions they were asked at intake. Thus, the database for this report includes data from 162 telephone and 182 in-person interviews, representing SID values for the 253 participants.
Table 1 shows that our sampled population was reasonably diverse in terms of income and education, but mostly female (67.8%) and mostly white (68.4%). Jackson scores were similar for telephone interviews (mean 9.6; SD 3.7) and the intake interviews (mean 9.7; SD 3.6). Demographic measures were collected for all prospective participants, but for only 117 of 162 participants responding by telephone because of interviewer error during the first few weeks of the study. To calculate indicators of central tendency and variability for the SID variable, "any" and "never" responses were conservatively assigned values of 5% and 88%, respectively.
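The numeric coding of "any" and "never" responses can be shown with a brief sketch. The assigned values of 5% and 88% come from the text. The normal-approximation confidence interval is an assumption, since the method used for the reported intervals is not stated. The function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple, Union

Response = Union[int, str]

# Values assigned to the two open-ended responses, as stated in the text.
ASSIGNED = {"any": 5, "never": 88}


def to_percent(response: Response) -> float:
    """Map an SID response to a numeric percentage."""
    return float(ASSIGNED.get(response, response))


def mean_with_ci(
    responses: Sequence[Response], level: float = 0.95
) -> Tuple[float, float, float]:
    """Mean SID with a normal-approximation confidence interval (assumed method)."""
    values = [to_percent(r) for r in responses]
    m = mean(values)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(values) / math.sqrt(len(values))
    return m, m - half_width, m + half_width
```

With an invented set of five responses, `["any", 10, 20, 30, "never"]`, the coded values are 5, 10, 20, 30, and 88, giving a mean of 30.6%.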
Scatter plots, correlation matrices, and simple linear regressions were used to look for potential relationships between SID and the covariates of age, sex, ethnicity, education, household income, Jackson severity score, and nature of interview (telephone, intake, exit). Next, multivariate regression equations were constructed to account for potential covariate influence on SID values. Few statistically significant associations were found. Considering multiple comparisons, those associations that were found could be due to chance (Table 3). In both bivariate and full multivariate models, sex was significantly associated with SID for lozenges (coefficient = 0.09; P < .05), but not for other treatments. Higher education was significantly associated with lower SID for vitamins in bivariate analysis (fixed-effect F score = 2.98; P < .05) and in the multivariate model (fixed-effect F score = 2.57; P < .05), but there was no clear dose-response relationship. Education was not significant in other estimates of SID. No other association reached statistical significance.
| DISCUSSION |
|---|
In both studies, heterogeneity characterized both between-person and between-scenario distributions. For example, although many participants would accept a treatment for small benefit (10% or less), many others required larger benefits (50% or greater). Similar between-person differences were seen in the first study, with many saying they would take treatments for small duration reductions (12 hours or less for a 6-day cold), but many others indicating the need for larger benefits (36 hours or more).10 Despite this heterogeneity, specific treatment scenarios yielded distinctive response distributions. In the current study, for example, mean severity-reduction SID for the prescription antiviral drug scenario (57%) was more than twice that resulting from the vitamin C scenario (25%). This between-scenario difference was even more pronounced in the first study, where the mean SID reduction in length of a 6-day cold was 83 hours (57%) for the prescription pill scenario, but just 26 hours (18%) for the vitamin.10
Qualitative responses in the 2 studies were also similar and helped make sense of response patterns. Participant responses suggested that between-scenario differences were partially due to negatively valued costs and risks. For the prescription pill scenario, responses were influenced by the implication that potential side effects were not well known. Additionally, the need to see a physician was reported as a barrier by some participants. Regarding the lozenge scenario, several participant responses suggested that the use of the word "nausea" negatively influenced responses. Overall, we interpret these findings to support the SID concept and the benefit-harm trade-off method. Effect size in and of itself is not sufficient to justify treatment. Instead, expected benefits should be interpreted within the specific context of associated costs and risks.
Another consistent finding, and one more pronounced in the first study, was the trimodal nature of the SID data distribution. Some people will take a treatment regardless of the benefit under consideration. Some will not, even when hypothetical benefits are large. The majority, however, require a certain amount of benefit to justify costs and risks, which makes sense and fits with both clinical experience and psychological theory. People are (somewhat) rational, and make choices based on perceived likelihood and magnitude of both positively and negatively valued outcomes, but since different people value health-related domains differently, there is diversity in SID magnitudes across populations.
When benefits and harms are made explicit and portrayed in simple language, population distributions of SID are characteristic, reproducible, and largely unaffected by age, sex, ethnicity, income, education, and severity of illness at time of interview. By including 2 sets of participants (in-person and telephone) with varying degrees of illness severity, we were able to show that SID value judgments are influenced by positive and negative aspects of treatments, but not by demographic characteristics, and, in this study, not by severity of illness at time of interview. We must caution that this finding may not generalize to other illness conditions. Indeed, we suspect that preference patterns may be more stable for common cold than for other disease entities, as most people have experienced numerous colds and thus have had a chance to solidify their expectations, values, and preferences regarding cold treatments.
It is relatively clear that existing cold treatments do not provide the SID desired by our participants. Although space does not permit a review of available evidence,20–22 the existence of any benefit is controversial for most, if not all, cold treatments. Even the more optimistic interpretations of existing evidence fall short of supporting overall severity reductions in the 25% to 57% range indicated as necessary by our participants.
This conclusion, however, may be trivial compared with the implications should similar results be found for other medical treatments, where effect sizes are usually quite modest. For example, best evidence suggests that a person with mild or moderate depression might expect a 1- or 2-point reduction on the Hamilton Depression Rating Scale, where 20 points is considered indicative of depression.23–25 For Alzheimer-type dementia, where 100 points may be expected on the Alzheimer Disease Assessment Scale cognitive function subscale, a 1- to 3-point improvement may be attributable to cholinesterase inhibitors.26–28 For both classes of medication, monetary costs are substantial and side effects are common. If patients (or their families or caregivers) were provided simple descriptions of likely benefits and harms, how many would choose these treatments? Of the millions currently taking these medicines, how many have been provided an accurate description of expected benefits, costs, and risk of harm?
In our opinion, there have been too few investigations into health values in general and into the nature of clinical significance in particular, which is unfortunate, as medical decision making, as well as trial design and interpretation, is inextricably linked to these conceptual entities. The introduction and development of minimal important difference were milestones, as benefits reaching the minimal important difference threshold must not only be of statistical significance but must also be recognized and valued by patients. The concept of a SID raises the bar another notch, as effect size meeting a SID must also be sufficient to justify costs and risks. Finally, we would like to note that the benefit-harm trade-off methods portrayed here and previously5,10 are only one method of estimating SID. Others will surely be invented, as SID (smallest worthwhile effect6–9) serves as both the effect size for which trials should be powered and as the benchmark by which they should be judged.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Funding support: This work was made possible by a career development grant from the Robert Wood Johnson Foundation Generalist Physician Faculty Scholars Program.
Received for publication September 19, 2006. Revision received December 19, 2006. Accepted for publication December 26, 2006.
| REFERENCES |
|---|