Abstract
PURPOSE The purpose of this study was to investigate whether antidepressants are more effective than placebo in the primary care setting, and whether there are differences between substance classes regarding efficacy and acceptability.
METHODS We conducted literature searches in MEDLINE, Embase, Cochrane Central Register of Controlled Trials (CENTRAL), and PsycINFO up to December 2013. Randomized trials in depressed adults treated by primary care physicians were included in the review. We performed both conventional pairwise meta-analysis and network meta-analysis combining direct and indirect evidence. Main outcome measures were response and study discontinuation due to adverse effects.
RESULTS A total of 66 studies with 15,161 patients met the inclusion criteria. In network meta-analysis, tricyclic and tetracyclic antidepressants (TCAs), selective serotonin reuptake inhibitors (SSRIs), a serotonin-noradrenaline reuptake inhibitor (SNRI; venlafaxine), a low-dose serotonin antagonist and reuptake inhibitor (SARI; trazodone) and hypericum extracts were found to be significantly superior to placebo, with estimated odds ratios between 1.69 and 2.03. There were no statistically significant differences between these drug classes. Reversible inhibitors of monoamine oxidase A (rMAO-As) and hypericum extracts were associated with significantly fewer dropouts because of adverse effects compared with TCAs, SSRIs, the SNRI, a noradrenaline reuptake inhibitor (NRI), and noradrenergic and specific serotonergic antidepressant agents (NaSSAs).
CONCLUSIONS Compared with other drugs, TCAs and SSRIs have the most solid evidence base for being effective in the primary care setting, but the effect size compared with placebo is relatively small. Further agents (hypericum, rMAO-As, SNRI, NRI, NaSSAs, SARI) showed some positive results, but limitations of the currently available evidence make a clear recommendation on their place in clinical practice difficult.
INTRODUCTION
Epidemiological studies indicate that depressive disorders are highly prevalent in the general population worldwide.1 Most cases are seen and managed in primary care, and only a small proportion of these are referred to specialized care.2 Most research findings upon which treatment decisions are made, however, have involved patients cared for by mental health specialists.3 It is not fully clear whether the findings from trials in specialty settings can be generalized to primary care. There is some evidence suggesting that primary care patients with depressive disorders are less severely depressed,4 experience a milder course of illness,5 have a distinct symptom profile with more complaints of fatigue and somatic symptoms,6 and are more likely to have accompanying physical complaints7 than are patients referred to specialty mental health care. These differences could have an impact on guideline development and management of depression in primary care.
Pharmacological interventions are a cornerstone of antidepressant treatment,8 yet there is an ongoing debate as to whether their relatively small effects compared with placebo observed in clinical trials are clinically relevant.9,10 Meta-analyses restricted to primary care patients have been performed for some antidepressant drugs.11–14 Researchers conclude that these treatments are effective in primary care settings. It is not possible, however, to determine whether the available treatment options are comparable (ie, whether some treatments are superior to others in primary care). Traditional meta-analyses are restricted to the direct comparison of 2 interventions by pooling data only from trials with similar treatment arms. Network meta-analysis allows for the estimation of relative effects of interventions that have not been compared directly.15 We systematically reviewed randomized trials of pharmacological treatments of depression in primary care settings. We used conventional and network meta-analysis to investigate whether there is evidence that in the primary care setting antidepressants are more effective than placebo and whether there are differences in efficacy and acceptability between the various substance classes.
METHODS
Details of the methods have been described in our published protocol.16 We also reviewed trials on psychological interventions. Because trials of pharmacological and psychological interventions differ greatly regarding recruitment strategies, patients, control interventions and outcomes, these trials are analyzed separately and reported in a companion article published in this issue.17
Search Strategy and Study Selection
We searched MEDLINE, Embase, Cochrane Central Register of Controlled Trials (CENTRAL), and PsycINFO (main search June 2011, last update searches December 2013; see Supplemental Appendix, section 1, for the complete MEDLINE search strategy). We searched trial registries for unpublished and ongoing studies. In addition, we screened references from identified trials and published systematic reviews focusing on primary care studies of depression treatments11–14 for additional trials.
We included randomized controlled trials that compared drugs belonging to different pharmacological classes with one another or placebo in the treatment of adult patients having prevalent or incident unipolar depressive disorder. Patients had to be recruited from a primary care setting consisting of family physicians’ or general practitioners’ private practices, primary care clinics or networks, internists, or other nonpsychiatrists providing primary care in their respective countries. We excluded trials that recruited patients from community-based centers specializing in mental health care. Trials had to report results of at least 1 of the following outcomes: response to treatment, remission, mean score on a depression scale (posttreatment or change from baseline), frequency of adverse effects, or study discontinuation (for any reason or from adverse effects).
Four authors (K.L., K.S., S.J., and K.M.) reviewed all trials for screening, selection, and extraction. In the first screening, 1 reviewer excluded clearly irrelevant records. In the second screening, 2 reviewers independently checked all remaining records against inclusion criteria. The full texts of articles were obtained for all records that were considered potentially relevant or unclear and were assessed formally for eligibility by at least 2 reviewers independently. Disagreements were resolved by discussion.
Data Extraction and Assessment of Risk of Bias
In the first extraction step, at least 2 reviewers independently extracted information on patients, methods, and results of all included studies using a pretested form. The Cochrane Collaboration’s tool for assessing risk of bias was used to assess internal validity.18 As the included studies reported results on efficacy in a highly diverse and often incomplete manner, we performed an additional extraction round using a preference approach for extracting data for response and remission (Supplemental Appendix, sTable 2). This additional extraction was done by 1 reviewer (K.L.), whereas a second reviewer (K.S. or K.M.) crosschecked all extracted data and recalculated imputations. Adequacy of dosages tested was checked against guideline recommendations.19
Comparisons and Outcomes
As prespecified in the study protocol, we analyzed drugs according to their substance class.16 Efficacy end points were response (primary outcome defined as at least a 50% score reduction on a depression scale) and remission (secondary outcome defined as having a symptom score below a fixed threshold). Patients with missing data were considered nonresponders or nonremitters. In cases where responder and remission data were not reported, we imputed them from available score data.20 Acceptability outcomes were discontinuation (dropout) because of adverse effects (primary acceptability outcome), discontinuation for any reason, and the number of patients experiencing adverse effects. For our main analysis, we used data after completion of treatment. In the rare case in which treatment duration was longer than 3 months, the measurement point closest to 3 months was used. For the analysis of long-term effects, we used the measurement closest to 6 months after randomization.
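The response definition and the handling of missing data described above can be illustrated with a minimal Python sketch. The function names and the patient scores are hypothetical and serve only as an illustration; they are not taken from the extraction forms used in the review.

```python
from typing import Optional, Sequence, Tuple

def is_responder(baseline: float, followup: Optional[float],
                 threshold: float = 0.5) -> bool:
    """True if the depression score fell by at least `threshold` (50%).

    A missing follow-up score (None) counts as nonresponse,
    mirroring the rule stated in the text.
    """
    if followup is None or baseline <= 0:
        return False
    return (baseline - followup) / baseline >= threshold

def response_count(patients: Sequence[Tuple[float, Optional[float]]]) -> Tuple[int, int]:
    """Return (number of responders, number randomized)."""
    return sum(is_responder(b, f) for b, f in patients), len(patients)

# Hypothetical (baseline, follow-up) scores; None marks a dropout.
patients = [(24, 10), (20, 12), (22, None), (18, 9)]
print(response_count(patients))  # 2 responders out of 4 randomized
```

Keeping dropouts in the denominator, as in this sketch, gives a conservative estimate of the response rate in each arm.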
Statistical Analyses
We used the odds ratio as the effect measure. Conventional meta-analyses of pairwise direct comparisons within studies were performed using the inverse variance weighted random effects model option in the Cochrane Informatics and Management Department RevMan 5.2 software. For network meta-analyses, a Bayesian framework (WinBUGS and R interface R2WinBUGS [http://www.r-project.org/]) following the recommendations of the UK Decision Support Unit of the National Institute for Health and Clinical Excellence (NICE) was used to combine direct and indirect evidence.21 To examine for inconsistency, we compared the results of the consistency model with those of an inconsistency model using the deviance information criterion.22 The potential impact of 4 prespecified (risk of bias, diagnostic subtype of depression, mean age of participants, and duration of treatment) and 2 post hoc defined covariates (dosage and sample size) was analyzed using a meta-regression model.23 Sensitivity analyses were performed excluding outlier studies. Funnel plots were produced for all direct comparisons with data from at least 5 trials.
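As an illustration of inverse variance weighted random-effects pooling of odds ratios, the following Python sketch assumes the DerSimonian-Laird estimator of between-study variance, which is commonly used for this type of model. It is a simplified stand-in, not the RevMan implementation, and the trial counts are hypothetical.

```python
import math

def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Return (log OR, variance) for one trial's 2x2 table."""
    a, b = events_t, n_t - events_t      # treatment: responders, nonresponders
    c, d = events_c, n_c - events_c      # control: responders, nonresponders
    if 0 in (a, b, c, d):                # continuity correction for empty cells
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def pool_random_effects(trials):
    """DerSimonian-Laird pooling of (events_t, n_t, events_c, n_c) tuples."""
    effects, variances = zip(*(log_odds_ratio(*t) for t in trials))
    k = len(effects)
    w = [1 / v for v in variances]                               # fixed-effect weights
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0         # between-study variance
    w_re = [1 / (v + tau2) for v in variances]                   # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = 100 * max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0     # I-squared in percent
    return {"or": math.exp(pooled),
            "ci_low": math.exp(pooled - 1.96 * se),
            "ci_high": math.exp(pooled + 1.96 * se),
            "tau2": tau2, "i2": i2}

# Hypothetical trials: (responders drug, n drug, responders placebo, n placebo)
trials = [(45, 100, 30, 100), (60, 150, 40, 150), (25, 80, 24, 80)]
print(pool_random_effects(trials))
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights and both models give the same pooled estimate.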
RESULTS
Study Selection and Characteristics of Included Studies
A total of 66 studies with 15,161 patients met the inclusion criteria (Figure 1; see also the Supplemental Appendix, Section 2 for references and Section 3 for characteristics of individual studies). Four trials were available as unpublished reports only (3 from a drug company’s trial registry and 1 thesis). The 66 trials included 147 treatment groups. After the pooling of groups in which different dosages of the same agent or 2 agents of the same substance class had been tested, 140 treatment arms formed the basis of our analyses. In 37 treatment arms patients had received a tricyclic or tetracyclic antidepressive agent (TCA), in 37 a selective serotonin reuptake inhibitor (SSRI), in 6 a serotonin-noradrenaline reuptake inhibitor (SNRI), in 1 a noradrenaline reuptake inhibitor (NRI), in 5 a serotonin (5-HT2) antagonist and reuptake inhibitor (SARI), in 10 a noradrenergic and specific serotonergic antidepressive agent (NaSSA), in 6 a reversible inhibitor of monoamine oxidase A (rMAO-A), in 14 a hypericum extract (St. John’s wort) (Table 1, which also displays single agents tested) and in 24 a placebo. Initial dosages were always within the range of recommendations19 for starting treatment. In 11 treatment arms (from 10 trials), however, dosages remained below recommended standard dosage for all or most patients: in all 5 trazodone arms (150 mg in 3 and 100 to 200 mg in 2 trials), 2 amitriptyline arms (75 mg and 50 to 100 mg, respectively), 2 mianserin arms (40 mg and 30 to 60 mg, respectively), 1 clomipramine (40 mg), and 1 imipramine arm (mean dose 58 mg). In many trials daily dosages were in the lower range of recommendations for standard dosages. Thirty-eight (58%) studies were restricted to patients with major depression only, whereas 28 (42%) studies also included patients who had other depressive disorders or did not provide details on the exact type of depressive disorders involved (Table 2).
Risk of Bias
Seventeen trials (26%) reported the method of generating the allocation sequence, and 14 (21%) reported an adequate method of allocation concealment (Supplemental Appendix, sTable 3 for the assessment of individual trials). All remaining trials provided no information on this issue. In 57 (86%) double-blind trials without clear indications of unblinding, we considered the risk of bias to be low; in 9 (14%) trials either no blinding or unblinding seemed likely. Bias resulting from attrition seemed low in 21 (32%) studies and bias resulting from the reporting of outcomes was found in 49 (74%) studies. Overall risk of bias was considered to be low in 11 (17%), unclear in 23 (35%), and high in 32 (48%) studies.
Efficacy
Fifty-nine (89%) studies provided sufficient data to be included in the analysis of the main outcome measure response. For 17 of the 36 possible comparisons (47%) there was at least 1 head-to-head trial (Figure 2 for the network and Table 3 for pooled estimates; Section 4 of the Supplemental Appendix provides a forest plot with odds ratios of all individual trials). More than 3 comparative trials were available for TCAs vs SSRIs (n = 18), hypericum extracts vs placebo (n = 9), TCAs vs placebo (n = 8), SSRIs vs placebo (n = 7) and SSRIs vs hypericum extracts (n = 6). In 6 of the 13 direct comparisons with 2 or more trials, there was statistical heterogeneity (I² >40% and/or P <.1 in the χ² test). TCAs, SSRIs, and hypericum extracts were significantly superior to placebo, but SNRI and NaSSAs were not. There were no trials comparing NRI, SARI, and rMAO-As with placebo. Funnel plots of comparisons with placebo were difficult to interpret because of the small numbers of studies per substance class (Supplemental Appendix, sFigures 3–7 for funnel plots). Visual inspection suggested asymmetry for trials comparing hypericum extract and placebo. The only significant differences between substance classes indicated superiority of TCAs compared with NaSSAs and rMAO-As, and of SARI compared with NaSSAs.
In the network meta-analysis, TCAs, SSRIs, SNRI, SARI (low-dose trazodone) and hypericum extracts were found to be significantly superior to placebo, but effects were relatively small, with estimated odds ratios between 1.69 and 2.03 (Table 3). There were no significant differences between these drug classes, but 95% credible intervals were wide except for the comparison between TCAs and SSRIs. NRI, NaSSAs, and rMAO-As were not significantly different from placebo. TCAs, SSRIs, and hypericum extracts were significantly superior to NaSSAs and rMAO-As. Hypericum extracts were also more effective than NRI. SNRI and low-dose SARI were superior to NaSSAs. We found no evidence of inconsistency between direct and indirect comparisons.
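How indirect evidence can inform a comparison between 2 classes that were each tested against placebo can be illustrated with an adjusted indirect comparison (often called the Bucher method). This Python sketch is a simple frequentist illustration, not the Bayesian network model used in this review; the function name and all input values are hypothetical.

```python
import math

def indirect_odds_ratio(or_a_vs_c, se_log_a_vs_c, or_b_vs_c, se_log_b_vs_c):
    """Indirect estimate of A vs B through a common comparator C.

    On the log scale the indirect effect is the difference of the two
    direct effects, and the variances of the two estimates add up.
    Returns (odds ratio, lower 95% limit, upper 95% limit).
    """
    log_or = math.log(or_a_vs_c) - math.log(or_b_vs_c)
    se = math.sqrt(se_log_a_vs_c ** 2 + se_log_b_vs_c ** 2)
    return (math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se))

# Hypothetical inputs: class A vs placebo OR 2.0, class B vs placebo OR 1.6,
# each with a standard error of 0.2 on the log odds ratio scale.
estimate, low, high = indirect_odds_ratio(2.0, 0.2, 1.6, 0.2)
print(estimate, low, high)
```

Because the variances add, an indirect estimate is less precise than either of the direct estimates it is built from, which is one reason intervals for sparsely connected comparisons tend to be wide.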
Fixed-effects analyses yielded results very similar to random-effects analyses, thus providing no evidence for heterogeneity. Meta-regression analyses did not show a significant influence of the type of depression (major depression or not), risk of bias, restriction to elderly patients, timing of the outcome measurement, underdosing, and sample size on treatment outcome. Correspondingly, model estimates for patients with major depression, for studies with adequate drug dosages, and for large trials were very similar to the unadjusted main estimates (Supplemental Appendix, Section 6). The exclusion of outlier studies did not have any relevant impact on the findings.
In secondary efficacy analyses using remission as the outcome, all drug classes were superior to placebo without significant differences between drug classes (Supplemental Appendix, Sections 6 and 7). Only 9 trials reported long-term outcomes (>12 weeks), and of these, only 1 included a placebo control group. Point estimates were similar (between 1.55 and 1.78) for the drugs that could be included in the network (SSRI, TCA, SNRI, NaSSA, and hypericum) but confidence intervals were very wide, and differences compared with placebo were not statistically significant.
Acceptability
Fifty-eight studies (88%) reported the number of patients dropping out because of adverse effects. The data for 5 studies could not be used in meta-analyses, however, because there were no dropouts attributable to side effects in any treatment group. For 16 of the 36 possible comparisons (44%) there was at least 1 head-to-head trial. In 4 of the 13 direct comparisons with 2 or more trials, there was an indication of statistical heterogeneity (Table 4). Because of the often very wide confidence intervals (caused by the small number of trials and low event rates), findings of conventional meta-analyses of head-to-head comparisons have to be interpreted with great caution (Supplemental Appendix, Section 3 for individual study findings). TCAs were associated with significantly more dropouts resulting from adverse effects compared with rMAO-As, the NRI was associated with significantly more than SSRIs, and NaSSAs were associated with significantly more than the low-dose SARI. In network meta-analysis combining direct and indirect evidence (Table 4), TCAs, SSRIs, SNRI, NRI, and NaSSAs were associated with significantly more study discontinuations resulting from adverse effects than was placebo. Attrition was not significantly different from placebo with (mostly low-dose) SARI, rMAO-As, and hypericum extracts. rMAO-As and hypericum extracts were associated with significantly fewer dropouts because of adverse effects compared with TCAs, SSRIs, SNRI, NRI, and NaSSAs. (For results on dropouts for any reasons and patients with adverse effects in comparison with placebo, see the Supplemental Appendix, Section 7).
DISCUSSION
This systematic review shows that a considerable number of randomized trials have investigated the short-term (up to 12 weeks) efficacy and acceptability of pharmacological treatments for depression in primary care. Yet, although we compared groups of similar interventions instead of single specific interventions, the number of trials in some substance groups was low. TCAs, SSRIs, and hypericum extracts were investigated more often in the primary care setting than other drug classes. Our primary efficacy analyses suggest that TCAs, SSRIs, SNRI, low-dose SARI, and hypericum extracts are effective for the treatment of acute depression, but effects when compared with placebo were modest in size. With 40% of patients responding to placebo, an odds ratio of 1.69 (as found for SSRIs) would mean that 53% of patients receiving an antidepressant respond. The absolute difference of 13% more than placebo corresponds to a number needed to treat between 7 and 8 patients. In secondary analyses using remission as the outcome, NRI, NaSSAs, and rMAO-As were also found to be effective. rMAO-As, hypericum extracts, and low-dose SARI tended to be associated with more favorable results in the acceptability analyses. The quality of most of the trials included in our review was mediocre or weak. Because there was a small number of studies with observation periods of longer than 12 weeks, reliable comparative analysis of long-term effects was not possible.
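The conversion from an odds ratio to an absolute response rate and a number needed to treat in the paragraph above can be reproduced with a short Python sketch. The function names are illustrative; the inputs (40% placebo response, odds ratio 1.69) are those stated in the text.

```python
def risk_from_odds_ratio(control_risk: float, odds_ratio: float) -> float:
    """Response probability under treatment, given the control-group
    response probability and the odds ratio versus control."""
    control_odds = control_risk / (1 - control_risk)
    treated_odds = control_odds * odds_ratio
    return treated_odds / (1 + treated_odds)

def number_needed_to_treat(control_risk: float, odds_ratio: float) -> float:
    """Reciprocal of the absolute difference in response probability."""
    return 1 / (risk_from_odds_ratio(control_risk, odds_ratio) - control_risk)

treated = risk_from_odds_ratio(0.40, 1.69)   # about 0.53
nnt = number_needed_to_treat(0.40, 1.69)     # about 7.7
print(round(treated, 2), round(nnt, 1))
```

The number needed to treat depends on the assumed placebo response rate, so the same odds ratio yields a different value when the baseline response differs.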
Conventional meta-analyses of randomized trials restricted to primary care patients have been performed for SSRIs and TCAs compared with placebo,11,12 SSRIs compared with TCAs,13 and a variety of newer antidepressants compared with different comparators.14 The reviews concluded that the reviewed antidepressants are superior to placebo in primary care patients.13,14 The evidence on the relative efficacy of antidepressants was sparse and considered to be of variable quality but suggested no major differences. Regarding acceptability, findings favored SSRIs and most newer antidepressants when compared with TCAs.13,14 Our review includes more primary care-based trials than the mentioned previous reviews taken together. In addition to conventional pairwise meta-analysis, we performed network meta-analysis, making efficient use of all available data. Our analyses confirm that short-term effects of SSRIs and TCAs are similar and that fewer patients receiving SSRIs report adverse effects, but we did not find any significant differences regarding study discontinuation resulting from adverse effects or for any reasons between these 2 classes. Our findings are in accordance with a conventional meta-analysis of trials mostly performed in specialist mental health care showing that the NRI reboxetine has relevant adverse effects and little efficacy.24 In the network meta-analysis hypericum extracts showed similar efficacy but better acceptability than SSRIs and TCAs. This result is in line with a conventional meta-analysis of hypericum trials from all settings.25 Hypericum extracts have a high risk of adverse interactions with other drugs,26 however, and patients with multiple morbid conditions are typically excluded from randomized trials. Furthermore, the hypericum meta-analysis25 also found that evidence of efficacy was considerably more solid in German-speaking regions than in other countries, including the United States, which makes interpretation difficult.
Because of the limited number of trials per single agent, our review cannot provide estimates of relative efficacy on this level. Network meta-analyses of single second-generation antidepressants have been performed across settings (mainly including trials from specialist mental health care). One review concluded that sertraline might be the best choice when starting treatment for moderate and severe depression,27 one favored escitalopram,28 and another considered the evidence insufficient to recommend a particular agent.29,30
We compared our findings with recommendations of current high-quality clinical practice guidelines from the United States,8 the United Kingdom,31 Canada,32,33 and Germany.19 Currently, none of these guidelines make recommendations that are specific to the primary care setting. Generally, for patients with whom pharmacotherapy is considered, the UK guideline recommends generic SSRIs as the first choice; the Canadian guideline recommends all second-generation antidepressants, and the US and the German guidelines are somewhat less explicit by listing a number of situations in which other choices might be preferable. The UK guideline explicitly states that antidepressants should not be used routinely for persistent subthreshold depressive symptoms or mild depression. All guidelines agree that there is evidence for the effectiveness of hypericum extracts for mild to moderate depression. But while the German and the Canadian guidelines are supportive of them when the risk of interactions is taken into account adequately, the UK guideline explicitly discourages their use. The US guideline does not make a recommendation.
Despite guideline recommendations, evidence from conventional and network meta-analyses, and widespread use, there is an ongoing discussion of the extent to which antidepressants have clinically relevant effects compared with placebo. In the meta-analyses the effect size compared with placebo is frequently considered rather small. Yet, findings from published trials tend to overestimate these effects because of publication and reporting bias. A meta-analysis of 74 trials of 12 second-generation antidepressants registered with the US Food and Drug Administration (FDA) indicated statistically significant effects when compared with placebo for all agents, but meta-analysis of published trials only would have inflated the effect size by 32% on average.34 For our review, we searched several trial registries of manufacturers, but because of the large number of agents covered in our review, we were unable to comprehensively search for unpublished trials. Owing to our limited resources and the fact that many primary care trials are performed outside the approval process, we did not search the FDA database. We cannot rule out that our findings are distorted by publication or reporting bias to some extent.
Another widely cited meta-analysis of FDA-registered trials found that compared with placebo, medication effects became clinically relevant only in patients with very severe depression.9 This finding would be particularly important for primary care, as family physicians see many patients with less than severe depression. The meta-analysis included very few trials in patients with mild to moderate depression, however, and used aggregate data to investigate the association between baseline severity and outcome. A recent reanalysis of the same data set using different meta-analytic methods did not confirm the influence of initial severity on efficacy.35 Also, the findings of 2 newer meta-analyses of individual patient data—a more appropriate method—are contradictory.36,37 In our analyses, there were no major differences in trials limited to patients with major depression and trials that included other depressive patients. We could not investigate the association between baseline severity and outcome in detail because of the multiple different depression scales used in the primary studies.
Overall, despite publication bias and problems with methodological quality, findings seem to agree that antidepressants are significantly more effective than placebo. Most critical discussions focus on the clinical relevance of effects on the basis of meta-analytical effect sizes. Still, it should be noted that when comparing antidepressive medications with placebo alone, effect sizes cannot provide sufficient information on the clinical relevance of treatment effects in individual clinical contexts, so this issue is likely to remain controversial.9,38 The results of our analyses indicate that, in primary care as well, antidepressants are more effective than placebo in the short term.
For decision making in individual patients, family physicians should be aware that SSRIs and TCAs have a somewhat more solid evidence base than other substance classes (with SSRIs having a slightly better acceptability profile). Further agents (hypericum, rMAO-A, SNRI, NRI, NaSSA, SARI) showed some positive results, but limitations of the currently available evidence make a clear recommendation on their place in clinical practice difficult. Differences between substance classes and between single second-generation antidepressants seem to be relatively minor. The latter findings come from network meta-analyses of trials mostly performed in specialized mental health care, but our results suggest that results from primary care trials and other trials are broadly similar.
It must be emphasized that there are very few data from primary care trials regarding long-term effectiveness and acceptability. Future research should prioritize large, long-term, pragmatic trials and observational studies addressing clinically relevant questions, such as the best management of mild-to-moderate depression and comparison of pharmacological and psychological treatments under conditions of routine care and stepped-care strategies.
Acknowledgment
The contribution of Susanne Jamil to this work is part of her ongoing doctor of medicine thesis at the Medical Faculty of the Technische Universität München, Munich, Germany.
Footnotes
- Conflicts of interest: authors report none.
- Funding support: The study was funded by the German Ministry of Education and Research (grant 01KG1012).
- Disclaimer: The sponsor was not involved in the design, data collection, analysis, interpretation, writing of the report or decision to submit the report for publication.
- Systematic review registration: 01KG1012 at http://www.gesundheitsforschung-bmbf.de/de/2852.php.
- Supplementary materials: Available at http://www.AnnFamMed.org/content/13/1/69/suppl/DC1/.
- Received for publication January 30, 2014.
- Revision received May 16, 2014.
- Accepted for publication June 13, 2014.
- © 2015 Annals of Family Medicine, Inc.