Abstract
PURPOSE Because recognition and management of patients with somatoform disorders are difficult, we wanted to determine the specificity, sensitivity, and the test-retest reliability of the 15-symptom Patient Health Questionnaire (PHQ-15) for detection of somatoform disorders in a high-risk primary care population.
METHODS We studied the performance of the PHQ-15 in comparison with the Structured Clinical Interview for the Diagnostic and Statistical Manual-IV Axis I disorders (SCID-I) as a reference standard. From January through September 2006, we approached patients for participation. This study was conducted in primary care settings in the Netherlands. Patients aged between 18 and 70 years were eligible if they belonged to 1 or more of the following groups: (1) patients with unexplained somatic complaints, (2) frequent attenders, and (3) patients with mental health problems. For the SCID-I interview we invited all patients with a PHQ-15 score of 6 or greater and a random sample of 30% of patients with a PHQ-15 score of less than 6. The primary study outcomes were the sensitivity and specificity for the validity and the κ coefficient for the test-retest reliability.
RESULTS Of 2,147 eligible patients, 906 (42%) participated (mean age 48 years, 62% female). At a cutoff level of 3 or more severe somatic symptoms during the past 4 weeks, sensitivity was 78% and specificity 71%. The test-retest reliability was 0.60.
CONCLUSIONS The PHQ-15 is a valid and moderately reliable questionnaire for the detection of patients in a primary care setting at risk for somatoform disorders.
INTRODUCTION
In primary care 20% to 50% of all patients complaining of physical symptoms can be categorized as having medically unexplained symptoms.1,2 Earlier research shows that the criteria for somatoform disorders are met in 10% to 16% of all primary care patients.3–5 Usually, medically unexplained symptoms resolve spontaneously or improve with effective management. Sometimes the complaints persist, leading to functional impairment.6
Somatoform disorders are a burden for both patients and family physicians. Patients with these disorders are at risk of overtesting and unnecessary treatment,7,8 and the doctor-patient relationship is often difficult and strained.9 It is a challenge for physicians to improve their competence in recognizing and managing patients with somatoform disorders, and a screening questionnaire for somatoform disorders might be helpful.
We wanted to test a screening questionnaire in a subgroup of patients for whom family physicians will most likely use the instrument. Because screening for early detection in a high-risk population is a key concept in family medicine,10 we opted to screen the following population in the context of regular primary care: frequent attenders and patients who were identified by their family physicians as having either mental health problems or unexplained somatic complaints.
We used the Dutch version of the Patient Health Questionnaire (PHQ), a short, self-report version of the Primary Care Evaluation of Mental Disorders (PRIME-MD).11 The PHQ-15, the somatic symptom severity scale of the PHQ, is a self-administered diagnostic instrument developed for detection of somatoform disorders that consists of a list of 15 somatic symptoms.11 Those 15 symptoms constitute most of the physical complaints in primary care.3
The test characteristics of the PHQ-15 have been studied by Kroenke et al.4,12 Increasing scores on the PHQ-15 are strongly associated with functional impairment, disability, and health care use.12 Kroenke et al found a high internal reliability and established its construct validity by strong associations with functional status, disability days, and symptom-related difficulty.4 Interian et al reproduced the high internal reliability and established the convergence of the PHQ-15 with the Composite International Diagnostic Interview.13 Data on test-retest reliability of the PHQ-15, however, are still lacking.
We addressed 2 questions: (1) is the PHQ-15 a suitable questionnaire for the detection of somatoform disorders in a high-risk primary care population, and (2) what is the test-retest reliability of the PHQ-15?
METHODS
We compared the performance of the PHQ-15 with that of our reference standard, the Structured Clinical Interview for the Diagnostic and Statistical Manual-IV (DSM-IV) Axis I disorders (SCID-I), a diagnostic interview for DSM-IV diagnoses.14 The study was conducted in primary care settings in 2 regions in the Netherlands. From January through September 2006, we approached patients aged between 18 and 70 years to participate. The institutional ethics review committees of both centers approved the study protocol.
Study Population
Our study took place within a project that was originally designed for screening for depression in a primary care population. We predefined 3 groups of patients who had a high risk for depression.
Unexplained Somatic Complaints
Patients in the unexplained somatic complaints (USC) group had somatic complaints that could not be explained by a somatic condition. These complaints had to be present for at least 3 months. As it is not possible to code unexplained somatic complaints using a standard classification system, such as the International Classification of Primary Care (ICPC), we asked family physicians to identify these patients by checking their appointment lists for the 4 weeks preceding study allocation and selecting patients fulfilling the criterion of having an unexplained medical complaint for at least 3 months.
Frequent Attenders
Patients in the frequent attenders (FA) group had attendance rates for primary care in the highest 10% according to the method proposed by Howe et al15: the 10% most frequently consulting women and the 10% most frequently attending men in 2 age-groups (18 to 44 and 45 to 70 years) in the year preceding study allocation. This method accounted for differences in sex and age among frequently attending patients. We used computerized attendance data from all practice visits, home visits, and telephone consultations with doctors, nurses, and other team members. The highest 10% was determined separately for each family physician because of differences in practice styles.
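The stratified selection described above can be illustrated with a minimal sketch. It is not the procedure used in the study; the record fields (`id`, `physician`, `sex`, `age`, `contacts`), the rounding up of each stratum's quota, and the arbitrary handling of ties are assumptions made for illustration only.

```python
from collections import defaultdict


def age_group(age):
    """Return the age band used for stratification (18 to 44 or 45 to 70 years)."""
    return "18-44" if age <= 44 else "45-70"


def select_frequent_attenders(patients, percent=10):
    """Return the ids of the most frequent attenders within each stratum.

    `patients` is a list of dicts with hypothetical keys: id, physician,
    sex, age, and contacts (number of contacts in the preceding year).
    Strata are defined by family physician, sex, and age-group.
    """
    strata = defaultdict(list)
    for p in patients:
        key = (p["physician"], p["sex"], age_group(p["age"]))
        strata[key].append(p)

    selected = []
    for members in strata.values():
        # Assumption: round the quota up (integer ceiling division), so that
        # every non-empty stratum contributes at least one patient.
        quota = (len(members) * percent + 99) // 100
        # Ties at the cut are broken arbitrarily by the stable sort.
        ranked = sorted(members, key=lambda p: p["contacts"], reverse=True)
        selected.extend(p["id"] for p in ranked[:quota])
    return sorted(selected)
```

Ranking within each physician's list, rather than across the pooled sample, mirrors the rationale given above: differences in practice style would otherwise let a few high-contact practices dominate the group.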
Mental Health Problems
Patients in the mental health problems (MHP) group visited their family physicians with a new mental health problem up to 3 months before the selection date. The time frame of 3 months was chosen because of the transitory nature of most mental health problems. We selected these patients from electronic patient databases of the participating family physicians, who were accustomed to coding all diagnoses or complaints with the ICPC classification system. Patients with a psychological or social reason for encounter or with a mental health diagnosis can be classified in the P and Z chapters. To identify all patients with possible mental health problems, we searched the electronic patient database for codes from the P and Z classes with the following predefined free-text words: anxiety, worrying, sadness, stress, feeling down, and insomnia.
Procedure
Family physicians received a list of selected patients from which they excluded those suffering from schizophrenia, psychosis, bipolar disorder, serious somatic disease, or mental retardation, or having difficulties with the Dutch or English language. We also excluded patients with a diagnosis of depression at baseline.
Next, we mailed all selected patients a letter signed by the family physician describing the purpose and content of the study and asking for their participation, including treatment in a trial setting, together with an informed consent form and the screening instrument, the PHQ-15. If patients did not respond within 2 weeks, a reminder was mailed.
In accordance with earlier studies,16,17 participants who completed the PHQ-15 screening questionnaire and had 3 or more severe somatic symptoms (scoring 6 or higher) were referred to as cases, and those who had fewer than 3 severe somatic symptoms were referred to as noncases. When we decided to use this cutoff point, we had not yet decided to exclude 2 symptoms from the final analysis.
To assess the criterion validity of the PHQ-15, we invited all case participants and a random 30% of noncase participants for a SCID-I interview 2 weeks after receiving the PHQ-15. To determine the test-retest reliability of the PHQ-15, we gave the patients the PHQ-15 twice: they were asked to fill out the PHQ-15 at baseline and then 2 weeks later, on the same day as the SCID-I interview.
Measurement Instruments
Patient Health Questionnaire-15
The PHQ-15 is a somatic symptom severity scale for the purpose of diagnosing somatoform disorders. It inquires about 15 somatic symptoms or symptom clusters that account for more than 90% of the physical complaints (excluding upper respiratory tract symptoms) reported in the outpatient setting. For 13 of the somatic symptoms, subjects are asked, “During the past 4 weeks, how much have you been bothered by any of the following problems?” The 3 scoring options are coded as 0 (not bothered at all), 1 (bothered a little), or 2 (bothered a lot). A somatic symptom with the score of 2 is considered severe.
For the 2 somatic symptoms that are also part of the PHQ depression module (feeling tired or having little energy, and trouble sleeping), subjects are asked, “Over the past 2 weeks, how often have you been bothered by any of the following problems?” The 4 response options are coded as 0 (not at all), 1 (several days), or 2 (more than half the days, or nearly every day), so that these items are scored on the same 0-to-2 scale as the other 13. A symptom score of 2 is considered to be severe.
According to the original algorithm of the PHQ-15, in a primary care population the test is considered positive when 3 or more severe somatic symptoms are present, which is indicated by a test result of 6 or higher.11
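The scoring rules above can be written out as a short sketch. The function names, the raw 0-to-3 coding of the 2 depression-module items, and the returned field names are assumptions for illustration; they are not part of the published instrument.

```python
def recode_depression_item(raw):
    """Collapse the 4 response options of a depression-module item to 0-2.

    Assumption: raw responses are coded 0 (not at all), 1 (several days),
    2 (more than half the days), 3 (nearly every day).
    """
    if raw not in (0, 1, 2, 3):
        raise ValueError("expected a response coded 0-3")
    return min(raw, 2)


def score_phq15(somatic_items, depression_items):
    """Score one questionnaire.

    somatic_items: 13 responses already coded 0-2.
    depression_items: 2 raw responses coded 0-3 (fatigue, sleep).
    """
    if len(somatic_items) != 13 or len(depression_items) != 2:
        raise ValueError("expected 13 somatic items and 2 depression-module items")
    if any(x not in (0, 1, 2) for x in somatic_items):
        raise ValueError("somatic items must be coded 0-2")

    items = list(somatic_items) + [recode_depression_item(x) for x in depression_items]
    severe = sum(1 for x in items if x == 2)  # a score of 2 marks a severe symptom
    return {
        "total": sum(items),
        "severe_symptoms": severe,
        "positive": severe >= 3,  # test positive at 3 or more severe symptoms
    }
```

The sketch classifies on the count of severe symptoms and reports the total separately, because a total of 6 or higher can also arise from many mildly bothersome symptoms.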
Structured Clinical Interview for DSM-IV Disorders
The SCID-I is a semi-structured interview for diagnosing Axis I mental disorders according to DSM-IV criteria.14 Interviewers, who received SCID-I training from an experienced psychiatrist, administered the SCID-I by telephone. A structured set of questions directed the interviewer in determining whether the symptoms (1) cannot be fully explained by a general medical condition, another mental disorder, or the effects of a substance; and (2) cause serious impairment in social, occupational, or other functioning. The interviewers had meetings every 2 weeks with the psychiatrist to secure the quality of the interviews. Agreement between diagnoses obtained by telephone and by in-person administration of the SCID-I is excellent.18
Statistical Analysis
Prevalence
We analyzed the data with SAS 9 (SAS Institute, Cary, North Carolina). To calculate the prevalence of somatoform disorders in our population, we had to correct for using only the random sample of 30% of noncase participants with inverse probability weighting.19 After correcting for this imbalance, all further calculations were performed with these balanced data, except for the calculation of the receiver operating characteristic (ROC) curve, the optimal cutoff point, and test-retest reliability.
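The idea behind inverse probability weighting can be shown with a minimal sketch. It is not the SAS analysis used in the study: the function name, the input layout, and the example counts in the comments are hypothetical, and the sketch ignores nonresponse among invited patients.

```python
def weighted_prevalence(interviewed, sampling_fraction_noncases):
    """Estimate prevalence with inverse probability weights.

    `interviewed` is a list of (screen_positive, has_disorder) pairs of
    booleans, one per interviewed patient. Screen-positive patients were
    all invited (weight 1); screen-negative patients were sampled with
    probability `sampling_fraction_noncases` (weight 1 / fraction).
    """
    if not 0 < sampling_fraction_noncases <= 1:
        raise ValueError("sampling fraction must be in (0, 1]")

    weighted_cases = 0.0
    weighted_total = 0.0
    for screen_positive, has_disorder in interviewed:
        weight = 1.0 if screen_positive else 1.0 / sampling_fraction_noncases
        weighted_total += weight
        if has_disorder:
            weighted_cases += weight
    return weighted_cases / weighted_total


# Hypothetical data, not study results: 100 screen-positive patients
# (20 with a disorder) and a 25% sample of 100 screen-negative patients
# (25 interviewed, 2 with a disorder).
example = (
    [(True, True)] * 20 + [(True, False)] * 80
    + [(False, True)] * 2 + [(False, False)] * 23
)
estimate = weighted_prevalence(example, sampling_fraction_noncases=0.25)
```

In this made-up example each interviewed noncase stands in for 4 screened noncases, so the weighted estimate is lower than the crude proportion among interviewed patients, which overrepresents screen-positive patients.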
Criterion Validity
We assessed the criterion validity of the first PHQ-15 by calculating sensitivity and specificity using different cutoff values, which we visualized in a ROC curve. We assessed the utility for everyday practice by calculating positive and negative predictive values, using the optimal cutoff value.
Internal Consistency
A factor analysis of the PHQ-15 showed that 2 symptoms were only weakly associated with the factor: menstrual problems (item-total correlation [ITC] 0.26) and sexual pain/problems (ITC 0.18). Kroenke et al found similar results.20 We therefore decided to exclude these symptoms from our analysis. Thus, the total score of the 13-item version of the PHQ-15 used in our analysis ranged from 0 to 26, compared with 0 to 30 when all 15 items of the PHQ-15 were scored.
Test-Retest Reliability
We calculated the intraclass correlation coefficient to assess the test-retest consistency of the PHQ-15. Using the paired Student t test, we calculated the P value for the difference between the first and second PHQ-15 outcomes. Next, we dichotomized the PHQ-15 outcomes into cases and noncases and compared the first and the second PHQ-15 outcomes using the κ statistic, a measure of agreement that takes into account the influence of chance. We measured the influence of time on the first and second PHQ-15 scores using logistic regression analysis.
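For the dichotomized outcome, the κ statistic compares observed agreement with the agreement expected by chance from the marginal proportions. The sketch below shows that calculation for a 2 × 2 table; the function name and argument names are illustrative, and it is not the SAS routine used in the study.

```python
def cohen_kappa(both_pos, first_only, second_only, both_neg):
    """Cohen's kappa for two dichotomous ratings of the same patients.

    both_pos:    positive on the first and second administration
    first_only:  positive on the first, negative on the second
    second_only: negative on the first, positive on the second
    both_neg:    negative on both administrations
    """
    n = both_pos + first_only + second_only + both_neg
    if n == 0:
        raise ValueError("the table is empty")

    observed = (both_pos + both_neg) / n
    first_pos = (both_pos + first_only) / n
    second_pos = (both_pos + second_only) / n
    # Agreement expected by chance if the two administrations were independent.
    expected = first_pos * second_pos + (1 - first_pos) * (1 - second_pos)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)
```

Because κ discounts chance agreement, it is always lower than the raw percentage agreement whenever chance agreement is greater than zero and the two administrations do not agree perfectly.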
RESULTS
Thirty-five family physicians participated. In total, 2,659 patients fulfilled the criteria for mental health problems (MHP, n = 1,039), for frequently attending their family physician (FA, n = 1,745), and for unexplained somatic complaints (USC, n = 183). There was overlap among the groups. The mean age was 45 years, and 60% were female (Figure 1).
Family physicians excluded 345 patients from the study for the following reasons: death of the patient (7), being too old (13), schizophrenia or bipolar disease (43), inability to understand the Dutch or English language (49), terminal illness or mental retardation (71), and serious illness (162). Additionally we excluded 167 patients who were already known by their family physician to have major depressive disorder.
Of the remaining 2,147 patients eligible for PHQ-15 screening, 904 (42%) patients returned the PHQ-15 and gave informed consent: 68 patients in the USC group, 344 in the MHP group, and 586 patients in the FA group (Figure 2). Consenting patients were slightly older (mean age 48 years).
Prevalence
Of the 904 patients, 602 (67%) had fewer than 3 severe somatic symptoms. The other 302 (33%), with 3 or more severe somatic symptoms (score of 6 or higher), were considered to have a positive score. Patients in the MHP group had the lowest prevalence of a positive PHQ-15 at 31%, patients in the FA group had a higher prevalence at 35%, and patients in the USC group had the highest prevalence at 63%.
Of the 426 patients who participated in the SCID-I interview, we diagnosed a somatoform disorder in 51. The MHP group had the lowest prevalence at 8.7%, the FA group a higher prevalence at 11%, and highest prevalence was in the USC group at 32%. Those 426 patients are a subgroup of our original population, which had a preplanned overrepresentation of patients with a positive outcome on the PHQ-15. After correction by inverse probability weighting for the 30% patient sample with negative PHQ-15 outcomes, the prevalence of somatoform disorders in our study population was 8.6%.
Sensitivity and Specificity
We assessed the optimal physical symptom threshold for somatoform disorders with a ROC curve for the nonweighted sample (Figure 3). The optimal sum of sensitivity and specificity of the PHQ-15 is found at 3 or more severe somatic symptoms (Table 1). The accuracy of the PHQ-15 is fair, with an area under the ROC curve of 0.76.
After correction with inverse probability weighting, the sensitivity of the PHQ-15 (at a cutoff level of 3 or more severe somatic symptoms) was 78% and specificity was 71%, which yields a likelihood ratio for a positive test of 2.70 and a likelihood ratio for a negative test of 0.31. The positive predictive value shows that 21% of patients who have 3 or more severe somatic symptoms on the PHQ-15 (score ≥6) will have a somatoform disorder. The negative predictive value of 97% indicates that only 3% of patients who have fewer than 3 severe somatic symptoms will have a somatoform disorder.
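The likelihood ratios and predictive values follow from sensitivity, specificity, and prevalence through standard formulas, sketched below. The function names are illustrative. Applied to the rounded values reported here (sensitivity 0.78, specificity 0.71, prevalence 0.086), the formulas give likelihood ratios of approximately 2.7 and 0.31 and predictive values of approximately 20% and 97%; a small difference from the reported positive predictive value of 21% is to be expected, because the inputs are rounded and the published figures were computed on the weighted data.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Bayes' theorem, as proportions."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values as reported in the text.
lr_pos, lr_neg = likelihood_ratios(0.78, 0.71)
ppv, npv = predictive_values(0.78, 0.71, 0.086)
```

The dependence of the predictive values on prevalence is the reason a test with moderate sensitivity and specificity can still have a high negative predictive value in a population where the disorder is uncommon.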
Reliability
We assessed test-retest reliability with the data from 355 patients who completed the second PHQ-15 within 14 days after the first PHQ-15. This sample contains 63% of the patients invited for the SCID-I (n = 564) and the second PHQ-15. The remaining 37% did not participate in the SCID-I interview, nor did they complete the second PHQ-15.
By counting only the scores of 2, indicating severe somatic symptoms, the mean score of the first PHQ-15 was 6.1 points (SD, 5.3); for the second PHQ-15, the mean score was 5.5 points (SD, 5.3), a decrease of 0.6 points (P <.001). The intraclass correlation coefficient was 0.83. Next, we dichotomized the outcome, that is, patients with 3 or more severe somatic symptoms were considered to have a positive PHQ-15 score and patients with fewer than 3 severe somatic symptoms were considered to have a negative PHQ-15 score. On the dichotomized outcome the percentage agreement between the first and second PHQ-15 score was 80%. The score changed from negative to positive in 6%, and from positive to negative in 14%. The κ coefficient was 0.60. A logistic regression analysis with time as the independent variable found the following P values: P = .38 for negative PHQ-15 outcomes changing to positive, and P = .79 for positive PHQ-15 outcomes changing to negative. So, there is no significant influence of time on the difference in results from the first and second PHQ-15.
The internal consistency (Cronbach’s α) of the PHQ-15 is .80.
DISCUSSION
The sensitivity and specificity of the PHQ-15, as measured by the concordance with the SCID-I diagnosis of somatoform disorders, have been established in our primary care population as 78% and 71%, respectively, with a low positive predictive value and a high negative predictive value. The test-retest reliability is moderate with a κ coefficient of 0.60.21 The prevalence of somatoform disorders differed significantly between the 3 high-risk groups. The patients identified by their family physicians as being in the USC group had by far the highest prevalence of severe somatic symptoms. Diagnosis of somatoform disorders was 3 times more likely in this group than in the MHP and FA groups.
Strengths and Limitations
We excluded patients who had a diagnosis of depression at baseline for both research and clinical reasons. Often patients with depression have somatic complaints that could fit the diagnosis of a somatoform disorder, but usually their depression better accounts for those complaints. The PHQ-15 measures symptoms regardless of underlying disorders. In contrast, the SCID-I will lead to a diagnosis of somatoform disorders only if complaints are not accounted for by another mental disorder. Accordingly, the relatively low prevalence of somatoform disorders in our study population (8.6%) might be because we excluded the patients with known depression at baseline.
The suitability of the SCID-I to diagnose somatoform disorders has been criticized. The best-estimate diagnosis is considered to be more accurate in this respect.22 The best-estimate diagnosis consists of longitudinal assessment, done by expert diagnosticians, using all available patient data, such as obtained from family informants, review of medical records, and observations of clinical staff. Although this standard is appealing, it is often not used in research practice, so for practical reasons, we have chosen to use the SCID-I.
We found a high internal consistency (α = .80), which replicates the findings of Kroenke et al (α = .80) and Interian et al (α = .79).4,13 Interian et al measured the convergent validity of the PHQ-15 with the Composite International Diagnostic Interview symptom count in patients with moderate to severe somatization. They found a significantly lower validity in the Hispanic population. Their results are difficult to compare with ours because we used a different validation instrument, and we used it in a mainly Dutch population.
For the diagnosis of a somatoform disorder, the complaints are necessarily medically unexplained. Such a diagnosis requires clinical judgment, which a questionnaire cannot provide. One might expect that patients with known physical disorders have many somatic symptoms and therefore high scores on the PHQ-15. In earlier research, however, only a weak correlation was found between the number of physical disorders and the number of somatic symptoms.4 Total symptom counts, including unexplained and explained, have been proved to be prognostic for somatoform disorders.20,23
As we excluded patients with mental retardation and patients having difficulties with Dutch or English language, all included patients were able to read and understand the questions. We performed our research in clinically relevant family practice subgroups during routine practice. Patients who frequently attend, patients who have mental health problems, and patients with unexplained symptoms are at risk for somatization and thus for unnecessary medical procedures and problematic doctor-patient relationships. With this procedure we increased the chance of detecting a meaningful result. Moreover, we tested the instrument in the subgroup of patients for whom family physicians are likely to use it.
The response to our first PHQ-15 measurement was low (42%). Usually around 50% of subjects respond to questionnaires. We did not ask patients only to return the questionnaire, however; we also asked for their participation within the whole project, including treatment in a trial setting. We assume that the patients, especially patients with mental health problems, might have been less willing to return the PHQ-15 because they did not want to take part in the trial.
This study is the first to examine the test-retest reliability of the PHQ-15, which we found to be moderate. Although we expected time between tests to affect the outcome, we could not find an influence of time on test-retest reliability. The decrease in PHQ-15 scores between the test and the retest could be explained both by the natural course of symptoms and by regression toward the mean.
Implications for Research and Clinical Practice
The PHQ-15 has proved to be a valid and moderately reliable instrument for recognition of somatoform disorders in our primary care study population. For implementation into clinical practice, one should realize that we excluded patients with depression. The negative predictive value of the PHQ-15 (97%) offers a considerable advantage in family medicine, where incidences are usually low. This short questionnaire can be used to exclude the diagnosis of a somatoform disorder in most patients. For a small group of patients, further discussion of a patient’s symptoms will be necessary to draw firm conclusions. This course fits well into the primary care process. Taking into consideration the complex nature of somatization, the PHQ-15 might bring us the closest we can get to objectively identifying patients at high risk for somatoform disorders.
Acknowledgments
We are grateful for the work and support of Lea Peters, Jan Mulder, Marlien Douma, and Machteld Borghuis. We would like to thank the participating family physicians and patients.
Footnotes
- Conflicts of interest: none reported.
- Funding support: Financial support was given to this study by The Netherlands Organization for Health Research and Development (ZonMW).
- Received for publication March 12, 2008.
- Revision received July 4, 2008.
- Accepted for publication August 25, 2008.
- © 2009 Annals of Family Medicine, Inc.