Accuracy of Patient Health Questionnaire-9 (PHQ-9) for screening to detect major depression: individual participant data meta-analysis

BMJ ◽  
2019 ◽  
pp. l1476 ◽  
Author(s):  
Brooke Levis ◽  
Andrea Benedetti ◽  
Brett D Thombs

Abstract Objective To determine the accuracy of the Patient Health Questionnaire-9 (PHQ-9) for screening to detect major depression. Design Individual participant data meta-analysis. Data sources Medline, Medline In-Process and Other Non-Indexed Citations, PsycINFO, and Web of Science (January 2000-February 2015). Inclusion criteria Eligible studies compared PHQ-9 scores with major depression diagnoses from validated diagnostic interviews. Primary study data and study level data extracted from primary reports were synthesized. For PHQ-9 cut-off scores 5-15, bivariate random effects meta-analysis was used to estimate pooled sensitivity and specificity, separately, among studies that used semistructured diagnostic interviews, which are designed for administration by clinicians; fully structured interviews, which are designed for lay administration; and the Mini International Neuropsychiatric Interview (MINI), a brief fully structured interview. Sensitivity and specificity were examined among participant subgroups and, separately, using meta-regression, considering all subgroup variables in a single model. Results Data were obtained for 58 of 72 eligible studies (total n=17 357; major depression cases n=2312). Combined sensitivity and specificity was maximized at a cut-off score of 10 or above among studies using a semistructured interview (29 studies, 6725 participants; sensitivity 0.88, 95% confidence interval 0.83 to 0.92; specificity 0.85, 0.82 to 0.88). Across cut-off scores 5-15, sensitivity with semistructured interviews was 5-22% higher than for fully structured interviews (MINI excluded; 14 studies, 7680 participants) and 2-15% higher than for the MINI (15 studies, 2952 participants). Specificity was similar across diagnostic interviews. The PHQ-9 seems to be similarly sensitive but may be less specific for younger patients than for older patients; a cut-off score of 10 or above can be used regardless of age. Conclusions PHQ-9 sensitivity compared with semistructured diagnostic interviews was greater than in previous conventional meta-analyses that combined reference standards. A cut-off score of 10 or above maximized combined sensitivity and specificity overall and for subgroups. Registration PROSPERO CRD42014010673.
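As an illustration of what sensitivity and specificity at a cut-off score mean, the following minimal Python sketch classifies participants as screen positive when their score is at or above the cut-off and compares that with the reference diagnosis. The function name `sens_spec_at_cutoff` and the ten example participants are hypothetical and are not data from the study; the study itself pooled estimates across studies with a bivariate random effects model, which this sketch does not attempt.

```python
from typing import Sequence, Tuple


def sens_spec_at_cutoff(
    scores: Sequence[int], depressed: Sequence[bool], cutoff: int
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for 'score >= cutoff' as a positive screen."""
    tp = fn = tn = fp = 0
    for score, has_depression in zip(scores, depressed):
        screen_positive = score >= cutoff
        if has_depression and screen_positive:
            tp += 1  # case correctly flagged
        elif has_depression:
            fn += 1  # case missed by the screen
        elif screen_positive:
            fp += 1  # non-case incorrectly flagged
        else:
            tn += 1  # non-case correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical questionnaire scores and reference-standard diagnoses (illustration only).
scores = [3, 12, 9, 15, 10, 4, 11, 7, 18, 2]
depressed = [False, True, True, True, False, False, True, False, True, False]

sensitivity, specificity = sens_spec_at_cutoff(scores, depressed, cutoff=10)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Raising the cut-off in this toy example moves participants from the screen-positive to the screen-negative group, which can only lower sensitivity and raise specificity; that trade-off is what the cut-off comparisons in the abstract describe.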

BMJ ◽  
2021 ◽  
pp. n2183
Author(s):  
Zelalem F Negeri ◽  
Brooke Levis ◽  
Ying Sun ◽  
Chen He ◽  
Ankur Krishnan ◽  
...  

Abstract Objective To update a previous individual participant data meta-analysis and determine the accuracy of the Patient Health Questionnaire-9 (PHQ-9), the most commonly used depression screening tool in general practice, for detecting major depression overall and by study or participant subgroups. Design Systematic review and individual participant data meta-analysis. Data sources Medline, Medline In-Process and Other Non-Indexed Citations via Ovid, PsycINFO, and Web of Science, searched through 9 May 2018. Review methods Eligible studies administered the PHQ-9 and classified current major depression status using a validated semistructured diagnostic interview (designed for clinician administration), fully structured interview (designed for lay administration), or the Mini International Neuropsychiatric Interview (MINI; a brief interview designed for lay administration). A bivariate random effects meta-analytic model was used to obtain point and interval estimates of pooled PHQ-9 sensitivity and specificity at cut-off values 5-15, separately, among studies that used semistructured diagnostic interviews (eg, Structured Clinical Interview for Diagnostic and Statistical Manual), fully structured interviews (eg, Composite International Diagnostic Interview), and the MINI. Meta-regression was used to investigate whether PHQ-9 accuracy correlated with reference standard categories and participant characteristics. Results Data from 44 503 total participants (27 146 additional from the update) were obtained from 100 of 127 eligible studies (42 additional studies; 79% eligible studies; 86% eligible participants). Among studies with a semistructured interview reference standard, pooled PHQ-9 sensitivity and specificity (95% confidence interval) at the standard cut-off value of ≥10, which maximised combined sensitivity and specificity, were 0.85 (0.79 to 0.89) and 0.85 (0.82 to 0.87), respectively. Specificity was similar across reference standards, but sensitivity in studies with semistructured interviews was 7-24% (median 21%) higher than with fully structured reference standards and 2-14% (median 11%) higher than with the MINI across cut-off values. Across reference standards and cut-off values, specificity was 0-10% (median 3%) higher for men and 0-12% (median 5%) higher for people aged 60 or older. Conclusions Researchers and clinicians could use results to determine outcomes, such as total number of positive screens and false positive screens, at different PHQ-9 cut-off values for different clinical settings using the knowledge translation tool at www.depressionscreening100.com/phq. Study registration PROSPERO CRD42014010673.
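The kind of outcome the conclusions mention (total positive screens and false positive screens) follows arithmetically from sensitivity, specificity, and prevalence. The Python sketch below shows that arithmetic using the pooled values reported above for the cut-off of ≥10 (sensitivity 0.85, specificity 0.85). The function name `screening_outcomes`, the cohort of 1000 people, and the 10% prevalence are assumptions chosen for illustration; the sketch is not the knowledge translation tool and does not reproduce its method.

```python
from typing import Dict


def screening_outcomes(
    n_screened: int, prevalence: float, sensitivity: float, specificity: float
) -> Dict[str, float]:
    """Expected screening counts for a cohort, given test accuracy and prevalence."""
    cases = n_screened * prevalence
    non_cases = n_screened - cases
    true_positives = cases * sensitivity
    false_positives = non_cases * (1 - specificity)
    positive_screens = true_positives + false_positives
    return {
        "true_positives": true_positives,
        "false_negatives": cases - true_positives,
        "false_positives": false_positives,
        "positive_screens": positive_screens,
        # Share of positive screens that are true cases (positive predictive value).
        "ppv": true_positives / positive_screens,
    }


# Assumed setting: 1000 people screened, 10% prevalence (hypothetical values).
outcomes = screening_outcomes(
    n_screened=1000, prevalence=0.10, sensitivity=0.85, specificity=0.85
)
for name, value in outcomes.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed inputs, roughly 85 of the 100 cases screen positive alongside about 135 false positives, so fewer than half of the positive screens are true cases. A different assumed prevalence changes these counts substantially, which is why results are reported by clinical setting.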


BMJ ◽  
2020 ◽  
pp. m4022
Author(s):  
Brooke Levis ◽  
Zelalem Negeri ◽  
Ying Sun ◽  
Andrea Benedetti ◽  
Brett D Thombs

Abstract Objective To evaluate the Edinburgh Postnatal Depression Scale (EPDS) for screening to detect major depression in pregnant and postpartum women. Design Individual participant data meta-analysis. Data sources Medline, Medline In-Process and Other Non-Indexed Citations, PsycINFO, and Web of Science (from inception to 3 October 2018). Eligibility criteria for selecting studies Eligible datasets included EPDS scores and major depression classification based on validated diagnostic interviews. Bivariate random effects meta-analysis was used to estimate EPDS sensitivity and specificity compared with semi-structured, fully structured (Mini International Neuropsychiatric Interview (MINI) excluded), and MINI diagnostic interviews separately using individual participant data. One stage meta-regression was used to examine accuracy by reference standard categories and participant characteristics. Results Individual participant data were obtained from 58 of 83 eligible studies (70%; 15 557 of 22 788 eligible participants (68%), 2069 with major depression). Combined sensitivity and specificity was maximised at a cut-off value of 11 or higher across reference standards. Among studies with a semi-structured interview (36 studies, 9066 participants, 1330 with major depression), sensitivity and specificity were 0.85 (95% confidence interval 0.79 to 0.90) and 0.84 (0.79 to 0.88) for a cut-off value of 10 or higher, 0.81 (0.75 to 0.87) and 0.88 (0.85 to 0.91) for a cut-off value of 11 or higher, and 0.66 (0.58 to 0.74) and 0.95 (0.92 to 0.96) for a cut-off value of 13 or higher, respectively. Accuracy was similar across reference standards and subgroups, including for pregnant and postpartum women. Conclusions An EPDS cut-off value of 11 or higher maximised combined sensitivity and specificity; a cut-off value of 13 or higher was less sensitive but more specific. To identify pregnant and postpartum women with higher symptom levels, a cut-off of 13 or higher could be used. Lower cut-off values could be used if the intention is to avoid false negatives and identify most patients who meet diagnostic criteria. Registration PROSPERO (CRD42015024785).


BMJ ◽  
2021 ◽  
pp. n972
Author(s):  
Yin Wu ◽  
Brooke Levis ◽  
Ying Sun ◽  
Chen He ◽  
Ankur Krishnan ◽  
...  

Abstract Objective To evaluate the accuracy of the depression subscale of the Hospital Anxiety and Depression Scale (HADS-D) to screen for major depression among people with physical health problems. Design Systematic review and individual participant data meta-analysis. Data sources Medline, Medline In-Process and Other Non-Indexed Citations, PsycInfo, and Web of Science (from inception to 25 October 2018). Review methods Eligible datasets included HADS-D scores and major depression status based on a validated diagnostic interview. Primary study data and study level data extracted from primary reports were combined. For HADS-D cut-off thresholds of 5-15, a bivariate random effects meta-analysis was used to estimate pooled sensitivity and specificity, separately, in studies that used semi-structured diagnostic interviews (eg, Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders), fully structured interviews (eg, Composite International Diagnostic Interview), and the Mini International Neuropsychiatric Interview. One stage meta-regression was used to examine whether accuracy was associated with reference standard categories and the characteristics of participants. Sensitivity analyses were done to assess whether including published results from studies that did not provide raw data influenced the results. Results Individual participant data were obtained from 101 of 168 eligible studies (60%; 25 574 participants (72% of eligible participants), 2549 with major depression). Combined sensitivity and specificity was maximised at a cut-off value of seven or higher for semi-structured interviews, fully structured interviews, and the Mini International Neuropsychiatric Interview. Among studies with a semi-structured interview (57 studies, 10 664 participants, 1048 with major depression), sensitivity and specificity were 0.82 (95% confidence interval 0.76 to 0.87) and 0.78 (0.74 to 0.81) for a cut-off value of seven or higher, 0.74 (0.68 to 0.79) and 0.84 (0.81 to 0.87) for a cut-off value of eight or higher, and 0.44 (0.38 to 0.51) and 0.95 (0.93 to 0.96) for a cut-off value of 11 or higher. Accuracy was similar across reference standards and subgroups and when published results from studies that did not contribute data were included. Conclusions When screening for major depression, a HADS-D cut-off value of seven or higher maximised combined sensitivity and specificity. A cut-off value of eight or higher generated similar combined sensitivity and specificity but was less sensitive and more specific. To identify medically ill patients with depression with the HADS-D, lower cut-off values could be used to avoid false negatives and higher cut-off values to reduce false positives and identify people with higher symptom levels. Trial registration PROSPERO CRD42015016761.
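"Combined sensitivity and specificity" can be compared across cut-offs by summing the two values, which ranks cut-offs identically to Youden's J (sensitivity + specificity - 1). The Python sketch below applies that to the three semi-structured interview estimates reported above. It is limited to those three rounded point estimates, whereas the study examined cut-offs 5-15, so it illustrates the criterion rather than reproducing the analysis; the names `reported` and `youden_j` are introduced here for illustration.

```python
from typing import Dict, Tuple

# Rounded point estimates from the abstract (semi-structured interview reference standard):
# cut-off -> (sensitivity, specificity)
reported: Dict[int, Tuple[float, float]] = {
    7: (0.82, 0.78),
    8: (0.74, 0.84),
    11: (0.44, 0.95),
}


def youden_j(sensitivity: float, specificity: float) -> float:
    """Youden's J; maximising it is equivalent to maximising sensitivity + specificity."""
    return sensitivity + specificity - 1


j_by_cutoff = {cutoff: youden_j(*values) for cutoff, values in reported.items()}
best_cutoff = max(j_by_cutoff, key=j_by_cutoff.get)

for cutoff, j in sorted(j_by_cutoff.items()):
    print(f"cut-off >= {cutoff}: J = {j:.2f}")
print(f"largest J among the reported cut-offs: >= {best_cutoff}")
```

Among these three values the cut-off of seven has the largest J, with eight close behind, consistent with the abstract's statement that eight or higher gave similar combined accuracy with lower sensitivity and higher specificity.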


2020 ◽  
Vol 90 (1) ◽  
pp. 28-40 ◽  
Author(s):  
Yin Wu ◽  
Brooke Levis ◽  
John P.A. Ioannidis ◽  
Andrea Benedetti ◽  
Brett D. Thombs ◽  
...  

Introduction: Three previous individual participant data meta-analyses (IPDMAs) reported that, compared to the Structured Clinical Interview for the DSM (SCID), alternative reference standards, primarily the Composite International Diagnostic Interview (CIDI) and the Mini International Neuropsychiatric Interview (MINI), tended to misclassify major depression status, when controlling for depression symptom severity. However, there was an important lack of precision in the results. Objective: To compare the odds of the major depression classification based on the SCID, CIDI, and MINI. Methods: We included and standardized data from 3 IPDMA databases. For each IPDMA, separately, we fitted binomial generalized linear mixed models to compare the adjusted odds ratios (aORs) of major depression classification, controlling for symptom severity and characteristics of participants, and the interaction between interview and symptom severity. Next, we synthesized results using a DerSimonian-Laird random-effects meta-analysis. Results: In total, 69,405 participants (7,574 [11%] with major depression) from 212 studies were included. Controlling for symptom severity and participant characteristics, the MINI (74 studies; 25,749 participants) classified major depression more often than the SCID (108 studies; 21,953 participants; aOR 1.46; 95% confidence interval [CI] 1.11–1.92). Classification odds for the CIDI (30 studies; 21,703 participants) and the SCID did not differ overall (aOR 1.19; 95% CI 0.79–1.75); however, as screening scores increased, the aOR increased less for the CIDI than the SCID (interaction aOR 0.64; 95% CI 0.52–0.80). Conclusions: Compared to the SCID, the MINI classified major depression more often. The odds of the depression classification with the CIDI increased less as symptom levels increased. Interpretation of research that uses diagnostic interviews to classify depression should consider the interview characteristics.
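For readers unfamiliar with the synthesis step named in the methods, the following Python sketch implements the standard DerSimonian-Laird random-effects estimator on log odds ratios. The three input log aORs and their variances are hypothetical placeholders, not results from the IPDMAs, and the function name `dersimonian_laird` is introduced here for illustration; the abstract does not describe the authors' software or implementation.

```python
import math
from typing import Sequence, Tuple


def dersimonian_laird(
    effects: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (pooled effect, standard error, tau^2) by the DerSimonian-Laird method."""
    k = len(effects)
    weights = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    # Method-of-moments between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, se, tau2


# Hypothetical log adjusted odds ratios and variances from three analyses.
log_aors = [0.30, 0.45, 0.38]
variances = [0.020, 0.030, 0.025]

pooled_log_or, se, tau2 = dersimonian_laird(log_aors, variances)
lower = math.exp(pooled_log_or - 1.96 * se)
upper = math.exp(pooled_log_or + 1.96 * se)
print(f"pooled aOR = {math.exp(pooled_log_or):.2f} (95% CI {lower:.2f} to {upper:.2f})")
print(f"tau^2 = {tau2:.4f}")
```

Pooling is done on the log scale and exponentiated at the end, because odds ratios are approximately normally distributed only after log transformation.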

