Research Article: Original Research

Evaluation of an AI-Based Voice Biomarker Tool to Detect Signals Consistent With Moderate to Severe Depression

Alexa Mazur, Harrison Costantino, Prentice Tom, Michael P. Wilson and Ronald G. Thompson
The Annals of Family Medicine January 2025, 23 (1) 60-65; DOI: https://doi.org/10.1370/afm.240091
Alexa Mazur
1 Kintsugi Mindful Wellness, Inc, San Francisco, California
For correspondence: aam2213@columbia.edu
Harrison Costantino, MS
2 Department of Computer Science, University of California, Berkeley, California
Prentice Tom, MD
1 Kintsugi Mindful Wellness, Inc, San Francisco, California
Michael P. Wilson, MD, PhD
3 Departments of Psychiatry and Emergency Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas
Ronald G. Thompson, PhD
3 Departments of Psychiatry and Emergency Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas

Article Figures & Data

Figures

    Figure 1.

    Participant Exclusion and Audio Preprocessing Criteria Used to Create Training and Validation Sets to Train and Tune the Model and Evaluate its Performance

    Note: Eligible participants for inclusion in the analysis data sets were adults aged ≥18 years living in the United States or Canada who provided a voice sample in English containing at least 25 seconds of speech content meeting audio quality parameters. The training set and validation set were split to evenly distribute samples on the basis of participant characteristics and audio length.
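
The inclusion criteria and the stratified split described in this note can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `Sample` record, its field names, the `passes_audio_quality` flag, the 70/30 split fraction, and the choice of gender plus a 10-second audio-length bin as strata are all assumptions made for the example.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    # Hypothetical record layout; every field name here is an assumption.
    participant_id: str
    age: int
    country: str                # "US" or "CA"
    language: str               # e.g. "en"
    speech_seconds: float       # seconds of speech content in the recording
    passes_audio_quality: bool  # stand-in for the audio quality parameters
    gender: str


def is_eligible(s: Sample) -> bool:
    """Inclusion criteria paraphrased from the Figure 1 note."""
    return (
        s.age >= 18
        and s.country in {"US", "CA"}
        and s.language == "en"
        and s.speech_seconds >= 25.0
        and s.passes_audio_quality
    )


def stratified_split(samples, train_fraction=0.7, seed=0):
    """Split within strata so both sets end up with similar composition."""
    strata = defaultdict(list)
    for s in samples:
        # Assumed strata: gender plus a 10-second audio-length bin.
        key = (s.gender, int(s.speech_seconds // 10))
        strata[key].append(s)

    rng = random.Random(seed)
    train, validation = [], []
    for key in sorted(strata):
        group = strata[key]
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        validation.extend(group[cut:])
    return train, validation
```

Splitting inside each stratum, instead of shuffling the whole pool once, is what keeps the two sets close on the stratified attributes, as the near-identical columns of Table 1 suggest was the goal.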

Tables

    Table 1.

    Participant Demographic Characteristics

    Characteristic                            Training      Validation
    Age, y
        Average (SD)                          37.3 (14.3)   37.3 (14.2)
        Median                                34.0          34.0
        Mode                                  22.0          25.0
        Range                                 18-93         18-86
    Gender, %
        Female                                69.5          69.4
        Male                                  27.3          27.9
        Not specified                         2.4           2.1
        Other                                 0.9           0.7
    Race/ethnicity, %
        Asian or Pacific Islander             15.9          16.2
        Black or African American             9.4           9.4
        Hispanic or Latine                    7.5           7.7
        Native American or American Indian    1.2           1.3
        Not specified                         1.5           1.8
        Other or mixed race                   5.9           5.8
        White                                 58.5          57.8
    Audio duration, s
        Average (SD)                          55.1 (10.1)   55.0 (10.1)
        Median                                57.9          57.9
        Mode                                  58.6          58.5
        Range                                 25.0-74.9     25.0-74.9
    PHQ-9 score
        Average (SD)                          9.8 (6.7)     9.7 (6.7)
        Median                                9.0           9.0
        Mode                                  9.0           0
        Range                                 0-27          0-27
    • PHQ-9 = Patient Health Questionnaire-9

    Table 2.

    Model Performance

    Metric         Value, % (95% CI)
    Sensitivity    71.3 (69.0-73.5)
    Specificity    73.5 (71.5-75.5)
    PPV            69.3 (67.1-71.5)
    NPV            75.3 (73.3-77.2)
    • NPV = negative predictive value; PPV = positive predictive value.
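
As a worked check on Table 2, the four metrics are linked through the proportion of evaluated samples with a positive reference label (the prevalence), which the table does not report. The sketch below solves for the prevalence implied by the reported sensitivity, specificity, and PPV, then recomputes NPV from it. The function names are illustrative, and the implied prevalence is a derived quantity, not a figure stated by the authors.

```python
def implied_prevalence(sens: float, spec: float, ppv: float) -> float:
    """Solve PPV = sens*p / (sens*p + (1 - spec)*(1 - p)) for p."""
    false_pos_rate = 1.0 - spec
    return ppv * false_pos_rate / (sens * (1.0 - ppv) + ppv * false_pos_rate)


def npv_from(sens: float, spec: float, prevalence: float) -> float:
    """NPV = true negatives / all negative calls, via Bayes' rule."""
    true_neg = spec * (1.0 - prevalence)
    false_neg = (1.0 - sens) * prevalence
    return true_neg / (true_neg + false_neg)


# Point estimates from Table 2, expressed as proportions.
SENS, SPEC, PPV = 0.713, 0.735, 0.693

p = implied_prevalence(SENS, SPEC, PPV)
npv = npv_from(SENS, SPEC, p)
print(f"implied prevalence: {p:.3f}")
print(f"recomputed NPV:     {npv:.3f}")
```

With these inputs the implied prevalence is about 0.46 and the recomputed NPV is about 0.753, which agrees with the 75.3% in the table. Because PPV and NPV depend on prevalence, they would shift in a population where fewer people score 10 or higher on the PHQ-9.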

    Table 3.

    Subpopulation Performance

    Metric                                Value, % (95% CI)
    Sensitivity
        All                               71.3 (69.0-73.5)
        Gender
            Female                        74.0 (71.4-76.5)
            Male                          59.3 (54.0-64.4)
        Age, y
            <60                           71.9 (69.5-74.2)
            ≥60                           63.4 (54.3-71.9)
        Race/ethnicity
            Asian or Pacific Islander     67.4 (60.7-73.7)
            Black or African American     72.4 (64.0-79.8)
            Hispanic or Latine            80.3 (72.6-86.6)
            White                         70.7 (67.7-73.5)
    Specificity
        All                               73.5 (71.5-75.5)
        Gender
            Female                        68.9 (66.2-71.4)
            Male                          83.9 (80.8-86.7)
        Age, y
            <60                           71.8 (69.6-73.9)
            ≥60                           86.8 (81.6-91.0)
        Race/ethnicity
            Asian or Pacific Islander     77.5 (72.8-81.8)
            Black or African American     75.9 (69.3-81.7)
            Hispanic or Latine            68.6 (60.1-76.3)
            White                         72.8 (70.0-75.4)
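
Subgroup metrics of the kind shown in Table 3 come from restricting the confusion counts to one subpopulation at a time. A minimal sketch, assuming each validation record carries a subgroup attribute, a reference label (PHQ-9 of 10 or higher), and a binary model call; the tuple layout and the toy records are invented for illustration and are not study data.

```python
from collections import defaultdict


def subgroup_metrics(records):
    """records: iterable of (subgroup, reference_positive, model_positive).

    Returns {subgroup: (sensitivity, specificity)}; a metric is None
    when its denominator is zero.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for group, reference, predicted in records:
        if reference:
            counts[group]["tp" if predicted else "fn"] += 1
        else:
            counts[group]["fp" if predicted else "tn"] += 1

    result = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        sens = c["tp"] / positives if positives else None
        spec = c["tn"] / negatives if negatives else None
        result[group] = (sens, spec)
    return result


# Toy records, not study data.
toy = [
    ("female", True, True), ("female", True, False),
    ("female", False, False), ("female", False, True),
    ("male", True, True), ("male", False, False),
]
print(subgroup_metrics(toy))
```

Each subgroup estimate rests on fewer samples than the overall figure, which is why the intervals in Table 3 widen for smaller groups such as participants aged 60 years or older.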

Additional Files

  • PLAIN-LANGUAGE SUMMARY

    Original Research


    AI-Based Voice Biomarker Tool Shows Promise in Detecting Moderate to Severe Depression

    Background and Goal: Depression is a leading cause of disability, impacting an estimated 18 million Americans each year, with a lifetime prevalence of major depression approaching 30%. Despite recommendations for universal screening, depression screening rarely occurs in the outpatient setting, with some estimates placing screening rates at less than 4% of primary care encounters. This study evaluated an AI-based machine learning biomarker tool that uses speech patterns to detect moderate to severe depression, aiming to improve access to screening in primary care settings.

    Study Approach: The study analyzed over 14,000 voice samples from U.S. and Canadian adults. Participants answered the question, “How was your day?” with at least 25 seconds of free-form speech. The tool analyzed vocal biomarkers associated with depression, including speech cadence, hesitations, pauses, and other acoustic features. These were compared to results from the Patient Health Questionnaire-9 (PHQ-9), a standard depression screening tool. A PHQ-9 score of 10 or higher indicated moderate to severe depression. The AI tool provided three outputs: Signs of Depression Detected, Signs of Depression Not Detected, and Further Evaluation Recommended (for uncertain cases).
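
The reference standard and the three-way output described above can be sketched as follows. The PHQ-9 cutoff of 10 comes from the study; the 0-1 model score and the two decision thresholds (`LOWER`, `UPPER`) are hypothetical placeholders, since the summary does not describe how the tool decides that a case is uncertain.

```python
PHQ9_CUTOFF = 10          # from the study: a score of 10 or higher is reference-positive
LOWER, UPPER = 0.4, 0.6   # hypothetical thresholds on an assumed 0-1 model score


def reference_label(phq9_score: int) -> bool:
    """True when the PHQ-9 total indicates moderate to severe depression."""
    if not 0 <= phq9_score <= 27:
        raise ValueError("PHQ-9 total scores range from 0 to 27")
    return phq9_score >= PHQ9_CUTOFF


def tool_output(model_score: float) -> str:
    """Map an assumed 0-1 score onto the three reported output categories."""
    if model_score >= UPPER:
        return "Signs of Depression Detected"
    if model_score <= LOWER:
        return "Signs of Depression Not Detected"
    return "Further Evaluation Recommended"
```

Under this sketch, widening the gap between the two thresholds would send more cases to "Further Evaluation Recommended" and fewer to a definite call.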

    Main Results: The dataset used to train the AI model consisted of 10,442 samples, while an additional 4,456 samples were used in a validation set to assess its accuracy.

    • The tool demonstrated a sensitivity of 71%, meaning it correctly identified depression in 71% of people who had it.

    • Specificity was 74%, indicating that the tool correctly ruled out depression in 74% of people who did not have it.

    • In about 20% of cases, the tool flagged results as uncertain, recommending further evaluation by a clinician.

    Why It Matters: While not a replacement for formal clinical interviews or assessments by qualified clinicians, the study findings suggest that machine learning technology could serve as a complementary decision-support tool. These findings are preliminary, and more work is needed to validate the tool and explore its integration into primary care workflows. This study represents a promising avenue for using physiologic voice biomarkers to assist clinicians in identifying and addressing depression, with future research needed to refine the technology and assess its broader applicability.


    Evaluation of an AI-Based Voice Biomarker Tool to Detect Signals Consistent With Moderate to Severe Depression


    Alexa Mazur, BA, et al  

    Kintsugi Mindful Wellness, Inc, San Francisco, California

    Visual Abstract:


  • SUPPLEMENTAL MATERIALS IN PDF FILE BELOW

    • Mazur_Supplemental_Materials.pdf (PDF file)

  • VISUAL ABSTRACT IN PDF FILE BELOW

    • Mazur_Visual_Abstract.pdf (PDF file)

In this issue

The Annals of Family Medicine: 23 (1)
Vol. 23, Issue 1
January/February 2025

Keywords

  • machine learning
  • artificial intelligence
  • depression
  • voice biomarkers

© 2025 Annals of Family Medicine