Abstract
PURPOSE Identifying cardiovascular disease before conception and in early pregnancy can better inform obstetric cardiovascular care. Our main objective was to evaluate the diagnostic performance of artificial intelligence (AI)-enabled digital tools for detecting left ventricular systolic dysfunction (LVSD) among women of reproductive age.
METHODS In a pilot cross-sectional study, we enrolled an initial cohort of 100 consecutive women aged 18-49 years who had a primary care physician and a scheduled echocardiogram at Mayo Clinic Florida (Jacksonville) (cohort 1). Twelve-lead electrocardiography (ECG) and digital stethoscope recordings (single-lead ECG + phonocardiography) were performed on the date of echocardiography. We used deep learning to generate prediction probabilities for LVSD (defined as left ventricular ejection fraction <50%) for the 12-lead ECG (AI-ECG) and stethoscope (AI-stethoscope) recordings. In a second cohort of 100 participants, we enrolled consecutive women seen in primary care to estimate the prevalence of positive AI screening results when deployed for routine use (cohort 2).
RESULTS The median age of participants was 38.6 years (quartile 1: 30.3 years, quartile 3: 45.5 years), and 71.9% identified as part of the non-Hispanic White population. Among cohort 1, 5% had LVSD. The AI-ECG had an area under the curve of 0.94, and the AI-stethoscope (maximum prediction across all chest locations) had an area under the curve of 0.98. Among cohort 2, the prevalence of a positive AI screen was 1% and 3.2% for AI-ECG and the AI-stethoscope, respectively.
CONCLUSION We found these AI tools to be effective for the detection of cardiomyopathy associated with LVSD among women of reproductive age. These tools could potentially be useful for preconception cardiovascular evaluations.
INTRODUCTION
Cardiomyopathy is a major health threat during pregnancy1,2 and accounts for 40% to 60% of late maternal deaths (43-365 days postpartum).1,3-5 Maternal death rates in the United States have unfortunately continued to increase over the past several years and are notably worse compared with other developed countries.6 It is estimated that more than 80% of maternal deaths are preventable.7,8 Current guidelines recommend that interventions targeted at decreasing maternal mortality be considered at multiple time points including preconception, during pregnancy, and the postpartum period.9
Some forms of pregnancy-related cardiomyopathy, associated with left ventricular systolic dysfunction (LVSD), are believed to develop during late pregnancy or postpartum. These include peripartum cardiomyopathy, characterized by a decrease in left ventricular ejection fraction (LVEF) to <45%,10 and ischemic cardiomyopathy secondary to pregnancy-related spontaneous coronary artery dissection.11 Other forms (eg, preexisting left ventricular dysfunction) might be undiagnosed before conception and become unmasked in late pregnancy or postpartum.12 In fact, the prevalence of asymptomatic LVSD in the general population is estimated to be 1% to 2%.13 Many young women do not undergo preconception evaluation, and approximately 42% of pregnancies in the United States are unplanned.14 Given the increasing prevalence of maternal mortality in the United States, largely driven by cardiovascular conditions,1,2 there is a clear need to incorporate cardiovascular screening as part of routine primary care for young women. In addition, preconception counseling and evaluation should integrate cardiovascular screening to address this critical health issue.
A deep learning algorithm, a form of artificial intelligence (AI), has shown effectiveness in identifying LVSD using data obtained during electrocardiography (ECG).15-18 This algorithm has also been evaluated retrospectively among pregnant and postpartum women,19 and prospectively among adults seen in the primary care setting13 (mean age 61 years) in a clinical trial, and found to improve the diagnosis of LVSD. The main objective of the present study was to evaluate the diagnostic performance of AI-enhanced 12-lead ECG (AI-ECG) and AI-enabled portable ECG + phonocardiography recorded with a digital stethoscope (AI-stethoscope) as potential tools for preconception LVSD screening among women of reproductive age seen in primary care.
METHODS
Study Design and Population
The study was designed as a cross-sectional observational pilot study. Inclusion criteria were individuals aged 18-49 years, who identified as women, seen in any of the primary care clinics at Mayo Clinic Florida from October 29, 2021 to August 25, 2022. The study was approved by the Mayo Clinic Institutional Review Board, and all participants provided informed consent.
Study participants were asked at screening if they were contemplating pregnancy or having a child soon. Those currently pregnant or ≤12 months postpartum were also allowed to participate. We excluded those who did not undergo study-related tests. Consecutive participants were enrolled in 2 cohorts. Cohort 1 participants were primary care clinic patients who were already scheduled/referred for an echocardiogram as part of their ongoing medical care, and cohort 2 participants were recruited during primary care clinic visit appointments. We used the Standards for Reporting of Diagnostic Accuracy Studies (STARD 2015) guideline for reporting the study design and results (Supplemental Appendix).
Measures
All participants underwent ECG recordings with a standard 10-second 12-lead ECG and a digital stethoscope capturing 15-second single-lead ECG + phonocardiography for up to 3 locations across the chest wall (Figure 1). The time taken to perform both recordings, including placement and removal of the 12-lead ECG electrodes and digital stethoscope recording for 3 chest locations, was approximately 5 minutes.
Figure 1. Digital Stethoscope Recording Positions
For cohort 1, participants were identified before their scheduled echocardiography appointment and contacted by telephone, e-mail, or electronic health record portal message. After providing informed consent, they were scheduled for a standard 12-lead ECG as well as portable single-lead ECG + phonocardiograms with a digital stethoscope on the same day as echocardiography. Twelve-lead ECGs were performed in the supine position by a hospital ECG technician or research study staff in keeping with standard clinical protocols. Portable ECG + phonocardiography was performed in a sitting or supine position with an Eko DUO digital stethoscope (Eko Health Inc) placed on 3 chest locations (V2 [approximate position of V2 electrode during acquisition of a standard 12-lead ECG], angled, and subclavicular) (Figure 1). Electrocardiography and digital stethoscope recordings were performed before or immediately after echocardiogram image acquisition. The echocardiogram results were not available to study staff at the time of ECG acquisition and digital stethoscope recording.
Transthoracic echocardiogram images were acquired by a trained sonographer and images interpreted by a cardiologist using the American Society of Echocardiography guidelines20 in accordance with standard clinical protocols. The AI prediction results were not available to the cardiologist interpreting the echocardiography images. We included data from standard transthoracic echocardiography and stress echocardiography, with LVEF measurements obtained from resting images alone. The goal was to evaluate the performance of the AI-ECG and AI-stethoscope models in predicting LVSD, and the primary study outcome was the detection of cardiomyopathy associated with LVSD, defined as LVEF <50% in cohort 1. This definition is based on the most recent universal definition and classification of heart failure and includes heart failure with mildly reduced ejection fraction (LVEF 41%-49%) and heart failure with reduced ejection fraction (LVEF ≤40%).21 The criterion standard test was LVEF obtained from 2-dimensional transthoracic echocardiography.
For participants in cohort 2, positive AI-ECG prediction results were made available to their primary care clinician, and echocardiography was recommended. For this cohort, our objective was to evaluate the prevalence of positive AI screening results (indicating a likelihood of LVSD), necessitating echocardiography, among women of reproductive age seen in the primary care clinic.
Artificial Intelligence Models
We used 2 previously developed AI models (based on deep learning), one using AI-ECG and the other using the AI-stethoscope. The AI-ECG model is a convolutional neural network trained using Keras and TensorFlow (Google) drawing on data from 98,000 unique patients. Details of model training and architecture have been published and validated for different patient populations and clinical settings.15,22,23 The AI-stethoscope model, adapted from the AI-ECG model and retrained to use single-lead ECG as input, has been validated24,25 and modified to incorporate phonocardiography.26 No additional model training or refinement was performed in the present study.
Statistical Analysis
Numeric variables are presented as median (interquartile range) and categorical variables as number (%). We conducted bivariate analyses using the nonparametric Wilcoxon rank sum test and Fisher exact test, as appropriate. We assessed the performance of the AI-ECG and AI-stethoscope models by calculating the area under the receiver operating characteristic curve (AUC) in addition to other measures of diagnostic accuracy—sensitivity, specificity, positive predictive value, and negative predictive value, along with associated 95% CIs. The following previously determined prediction probability thresholds were used: 0.256 for AI-ECG and 0.430 for the AI-stethoscope.
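The accuracy measures described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's analysis code (the analyses were run in R): the function names, the toy data, and the choice to treat a probability at or above the threshold as a positive screen are all assumptions made for the example.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(labels, probabilities, threshold):
    """Dichotomize probabilities at `threshold` and compare with reference labels.

    labels: 1 = LVSD on echocardiography, 0 = no LVSD.
    Assumption: probability >= threshold counts as a positive screen.
    """
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    pairs = list(zip(labels, predicted))
    tp = sum(1 for y, z in pairs if y == 1 and z == 1)
    fn = sum(1 for y, z in pairs if y == 1 and z == 0)
    fp = sum(1 for y, z in pairs if y == 0 and z == 1)
    tn = sum(1 for y, z in pairs if y == 0 and z == 0)
    return {
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "ppv": safe_ratio(tp, tp + fp),
        "npv": safe_ratio(tn, tn + fn),
    }


def rank_auc(labels, probabilities):
    """AUC as the chance that a random case scores above a random non-case.

    Ties count as one-half. This is the Mann-Whitney formulation of the AUC.
    """
    cases = [p for y, p in zip(labels, probabilities) if y == 1]
    controls = [p for y, p in zip(labels, probabilities) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return safe_ratio(wins, len(cases) * len(controls))


# Hypothetical toy data (not study data): 2 cases, 8 non-cases.
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_probabilities = [0.90, 0.20, 0.30, 0.10, 0.05, 0.02, 0.15, 0.08, 0.01, 0.12]

toy_metrics = diagnostic_metrics(toy_labels, toy_probabilities, threshold=0.256)
toy_auc = rank_auc(toy_labels, toy_probabilities)
print(toy_metrics)
print(toy_auc)
```

The AUC does not depend on the threshold, whereas sensitivity, specificity, and the predictive values do, which is why a model can have a high AUC and still show modest sensitivity at a fixed cutoff.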
To evaluate for the presence of positive AI-screens in the study sample, the prevalence of a positive AI screen for LVSD was determined for the 12-lead ECG and digital stethoscope in cohort 2. Given that cohort 1 participants underwent echocardiography as part of ongoing clinical care, we assumed cohort 1 to be a higher-risk sample for which cardiomyopathy prevalence might not be reflective of that of the general clinic patient population. We also evaluated the agreement between positive and negative AI prediction probabilities obtained from the standard 12-lead ECG and digital stethoscope. Missing or poor-quality data were excluded from analysis. We considered a P value <.05 to be statistically significant, and all analyses were conducted using R version 4.2.2 (R Project for Statistical Computing).
RESULTS
A total of 200 women were recruited (100 in each cohort) (Figure 2). Table 1 summarizes the demographic and clinical characteristics of participants in each cohort. The median age of participants in cohort 1 was 40.5 years (quartile 1: 32.5 years, quartile 3: 45.7 years), and 73% of participants identified as White, 17% as Black, and 9% as Hispanic or Latino. One participant identified as a transgender woman, and all others identified as cisgender women. A total of 5%, 4%, and 2% had an LVEF <50%, <45%, and ≤35%, respectively. At the time of enrollment, 3 participants were pregnant, 2 were ≤12 months postpartum, and 95 were not pregnant. Pregnancy status was based on participant reports alone. Table 2 lists echocardiographic parameters for cohort 1 stratified by LVEF status. Women with LVSD (LVEF <50%) tended to have a greater body mass index compared with those without LVSD (34.3 vs 27.2). Left ventricular dimensions and mitral valve E/e′ ratio were significantly increased among women with LVSD.
Figure 2. Study Flow Diagram
Table 1. Demographic and Clinical Characteristics of Study Sample
Table 2. Echocardiographic Parameters Stratified by Left Ventricular Ejection Fraction Among Participants in Cohort 1
Performance of AI-ECG Model (Based on 12-Lead ECG)
The AI-ECG model identified LVSD with an AUC of 0.94 (95% CI, 0.88-1.00) for LVEF <50% (Table 3, Figure 3). Sensitivity, specificity, positive predictive value, and negative predictive value were 40%, 96%, 33%, and 97%, respectively. A sensitivity analysis evaluating the performance of AI-ECG for detection of LVEF <45% is provided in Supplemental Table 1.
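As a worked check, the reported percentages are consistent with a single 2 x 2 table. The counts below are inferred from figures stated in the text (100 participants, 5% with LVSD, 6 of 100 positive AI-ECG screens, 40% sensitivity); they are an assumption for illustration and are not copied from the study's confusion matrix.

```python
# Counts inferred from reported percentages (an assumption, not the published matrix).
n_total = 100
n_lvsd = 5               # 5% of cohort 1 had LVEF < 50%
n_screen_positive = 6    # 6/100 positive AI-ECG screens in cohort 1
tp = 2                   # 40% sensitivity among 5 LVSD cases

fn = n_lvsd - tp                 # LVSD cases missed by the screen
fp = n_screen_positive - tp      # positive screens without LVSD
tn = n_total - n_lvsd - fp       # negative screens without LVSD

sensitivity_pct = round(100 * tp / (tp + fn))
specificity_pct = round(100 * tn / (tn + fp))
ppv_pct = round(100 * tp / (tp + fp))
npv_pct = round(100 * tn / (tn + fn))

print(sensitivity_pct, specificity_pct, ppv_pct, npv_pct)
```

Under these inferred counts, the positive predictive value stays low despite high specificity because only 5 of 100 participants had LVSD, so even a handful of false positives outnumbers the true positives.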
Table 3. Measures of Diagnostic Accuracy for AI Models for Cardiomyopathy Detection Based on Standard 12-Lead ECG and Digital Stethoscope Recordings
Figure 3. Receiver Operating Characteristic Curve and Confusion Matrix for 12-Lead ECG
AI = artificial intelligence; AUC = area under the curve; ECG = electrocardiography; ROC = receiver operating characteristic curve.
Note: The panel on the left shows the ROC curve and diagnostic performance metrics of the AI-ECG model based on 12-lead ECGs. Data are presented as % (95% CI). The panel on the right shows the associated confusion matrix comparing dichotomous AI prediction results with the ground truth (echocardiography results).
Performance of AI-Stethoscope Model (Based on Single-Lead ECG and Phonocardiogram)
Across all stethoscope positions, the highest numeric performance for an individual stethoscope recording location was seen for the angled position (threshold value 0.430), with an AUC of 0.98 (95% CI, 0.96-1.00) for LVSD detection (Table 3). Sensitivity, specificity, positive predictive value, and negative predictive value were 80%, 94%, 40%, and 99%, respectively. Using the maximum prediction (ie, taking into account all stethoscope positions and using the greatest prediction probability obtained to simulate how this device would be used in routine clinical care), the AUC was 0.98 (95% CI, 0.95-1.00). Sensitivity, specificity, positive predictive value, and negative predictive value were 100%, 82%, 23%, and 100%, respectively (Table 3, Figure 4). Sensitivity analyses evaluating the performance of the AI-stethoscope for detection of LVEF <45% and accounting for missing or poor-quality recordings are provided in Supplemental Tables 1 and 2, respectively.
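The maximum-prediction rule can be sketched as follows. This is a minimal illustration, assuming that a recording at or above the 0.430 threshold counts as positive and that missing or poor-quality recordings are simply skipped; the function names, position keys, and probabilities are hypothetical and do not come from the study.

```python
from typing import Dict, Optional

STETHOSCOPE_THRESHOLD = 0.430  # previously determined threshold stated in the text


def max_prediction(probabilities: Dict[str, Optional[float]]) -> Optional[float]:
    """Return the greatest prediction probability across recording positions.

    Positions with no usable recording are passed as None and ignored
    (an assumption about how missing recordings would be handled).
    """
    usable = [p for p in probabilities.values() if p is not None]
    return max(usable) if usable else None


def screen_positive(
    probabilities: Dict[str, Optional[float]],
    threshold: float = STETHOSCOPE_THRESHOLD,
) -> Optional[bool]:
    """Positive if any usable position reaches the threshold; None if no recording."""
    top = max_prediction(probabilities)
    return None if top is None else top >= threshold


# Hypothetical participant: negative at the angled position, positive overall.
example = {"V2": 0.12, "angled": 0.38, "subclavicular": 0.51}
angled_only = {"angled": example["angled"]}

print(screen_positive(example))      # uses the maximum across positions
print(screen_positive(angled_only))  # uses the angled position alone
```

Because one positive position is enough to flag a participant, the maximum-prediction rule can only add positive screens relative to any single position. That is consistent with the pattern reported here: greater sensitivity, lower specificity.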
Figure 4. Receiver Operating Characteristic Curve and Confusion Matrix for Digital Stethoscope (Maximum Prediction)
AI = artificial intelligence; AUC = area under the curve; ECG = electrocardiography; ROC = receiver operating characteristic curve.
Note: The panel on the left shows the ROC curve and diagnostic performance metrics of the AI-stethoscope model (maximum prediction). Data are presented as % (95% CI). The panel on the right shows the associated confusion matrix comparing dichotomous AI-stethoscope prediction results with the ground truth (echocardiography results).
Prevalence of Positive Artificial Intelligence Screen
We found the prevalence of a positive AI screen for LVSD to be low for cohort 2, at 1% (1/100) based on the 12-lead ECG, 3.2% (3/95) based on the AI-stethoscope prediction (maximum prediction), and 0% (0/93) for the angled position. Follow-up echocardiography performed approximately 12 weeks after the positive AI-ECG screen revealed a normal LVEF of 60%. Repeat 12-lead ECG performed at the time of echocardiography had a negative AI screen for LVSD. For cohort 1, the prevalence of a positive AI screen was 6% (6/100) for 12-lead ECGs and 22% for AI-stethoscope maximum prediction (ie, any stethoscope position with a positive prediction was considered), although this cohort was considered a higher-risk sample, given that participants had already been referred for echocardiography by their clinician. Overall, the AI-stethoscope had both greater sensitivity and a greater false-positive rate than AI-ECG.
Correlation Analysis/Agreement
In cohort 1, for which echocardiography results were available, we found that LVSD predictions from the AI-stethoscope showed fair agreement with predictions from the AI-ECG, specifically with the angled position. The agreement analysis showed 78% concordance (κ = 0.132) for the stethoscope maximum prediction (Supplemental Figure 1), whereas the angled stethoscope position showed 89% concordance (κ = 0.324) (Supplemental Figure 2) in dichotomous predictions with 12-lead ECG. Overall, the AI-stethoscope had a greater rate of positive predictions for this cohort.
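To illustrate why high percent concordance can coexist with a low kappa value, the sketch below computes Cohen's kappa for two dichotomous screens. It assumes the agreement statistic reported above is Cohen's kappa; the function name and the toy counts are hypothetical and are not study data.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two equal-length lists of 0/1 screening results."""
    n = len(first)
    observed = sum(1 for a, b in zip(first, second) if a == b) / n
    p_first = sum(first) / n    # share of positives from the first screen
    p_second = sum(second) / n  # share of positives from the second screen
    # Agreement expected by chance if the two screens were independent.
    expected = p_first * p_second + (1 - p_first) * (1 - p_second)
    if expected == 1:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


# Hypothetical 2 x 2 agreement table for 20 participants:
# both positive = 1, ECG-only positive = 1, stethoscope-only positive = 3, both negative = 15.
ecg_screen = [1] * 1 + [1] * 1 + [0] * 3 + [0] * 15
stethoscope_screen = [1] * 1 + [0] * 1 + [1] * 3 + [0] * 15

concordance = sum(
    1 for a, b in zip(ecg_screen, stethoscope_screen) if a == b
) / len(ecg_screen)
kappa = cohen_kappa(ecg_screen, stethoscope_screen)
print(concordance, kappa)
```

In this toy table the two screens agree for 80% of participants, yet kappa is only about 0.23, because most of the agreement comes from the many participants who are negative on both screens and would be expected to agree by chance.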
DISCUSSION
In this pilot validation study, we found that deep learning models using data from a standard clinical ECG and digital stethoscope (ECG + phonocardiography) recordings can potentially be used to detect LVSD among women of reproductive age, with AUC values of 0.94 and 0.98, respectively. We also found the prevalence of false-positive screens, using the 12-lead AI-ECG model, to be low in an unselected patient sample of women seen in the primary care setting.
In the United States, almost one-half of all pregnancies among women aged 15-44 years are unintended.14,27 As such, scheduling and attending a preconception clinic visit is not feasible for many women. Our findings suggest that incorporating AI-based cardiovascular screening into routine primary care for women of reproductive age can positively affect care with minimal drawbacks related to false-positive screens. In addition, the AI-ECG model has been shown to be cost effective.28,29 The present study represents an important validation of existing AI models in a unique patient population—women of reproductive age, who will benefit greatly from cardiovascular screening and evaluation before conception or in early pregnancy. It also offers a potential opportunity to address the unmet need for preconception cardiovascular screening.
In cohort 2, one participant (based on 12-lead ECG) and 3 participants (based on digital stethoscope recordings) had positive AI screens for LVSD during the study. Potential explanations for the false-positive AI-ECG (based on 12-lead ECG) include possible LVEF recovery during the 12-week period after initial ECG or a true false positive due to model sensitivity to lead placement, given that the initial ECG was noted to have limb lead reversal on clinical review. Repeat AI-ECG performed on the date of echocardiography with correctly placed leads was negative. Although the single positive screen from the 12-lead model proved to be a false positive (confirmed by subsequent echocardiography), the low prevalence of positive screens suggests that routinely screening women of reproductive age is unlikely to place a significant additional testing burden on primary care physicians. This prevalence is also in keeping with the literature-reported prevalence of asymptomatic left ventricular dysfunction in a general patient population, ranging from 1% to 2%.13
It is important to note that how these AI tools are used in routine clinical practice can influence diagnostic performance. Whereas the digital stethoscope showed improved sensitivity for detection of left ventricular dysfunction over standard 12-lead ECG in cohort 1, the false-positive rate was also comparatively greater, keeping in mind that this was a higher-risk patient sample. The maximum prediction (if multiple locations are recorded) had a greater false-positive rate (17%, 17/100) than the angled location alone (6%, 6/99). The performance of the AI-stethoscope was optimal at the angled position (in keeping with US Food and Drug Administration approval26), showing a relatively low false-positive rate in the high-risk sample (cohort 1) and no false positives detected in the lower-risk sample (cohort 2). This indicates that the angled position might be ideal for screening (single chest recording), particularly when evaluating how best to implement the use of the AI-stethoscope in clinical practice. Pending larger validation studies, it might be important to consider further refinements to the 12-lead AI-ECG model before deployment for use in women of reproductive age, considering its demonstrated lower sensitivity in this study.
A key contributor to cardiovascular maternal mortality is delayed diagnosis,10 given that heart failure symptoms might be perceived or interpreted as normal pregnancy symptoms. This AI tool could potentially improve diagnosis, leading to earlier treatment and a decrease in adverse maternal outcomes. Whereas multiple studies and guidelines emphasize preconception counseling, assessment, planning, and care delivery,4,9 few interventions have been developed to target women before pregnancy, and preconception care remains underutilized.30-32 With recent technological advancements, AI tools are poised to revolutionize health care and practice with a huge potential to improve women’s health. The present study demonstrates this potential and is an important contribution to the literature in this field. In addition, this study shows that in various settings where clinicians might not have ready access to standard 12-lead ECG, AI tools using novel bedside technologies, such as a digital stethoscope, can achieve very good diagnostic performance. As such, this approach might provide an opportunity to decrease barriers to LVSD detection in nontraditional screening environments (outside of a primary care clinic). With the digital stethoscope, we also showed the optimal recording location to be the angled position, should a single recording be desired for efficiency.
Limitations of this study include a relatively small sample size, with study participants enrolled from a single center, which might limit generalizability of the findings. In addition, participants enrolled in cohort 1, who had echocardiography scheduled by their primary care clinicians, likely represent a higher-risk group and might not be reflective of the patient population seen in the primary care setting. Furthermore, the degree of concordance between the AI-ECG and AI-stethoscope results was modest, which might be related to differences in data input, specifically the number of ECG leads and the incorporation of phonocardiography. Determining the most suitable screening tool, whether an individual tool or a combination of tools, for this unique population will require larger studies.
Nevertheless, this study provides important information regarding the performance of 2 AI models for LVSD screening among women of reproductive age. The addition of an AI-stethoscope supports the potential expansion of AI-based screening to low-resource settings and nonclinical environments. Members of our team have conducted a pilot study among pregnant women33 and a clinical trial evaluating the effectiveness of the models in an obstetric population in Nigeria (NCT05438576), known to have the greatest prevalence of peripartum cardiomyopathy worldwide.10 The results of these additional studies will provide additional data on the effectiveness of these AI tools among young women. Subsequent studies will focus on implementing AI-guided screening into routine primary care for women of reproductive age. This will include longitudinal monitoring to facilitate additional assessment in larger patient samples and ensure that model performance remains robust.
CONCLUSION
The use of AI-ECG and the AI-stethoscope appears to be effective in screening for LVSD among women of reproductive potential in a primary care clinic setting. These tools offer a rapid and cost-effective solution for preconception cardiovascular screening. Larger studies with diverse populations are needed to confirm these findings.
Acknowledgments
We would like to express our appreciation to Dr Jeffrey Wight for his invaluable contributions to this work, including expert advice during protocol development, administrative support during study execution, and helpful manuscript reviews. Manuscript illustrations were created with BioRender (Science Suite Inc).
Footnotes
Annals Early Access article
Conflicts of interest: The Mayo Clinic has licensed the underlying 12-lead electrocardiography technology to Eko Devices Inc, a maker of digital stethoscopes with embedded electrocardiogram electrodes, and to Anumana Inc. The Mayo Clinic may receive financial benefit from the use of this technology, but at no point will the Mayo Clinic benefit financially from its use for the care of patients at the Mayo Clinic. Z.I.A., F.L.J., and P.A.F. may also receive financial benefit from this agreement. All other authors report no conflicts of interest.
Funding support: This study was funded by the Department of Family Medicine, Mayo Clinic Florida. Dr Adedinsewo is supported by the Mayo Clinic Women’s Health Research Center and the Mayo Clinic Building Interdisciplinary Research Careers in Women’s Health Program funded by the National Institutes of Health (grant no. K12 HD065987). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Previous presentation: American College of Cardiology Annual Scientific Session & Expo Together With World Congress of Cardiology; March 4-6, 2023; New Orleans, Louisiana.
- Received for publication December 3, 2023.
- Revision received January 10, 2025.
- Accepted for publication February 4, 2025.
- © 2025 Annals of Family Medicine, Inc.