Abstract
PURPOSE Positive psychology shows promise in improving positive affect and happiness. We tested a digital version of a positive psychology intervention called Three Good Things (3GT) among health care workers to assess whether gratitude practice improved well-being.
METHODS All members of a large academic medicine department were invited. Participants were randomized to an immediate intervention group or control group (delayed intervention). Participants completed outcome surveys (demographics, depression, positive affect, gratitude, and life satisfaction) at baseline, and at 1 month and 3 months post-intervention. Controls completed additional surveys at 4 and 6 months (completion of the delayed intervention). During the intervention, we sent 3 text messages per week asking for 3GT that occurred that day. We used linear mixed models to compare the groups and to look at the effects of department role, sex, age, and time on outcomes.
RESULTS Of 468 eligible individuals, 223 (48%) enrolled and were randomized with high retention through the end of the study. Most (87%) identified as female. For the intervention group, positive affect improved slightly at 1 month, then declined slightly but remained significantly improved at 3 months. Depression, gratitude, and life satisfaction scores showed a similar trend but were not statistically different between groups.
CONCLUSIONS Our research showed that adherence to a positive psychology intervention for health care workers created small improvements immediately post-intervention, but these were not sustained. Further work should evaluate whether a different duration or intensity of the intervention improves benefits.
- health care workers
- psychology, positive
- physicians, primary care
- psychological well-being
- randomized controlled trial
INTRODUCTION
There are well-documented global concerns about mental distress among physicians1 and nurses.2 The COVID-19 pandemic increased rates of burnout among all members of the health care workforce, with a recent meta-analysis finding a prevalence of up to 52%.3 Although numerous studies have reported on the prevalence of distress, high-quality intervention research is limited. Burnout solutions must address structural issues within health care,4,5 but individual interventions may offer some benefit in terms of mental health and positive affect.
Positive psychology emerged as a field in the late 1990s with a particular interest in how positive actions such as gratitude, focusing on positive memories, and identifying personal strengths can improve overall happiness, well-being, mental health, and physical health outcomes.6-11 Gratitude interventions have been adapted across numerous sectors, including the military,12 education,13 and athletics.14
We focused on a specific gratitude intervention called Three Good Things (3GT), developed in 2005 and tested in a variety of health settings with patients and employees.6,15,16 The intervention instructs individuals to write down 3 positive things that happened during the day and to consider their role in these events. While the initial 3GT study by Seligman et al6 ran daily for a week, some studies have extended it to daily for 6 months.16 Within health care, 3GT has shown benefits in reducing emotional exhaustion,16,17 depressive symptoms,17,18 work-life balance problems,19 and intent to leave a job, among other outcomes.20
We wondered whether 3GT could be adapted for busy health care workers to be digital and less frequent and whether it could improve well-being. We conducted a randomized controlled trial to test potential benefits of 3GT on mood, positive affect, gratitude, and satisfaction with life.
METHODS
Recruitment and Randomization
We sent recruitment e-mails to all staff, faculty, residents, and fellows in the department of family medicine within a large academic medical center. To enroll, participants clicked on a Qualtrics (Qualtrics International Inc) survey link and entered a telephone number to receive texts. One author (K.J.G.) used Excel (Microsoft Corp) to automatically randomize telephone numbers 1:1 to Group A for immediate intervention, or Group B for delayed intervention. The research team could see how telephone numbers were randomized to a group but had no identifying information about participants. Participants were not blinded to group assignment.
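The 1:1 allocation described above can be illustrated with a minimal Python sketch. It assumes simple (unrestricted) randomization, in which each telephone number is assigned independently with equal probability; the source does not say which Excel procedure was used, and the function name, seed, and telephone numbers below are hypothetical.

```python
import random


def randomize(phone_numbers, seed=None):
    """Assign each phone number to Group A or Group B with equal probability.

    Simple randomization is an assumption; the article does not describe
    the exact procedure that was run in Excel.
    """
    rng = random.Random(seed)
    return {number: rng.choice(["A", "B"]) for number in phone_numbers}


# Hypothetical, made-up telephone numbers for illustration only.
numbers = [f"555-01{i:02d}" for i in range(10)]
assignments = randomize(numbers, seed=2021)

for number, group in assignments.items():
    print(number, group)
```

Independent assignment of this kind does not force equal group sizes, which would be consistent with the unequal groups reported in the Results (116 and 107), although that does not confirm the method used.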
Intervention
We programmed telephone numbers into a texting program and on Monday, Wednesday, and Friday evenings for 3 weeks participants were automatically texted a Qualtrics survey link. The link invited them to list 3 good things that occurred that day and to consider their role in the events. Non-respondents were sent a follow-up text the next evening (Tuesday, Thursday, Saturday). Group A started the intervention immediately (month 0), and Group B started the intervention after completing the survey at month 3 (Figure 1).
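As an illustration of the texting schedule described above, the following Python sketch lists each prompt evening (Monday, Wednesday, Friday for 3 weeks) together with the next-evening reminder for non-respondents. The start date and function name are hypothetical; the source does not give the calendar date on which the texts began.

```python
from datetime import date, timedelta


def text_schedule(first_monday, weeks=3):
    """Return (prompt_date, reminder_date) pairs for the intervention.

    Prompts go out Monday, Wednesday, and Friday; the reminder date is
    the following evening (Tuesday, Thursday, Saturday).
    """
    if first_monday.weekday() != 0:
        raise ValueError("first_monday must fall on a Monday")
    schedule = []
    for week in range(weeks):
        for offset in (0, 2, 4):  # Monday, Wednesday, Friday
            prompt_day = first_monday + timedelta(weeks=week, days=offset)
            schedule.append((prompt_day, prompt_day + timedelta(days=1)))
    return schedule


# Hypothetical start date (a Monday), chosen only for illustration.
schedule = text_schedule(date(2021, 2, 1))
for prompt_day, reminder_day in schedule:
    print(prompt_day.strftime("%a %Y-%m-%d"), "->", reminder_day.strftime("%a %Y-%m-%d"))
```

Under these assumptions, each participant receives 9 prompts in total, with at most 9 reminders.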
Study Design
We sent a link to an online survey at months 0, 1, and 3 for Group A (3 surveys total) and at months 0, 1, 3, 4, and 6 for Group B (5 surveys total) with up to 2 follow-up reminders for nonrespondents. This helped us assess whether there was a carryover effect beyond the immediate time of the intervention. The study was a randomized controlled trial with an intervention group and a control group (delayed intervention) for the first one-half of the study. Allowing participants in the control group to do a delayed intervention meant everyone could participate in the wellness intervention, and provided more participants for the pre- and post-analysis. Participants completing a minimum number of surveys and texts were eligible for a drawing for 1 of 35 gift cards (worth $100) that were funded by a small grant from our institutional Wellness Office. The study ran from February through August 2021, ending after both groups completed the intervention and surveys. It was approved by the University of Michigan Institutional Review Board and registered on ClinicalTrials.gov.
Outcome Measures
The pre-intervention survey included age, sex, department role (administrative staff, clinical staff, faculty/fellow/resident), and self-report of physical and mental health on a 4-point scale (poor, fair, good, excellent).21 The post-intervention survey included 2 additional questions on time spent on responses and satisfaction with the intervention. All surveys included 4 validated and reliable well-being measures selected for their brevity and wide use in prior positive psychology research.
The outcome measures used were: (1) the 9-item Patient Health Questionnaire (PHQ-9), widely used in primary care to assess depression with scores ranging from 0 (no depression) to 27 (severe depression) and scores of 10 or more indicating possible depression22,23; (2) the 10-item positive affect subscale from the Positive and Negative Affect Schedule-Short Form (PANAS-SF), a valid and reliable scale24-26 with scores ranging from 10 to 50 and higher scores indicating more positive affect; (3) the 3-item Gratitude Adjective Checklist (GAC) which includes self-ratings of gratefulness, thankfulness, and appreciation over the last 2 weeks with total score ranging from 3 to 15 and higher scores indicating greater gratitude27,28; and (4) the 5-item Satisfaction with Life Questionnaire (SAT) which includes 7-point Likert-style questions with total scores ranging from 5 (less satisfaction) to 35 (more satisfaction).29,30
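The total-score ranges above follow from summing item responses, which the Python sketch below makes explicit. Only the 7-point format of the SAT items is stated in the text; the other per-item ranges (0 to 3 for the PHQ-9, 1 to 5 for PANAS-SF and GAC items) are assumptions inferred from the item counts and total ranges. The function and dictionary names are hypothetical.

```python
# (number of items, minimum item score, maximum item score)
# Item ranges other than SAT are inferred from the reported totals.
SCALES = {
    "PHQ-9": (9, 0, 3),                      # total 0 to 27
    "PANAS-SF positive affect": (10, 1, 5),  # total 10 to 50
    "GAC": (3, 1, 5),                        # total 3 to 15
    "SAT": (5, 1, 7),                        # total 5 to 35
}


def total_score(scale, responses):
    """Sum item responses after checking item count and item range."""
    n_items, low, high = SCALES[scale]
    if len(responses) != n_items:
        raise ValueError(f"{scale} expects {n_items} items")
    if any(r < low or r > high for r in responses):
        raise ValueError(f"{scale} items must be between {low} and {high}")
    return sum(responses)


def possible_depression(phq9_total):
    """Apply the cutoff stated in the text: a PHQ-9 total of 10 or more."""
    return phq9_total >= 10


# Hypothetical responses for one participant, for illustration only.
phq9 = total_score("PHQ-9", [1, 2, 1, 0, 1, 2, 1, 1, 2])
print(phq9, possible_depression(phq9))
```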
Sample Size
This pilot project was originally conceived and funded as a wellness project for members of a single department. Thus, we started with a set sample size and did not conduct a pre-study power analysis. Post-hoc analysis showed that we would need 458 participants per group to detect a 1-point change in PHQ-9 score with 80% power and an α level of 0.05.31
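To show how a per-group sample size of this kind can be derived, the Python sketch below uses the standard normal-approximation formula for comparing 2 means, n = 2((z₁₋α/₂ + z₁₋β)·SD/δ)². The source does not report the method or the standard deviation used, so the formula choice and the SD of 5.4 are assumptions made for illustration only.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Participants per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)           # about 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# delta = 1 PHQ-9 point is from the text; sd = 5.4 is a hypothetical value.
n = n_per_group(delta=1.0, sd=5.4)
print(n)
```

With the assumed SD of 5.4, this formula gives 458 per group, matching the reported figure; that agreement does not establish which inputs or software the authors actually used.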
Analysis
For our primary analysis, we compared outcomes for Group A (intervention) and Group B (control). We used χ2 and Fisher’s Exact test to compare self-rated mental/physical health and demographics across the 2 groups. Given the longitudinal assessment and use of repeated measures for each participant at months 0, 1, and 3, we used linear mixed models with our 4 well-being measures as dependent variables. Group, time (months), and group-by-time interaction were used as primary fixed factors. The interaction term helped us evaluate whether the intervention led to greater differences between groups at 1 time point than another. We controlled for age, sex, and role in the department. Sex (male vs female) was used rather than gender due to small subgroup size (n = 1) for non-binary individuals. A random participant level intercept accounted for the clustering within participant—that is, scores vary not only between Group A and Group B but scores from a single participant in Group A at 1 time point are also likely related to their own scores at other time points. Data included 15 (6.7%) people missing values for age, sex, and role, 2 (0.8%) people missing role only, and 1 (0.4%) person missing sex only.
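One way to write the random-intercept model described above is sketched below. The notation is ours rather than the authors', and it assumes that time enters as a categorical factor (months 1 and 3 contrasted with month 0); the source does not give the exact parameterization.

```latex
% Y_{ij}: well-being score for participant i at assessment j
% G_i: 1 if participant i is in Group A (intervention), 0 if Group B (control)
% t_j \in \{0, 1, 3\}: month of assessment j
% \mathbf{x}_i: covariates (age, sex, department role)
% b_i: participant-level random intercept
\begin{align*}
Y_{ij} &= \beta_0 + \beta_1 G_i
        + \sum_{m \in \{1,3\}} \beta_{2m}\, \mathbb{1}[t_j = m]
        + \sum_{m \in \{1,3\}} \beta_{3m}\, G_i\, \mathbb{1}[t_j = m]
        + \boldsymbol{\gamma}^{\top} \mathbf{x}_i + b_i + \varepsilon_{ij}, \\
b_i &\sim \mathcal{N}(0, \sigma_b^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{align*}
```

In this form, the interaction coefficients β₃ₘ capture how much more (or less) the intervention group changed from month 0 to month m than the control group, and the shared intercept bᵢ induces the within-participant correlation described in the text.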
In a secondary analysis, to look at change in each participant over time, we utilized a linear mixed model framework using time (baseline, early assessment, late assessment) as the within-participant factor, and age, sex, and role in the department as the between-participant covariates. We defined baseline, early assessment, and late assessment as months 0, 1, and 3 for Group A and months 3, 4, and 6 for Group B. A random participant-level intercept accounted for the intra-participant correlation. We report estimated mean and standard error obtained from the mixed regression model for overall outcomes as well as variations by role, sex, and age.
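The alignment of survey months to assessment phases in the secondary analysis can be expressed as a small lookup, sketched here in Python. The names are hypothetical; the mapping itself is taken from the definitions above.

```python
# Survey month -> assessment phase, relative to each group's own intervention.
PHASE_BY_GROUP = {
    "A": {0: "baseline", 1: "early", 3: "late"},
    "B": {3: "baseline", 4: "early", 6: "late"},
}


def assessment_phase(group, month):
    """Return the phase label, or None if the survey is not part of the
    pre- and post-analysis (Group B's months 0 and 1 precede its baseline)."""
    return PHASE_BY_GROUP[group].get(month)


for group, months in (("A", (0, 1, 3)), ("B", (0, 1, 3, 4, 6))):
    for month in months:
        print(group, month, assessment_phase(group, month))
```

Pooling both groups on the phase label rather than the calendar month is what allows all participants to contribute to the pre- and post-intervention comparison.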
RESULTS
Demographics
Of 468 eligible individuals within the department, 223 (48%) enrolled and were randomized to Group A (n = 116) or Group B (n = 107) (Figure 2). The month 0 survey was completed by 107 individuals in Group A and 101 individuals in Group B. The late survey was completed by 92 participants of Group A at month 3, and by 82 participants of Group B at month 6 (Figure 2). Of those completing the baseline survey, 87% in Group A and 81% in Group B were retained through study completion. There was no difference in dropout rates between groups. Dropouts were younger on average (37 years vs 42 years, P = .02) and twice as likely to report poor or fair baseline mental health (22% of dropouts vs 11% of completers, P = .04) but not different by sex, role, or scores on the 4 well-being surveys.
Respondents included 39 (19%) administrative staff, 84 (41%) clinical staff, and 83 (40%) faculty, fellows, or residents. Sex was 27 (13%) male, 180 (87%) female, and 1 (0.5%) non-binary/other individual (excluded for multivariable analysis). Median age was 41 years with a range of 22 to 72 years. Demographic characteristics did not vary significantly between groups (Table 1).
Results of Randomized Controlled Trial
Group A (intervention) and Group B (control) showed no significant differences in depression, gratitude, or satisfaction with life scores at months 0, 1, or 3 (Figure 3). For depression and gratitude, scores in the intervention group were favorable immediately after the intervention but gains were mostly lost by month 3 and were not significant. Measures of positive affect were significantly different between groups over time (group-by-time interaction P = .03), particularly in the first month when the intervention group had more than a 2-point jump in scores (vs 0.25 for the control group) that was statistically significant at the 0.05 level. However, gains mostly disappeared by month 3. There were no differences in self-reported mental and physical health ratings between groups. Data for the mixed linear model showing changes at each time point are in Supplemental Table 1.
Results of Pre- and Post-Analysis
Using our linear mixed model, we compared pre-intervention (baseline) and post-intervention (early and late) scores for all participants while controlling for age, role, and sex (Supplemental Table 2). PHQ-9 scores dropped (improved) from baseline to the early assessment (P = .012) and rose nonsignificantly by the late assessment. Comparison between baseline and late assessments, however, showed improvement (P = .035), particularly for males (P = .012). For females, we saw a rebound effect; the mean PHQ-9 score at the early assessment was significantly lower than at baseline (P = .02) and at the late assessment (P = .01), with the baseline and late averages being nearly identical. A similar pattern emerged for positive affect and gratitude scores in females, which rose significantly from baseline to the early assessment but dropped back near baseline by the late assessment and were no longer significantly different. The trend was similar in males but no differences were significant. Satisfaction with life scores did not change significantly at any assessment. Increasing age was associated with small but significant improvements in positive affect (P = .008) and gratitude scores (P = .046), and a small decrease in depression scores (P = .018). Older age was associated with a slight drop in life satisfaction score (P = .029).
Comparing the time-averaged effects across roles, we found no significant differences for PHQ-9 scores. The gratitude averages were lowest among clinical staff, with significant differences both with the faculty/trainees (P = .008) and the administrative staff (P = .02). Similar patterns emerged for positive affect scores, although the difference crossed the threshold for significance only for clinical staff vs faculty (P = .01). Average satisfaction scores were lowest among faculty/trainees with significant differences for both the administrative staff (P = .01) and the clinical staff (P <.001). No other differences were statistically significant.
DISCUSSION
The 3-week digital 3GT intervention did not show statistically significant between-group differences in most well-being outcomes, although all participants showed short-term benefits over time, and positive affect scores remained higher at the 3-month follow-up.
Other studies have shown similar results: an immediate bump in well-being outcomes that attenuates over time.32 Among health workers in a 15-day 3GT intervention, benefits peaked at 1 month but declined to near baseline levels at 6 and 12 months.19 Similar results were reported in studies of 3GT for insomnia,33 3GT for middle-aged women,34 and a recent meta-analysis of 336 papers about positive-psychology interventions.35 Our 3-week, 3-times-per-week intervention was longer than the original study (daily for 1 week)6,7 but shorter than a 2020 study which ran daily for 6 months.20 While the original 3GT study by Seligman et al6 found persistent improvements at 6 months, benefits were limited to people who reported continuing the practices on their own after the week-long intervention ended. Intervention dose matters; in 1 systematic review, 90% of studies with participation daily or 3 to 5 times per week showed benefit compared with 25% of studies requiring once-weekly participation.10 Four-week interventions showed better results than shorter studies.36
Newer meta-analyses of positive psychology interventions have started to weight studies by size, which has yielded substantially smaller effect sizes than early reviews reported.37 Small study bias exists in positive psychology; smaller studies show strong effects (due to larger standard errors) which are tempered in large studies.8
We had excellent retention (similar to the original 7-day study),6 which is important as many 3GT studies have high dropout rates and limited participation. For example, in a 15-day 3GT trial only about one-half of individuals completed the outcome surveys at months 1, 6, and 12 and, on average, participants completed just 9 of the 15 days of the intervention.19 A 7-day trial among middle-aged women had just 32% participation by the 6-month follow-up38 and a meta-analysis of positive psychology interventions at worksites noted average adherence of 45%.39 We believe that asking staff to participate only 3 days per week rather than daily, sending reminders to nonrespondents, and offering gift card incentives all substantially improved retention. One systematic review noted better outcomes with shorter (6- to 7-week) interventions, those engaging participants with e-mails or texts, and those incorporating persuasive technology (eg, tailoring to participants).39 These attributes should guide future research.
It is unclear whether gratitude interventions would be more effective with a targeted population (eg, individuals with high baseline depression scores). Some studies have excluded people with high levels of distress.10 This is an area for further exploration.
Drivers of health worker burnout are related to structural problems within the health care system and will require institutional fixes. Interventions like 3GT may have value as low-cost and simple strategies to improve positive feelings but cannot address burnout alone. However, promoting gratitude as a social norm could theoretically lead to changes in culture or practice while promoting joy in the workplace—benefiting both employees and patients.40,41
Limitations
Despite good participation and retention, the study was conducted in a single academic institution which limits generalizability. As the intervention was open to anyone in a single department, we did not complete a pre-study power analysis. Post-hoc power calculations showed we needed groups 4 times larger than our departmental trial to detect small mood changes; this will inform future trials. Different outcome measures might have given different results; we chose measures based on the gratitude literature, but these may not be outcomes most relevant to health care workers. The well-being outcomes could have been influenced by changing world events and COVID-19 surges during the study period rather than our intervention. We did find small immediate benefits from the intervention, but gains had generally dissipated by 3 months, other than a small but persistent bump in positive affect. While some subgroups had small significant outcomes, the effect is minimal and may not be clinically significant. Participants who did not complete the study through month 3 reported worse self-rated mental health at baseline but did not have significantly different PHQ-9 scores. This might reflect unmeasured conditions such as anxiety or burnout which impacted adherence. As we lack mental health measures on department members who did not participate in the study, we cannot compare participants with nonrespondents. Engagement and retention were likely strengthened by the modest gift card incentives.
We used a texting platform but all responses had to be reviewed by the research team to identify nonrespondents. A platform which permits full automation would reduce the work of implementation. Trialing modifications in future studies such as increasing the dose (frequency of text messages or duration of study) or sending follow-up texts prompting people to think about 3GT beyond the end of the main intervention might also be useful as these appear to have fueled the long-term benefits in the initial study of 3GT by Seligman et al.6 Adding positive and supportive texts has shown benefits in studies with nurses and might enhance a future intervention.42
CONCLUSIONS
Given heightened awareness about health care worker distress, efforts to improve well-being are essential. Although this study showed a small boost directly after the intervention, there was limited demonstration of long-term benefit. There may be substantial benefits for subgroups such that a more tailored application of 3GT might have stronger effects. Given that the intervention had good acceptance and adherence, was low-cost, low-risk, and would be easy to implement if fully automated, it is worth additional study.
Footnotes
Conflicts of interest: authors report none.
Funding support: Funding was provided by the University of Michigan Wellness Office, Workplace Wellbeing Grant Program.
Trial registration: NCT04600076 (ClinicalTrials.gov)
- Received for publication May 12, 2022.
- Revision received November 30, 2022.
- Accepted for publication December 13, 2022.
- © 2023 Annals of Family Medicine, Inc.