Abstract
PURPOSE The purpose of this study was to evaluate a primary care practice–based quality improvement (QI) intervention aimed at improving colorectal cancer screening rates.
METHODS The Supporting Colorectal Cancer Outcomes through Participatory Enhancements (SCOPE) study was a cluster randomized trial of New Jersey primary care practices. On-site facilitation and learning collaboratives were used to engage multiple stakeholders throughout the change process to identify and implement strategies to enhance colorectal cancer screening. Practices were analyzed using quantitative (medical records, surveys) and qualitative data (observations, interviews, and audio recordings) at baseline and a 12-month follow-up.
RESULTS Across the 23 participating practices, the intervention arm did not show a statistically significant improvement in patients’ colorectal cancer screening rates relative to the control arm. Qualitative analyses provide insights into practices’ QI implementation, including associations between how well leaders fostered team development and the extent to which team members felt psychologically safe. Successful QI implementation did not always translate into improved screening rates.
CONCLUSIONS Although single-target, incremental QI interventions can be effective, practice transformation requires enhanced organizational learning and change capacities. The SCOPE model of QI may not be an optimal strategy if short-term guideline concordant numerical gains are the goal. Advancing the knowledge base of QI interventions requires future reports to address how and why QI interventions work rather than simply measuring whether they work.
INTRODUCTION
Quality improvement (QI) approaches vary in the extent to which specific objectives, tools, resources, and change processes are provided and orchestrated by health systems or researchers. On one end of the spectrum, these features are externally imposed on participating organizations/subjects, such as providing physicians with flow sheets,1,2 checklists,3,4 or computer-based reminders,5–8 or distributing patient educational materials.6,9,10 Although such approaches can provide straightforward change mechanisms that ensure generalizability and treatment fidelity, they can pose problems when contextual variables contradict intervention fidelity11 or when motivation to sustain changes wanes once the researchers leave.12
On the other end of the spectrum are approaches where organizations/subjects engage in their own problem identification, and the processes for change emerge internally. These approaches move beyond filling a knowledge deficit on the part of patients or clinicians to enhancing the organization’s capacity and resources for change.13–17 Research on stakeholders—those individuals and groups who have an interest in and are influenced by the organization18—suggests that when stakeholders identify problems and generate their own solutions, they are more likely to engage in and sustain change processes.19 Without the engagement, motivation, and commitment of key stakeholders within an organization, even meritorious innovations may be abandoned before they have had the chance to be effective.20
We report the results of the Supporting Colorectal Cancer Outcomes through Participatory Enhancements (SCOPE) study, which combined features on both ends of this spectrum. The study imposed on participating primary care practices a specific goal, to improve colorectal cancer (CRC) screening rates, and a change process consisting of a series of facilitated team meetings and learning collaboratives. The use of practice facilitators to guide QI efforts21 and learning collaboratives to stimulate cross-practice learning22–26 has received growing attention as robust methods for translating evidence-based guidelines into practice. Within these parameters, the study tailored the change process, allowing practice members to generate their own QI objectives and strategies in hopes of enhancing practices’ capacity for change.
METHODS
SCOPE was a cluster randomized trial designed to evaluate the effectiveness of a tailored intervention on CRC screening rates in primary care practices. The study design incorporated a mixed-methods evaluation to assess practice-level variation in intervention fidelity and experiences.27 CRC screening was selected because of its documented benefits for reducing morbidity and mortality, and its proven cost-effectiveness.28–31
The unit of randomization and intervention was the practice, whereas the unit of observation of outcomes was patients within each practice. SQUIRE32,33 and CONSORT34 guidelines served as a framework for implementing the intervention and reporting findings. The study was approved by the University of Medicine and Dentistry of New Jersey Institutional Review Board, and informed consent was obtained from participating practice members and patients.
Intervention
The 6-month intervention included 3 integrated components: a multimethod assessment process (MAP),27,35 a reflective adaptive process (RAP),35–37 and learning collaboratives.23–26,38–40 Key study personnel included 6 doctoral- or master’s-level professionals who served as both qualitative researchers and QI facilitators. Most had experience in qualitative data collection methodologies and received facilitation training to ensure consistent implementation of the intervention. Most did not have expertise in cancer screening.
During the 3-day assessment, study personnel systematically observed practices and conducted interviews with clinicians and staff.27 Study personnel used an observation template to guide data collection and ensure consistency. After the MAP, study personnel shifted into facilitator mode and prompted the formation of a RAP team in each practice, which drove the practice’s CRC screening improvement efforts. RAP teams engaged in 2 cycles of meetings, with each cycle consisting of approximately 4 to 6 meetings. Although the facilitators guided the teams through the change process,37,41,42 decision making and QI work rested with the practice members.
The intervention also included 2 day-long learning collaboratives held after the first and second RAP cycles to foster cross-practice learning.25 Two representatives from each practice, including at least 1 physician, were requested to attend. The curriculum included a mix of didactic presentations from experts on cancer screening, cancer survivorship, and organizational change, followed by reflective discussions. Key points included the value of all recommended screening modalities, colonoscopy as the only method that can prevent CRC, and barriers to CRC screening.
Practice and Patient Sample
Power calculations indicated that for a 2-group t test of follow-up screening rates conducted at the .05 significance level, a sample of 24 practices evenly split between control and intervention groups with 30 patients per practice would give 90% power to detect an absolute increase of 23 percentage points in screening rates (from 31% to 54%). These calculations were based on estimates from previous data with an average baseline screening rate of 31% and an intracluster correlation (ICC) coefficient of 0.38.35
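The design figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only and does not reproduce the authors' power calculation, which was based on a t test of practice-level rates. It assumes the standard design-effect formula for equal-sized clusters, 1 + (m − 1) × ICC, and the function names are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is 'worth'."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


# Planning values stated in the text.
baseline, target = 0.31, 0.54
patients_per_practice = 30
practices_per_arm = 12
icc = 0.38

absolute_change = target - baseline            # in proportion units
relative_change = absolute_change / baseline   # relative to baseline

deff = design_effect(patients_per_practice, icc)
n_eff_per_arm = effective_sample_size(practices_per_arm, patients_per_practice, icc)

print(f"absolute change: {absolute_change * 100:.0f} percentage points")
print(f"relative change: {relative_change * 100:.0f}%")
print(f"design effect: {deff:.2f}")
print(f"effective patients per arm: {n_eff_per_arm:.1f}")
```

The distinction between the absolute change (percentage points) and the relative change (percent of baseline) matters when comparing planned and observed effects across studies with different baseline rates.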
Practices were recruited from the New Jersey Primary Care Research Network, as well as the general population of primary care practices in New Jersey. Practices that agreed to participate were randomized to either the intervention arm or control arm of the study.
A consecutive sample (a type of nonprobability sampling that seeks to include all accessible and eligible subjects as part of the sample)43 of 30 patients aged 50 years or older was recruited from waiting rooms of each practice at baseline and the 12-month follow-up, constituting independent samples of patients. Descriptions of recruitment are published elsewhere.44 New patients and those who could not read or write English or Spanish were excluded. Patients were surveyed, and screening information on various cancers was extracted from their medical records.
Data for Quantitative Outcomes
CRC screening rates and physician recommendation for CRC screening were determined by medical record review.45 Trained chart auditors used a standardized chart abstraction tool, and interrater reliability analyses were conducted as part of ongoing quality checks. Patients were considered to be up-to-date on CRC screening if there was documentation of having received any tests in the recommended time period based on 2005 recommendations from the American Cancer Society: fecal occult blood test (FOBT) within 1 year, sigmoidoscopy or barium enema within 5 years, or colonoscopy within 10 years.46 Information was not collected on whether the tests were done for screening or for diagnosis of symptoms or abnormal physical findings. Because patients and practice members were not blinded to the focus of the study, we excluded data from the day of patient recruitment to minimize potential Hawthorne effects.
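The up-to-date rule described above can be expressed as a small function. This is a minimal sketch, not the chart abstraction tool used in the study: the test names, the data layout, and the choice to treat the window boundary as inclusive are assumptions made for illustration. Tests dated on the recruitment day are ignored, mirroring the exclusion described in the text.

```python
from datetime import date

# Look-back windows in years, from the intervals quoted in the text.
# The dictionary keys are hypothetical labels, not the study's coding scheme.
LOOKBACK_YEARS = {
    "fobt": 1,
    "sigmoidoscopy": 5,
    "barium_enema": 5,
    "colonoscopy": 10,
}


def years_before(d, years):
    """Return the date `years` years before `d` (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def is_up_to_date(tests, recruitment_date):
    """tests: iterable of (test_name, test_date) pairs abstracted from a chart."""
    for name, test_date in tests:
        window = LOOKBACK_YEARS.get(name)
        if window is None:
            continue  # not one of the CRC tests covered by the rule
        earliest = years_before(recruitment_date, window)
        # Strictly before the recruitment day, so same-day tests do not count.
        if earliest <= test_date < recruitment_date:
            return True
    return False


recruited = date(2007, 6, 15)
print(is_up_to_date([("colonoscopy", date(1999, 3, 1))], recruited))
print(is_up_to_date([("fobt", date(2006, 1, 10))], recruited))
```

In this example the colonoscopy falls inside its 10-year window, whereas the FOBT is older than 1 year and does not count.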
Statistical Analysis
Percentages summarized the distributions of patient and practice characteristics. The percentages of (1) patients for whom the practice met screening guidelines and (2) patients with appropriate screening or recommendation for screening in the medical chart were calculated at baseline and follow-up for each group (intervention and control). ICCs for these outcomes at baseline were calculated. An intent-to-treat analysis assessed the main effect of the intervention by comparing the odds of improvement for intervention practices with those for control practices. Specifically, within each group a Mantel-Haenszel common odds ratio was estimated stratifying by practice, thus accounting for clustering of responses within practices. A Z test was then used to assess whether the log-odds of improvement differed significantly between groups. A Breslow-Day test assessed homogeneity across practices in the odds of improvement within each group. Sensitivity analyses included 2-group t tests comparing the average improvement in screening rates between groups, measured within practice as a difference in proportion screened at follow-up minus that at baseline. This approach follows in principle that described by Donner and Klar47 for follow-up screening rates when not controlling for baseline. All analyses were conducted using SAS software (SAS Institute Inc).
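The main comparison described above can be sketched in plain Python. The analyses in the study were run in SAS, so this is an illustrative reconstruction under stated assumptions rather than the authors' code: each practice contributes one 2×2 table (follow-up vs baseline by screened vs not screened), the variance of the log odds ratio uses the Robins-Breslow-Greenland estimator, and all counts shown are hypothetical.

```python
import math


def mantel_haenszel(tables):
    """Common odds ratio across strata (practices).

    tables: list of (a, b, c, d) counts per practice, where
      a = screened at follow-up,  b = not screened at follow-up,
      c = screened at baseline,   d = not screened at baseline.
    Returns (log odds ratio, Robins-Breslow-Greenland variance).
    Assumes the pooled sums R and S below are both positive.
    """
    R = S = PR = PS_QR = QS = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        P, Q = (a + d) / n, (b + c) / n
        r, s = a * d / n, b * c / n
        R += r
        S += s
        PR += P * r
        PS_QR += P * s + Q * r
        QS += Q * s
    log_or = math.log(R / S)
    var = PR / (2 * R**2) + PS_QR / (2 * R * S) + QS / (2 * S**2)
    return log_or, var


def compare_groups(intervention, control):
    """Z test for a difference in log odds of improvement between arms."""
    lo_i, v_i = mantel_haenszel(intervention)
    lo_c, v_c = mantel_haenszel(control)
    z = (lo_i - lo_c) / math.sqrt(v_i + v_c)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return z, p


# Hypothetical counts for 3 practices per arm, 30 patients per time point.
intervention = [(16, 14, 14, 16), (18, 12, 15, 15), (12, 18, 13, 17)]
control = [(11, 19, 13, 17), (12, 18, 12, 18), (14, 16, 15, 15)]

z, p = compare_groups(intervention, control)
print(f"z = {z:.2f}, two-sided P = {p:.3f}")
```

Stratifying by practice keeps each practice's baseline and follow-up samples paired within a stratum, which is how the clustering of patients within practices enters the estimate. The Breslow-Day homogeneity test and the sensitivity t tests are not covered by this sketch.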
Data for Qualitative Assessments
Qualitative data included MAP field notes and audio-taped RAP and learning collaborative meetings. Field notes of RAP meetings and learning collaboratives were written to capture elements not available from audio-recordings, such as group dynamics. Six- and 12-month follow-up visits were completed to assess longer term effects of the intervention. Data were de-identified to ensure confidentiality.
Qualitative Analysis
An immersion/crystallization technique was used to analyze the qualitative data.48 Descriptive case summaries were written for each practice and discussed in detail with the coauthors to identify initial patterns and themes. During this analytic process, 6 characteristics emerged as key contributing factors for the teams’ QI implementation: (1) team structure, defined as consistency of RAP team membership; (2) leadership, defined as how well formal practice leaders fostered team development and participated in QI efforts; (3) engagement, defined as participation by team members in the RAP meeting discussions and QI efforts; (4) psychological safety, defined as evidence of interpersonal risk-taking, such as voicing dissenting opinions or critical perspectives on QI efforts; (5) intracommunication, defined as communication among RAP team members regarding QI efforts; and (6) intercommunication, defined as communication between the RAP team and the rest of the practice regarding QI efforts. Each practice was then ranked along a continuum of strong, moderate, or weak on each characteristic. Implementation characteristics were explored using a comparative case study analysis. Any discrepancies in how the coauthors interpreted the findings were discussed to reach consensus.
RESULTS
Twenty-five practices consented to participate and were randomized to either the intervention arm (n = 12) or control arm (n = 13) (Supplemental Figure 1, available at http://annfammed.org/content/11/3/220/suppl/DC1). Early on, 2 practices closed (1 intervention, 1 control). To ensure an adequate intervention group sample, 1 control practice was randomly selected to be in the intervention group, thus providing a final sample size of 23 practices (12 intervention, 11 control). Of the 23 practices, the average number of physicians was 4 (min/max = 1 to 11). All were family or internal medicine practices, and only 1 was a residency practice (P16); 83% of practices were located in suburban settings. The average length of practice existence was 11.7 years.
Of the 12 intervention practices, 7 fully engaged in the intervention, 2 practices (P17 and P21) failed to participate in the intervention, and 3 others never fully engaged in developing collaborative processes as intended by the study (P7, P11, and P15) (Supplemental Table 1, available at http://annfammed.org/content/11/3/220/suppl/DC1).
At baseline, 80% (n = 791) of eligible patients consented to participate in the study; 67% (n = 723) of eligible patients participated at the 12-month follow-up (Supplemental Figure 2, available at http://annfammed.org/content/11/3/220/suppl/DC1). On average, 37% of patients had Medicare or Medicaid insurance. A total of 1,315 charts were audited for this study. Patient characteristics are presented in Table 1.
Quantitative Findings
Baseline CRC screening rates by practice ranged from 14% to 93%, with the average being 46%. At baseline, the outcomes (whether patients were appropriately screened or whether they received a screening recommendation and screening) had ICCs of 0.18 and 0.19, respectively.
The percentage of patients appropriately screened for CRC decreased among control practices (43% to 38%) and increased among intervention practices (49% to 53%). The percentage of patients screened or receiving physician recommendations decreased from 62% to 58% in control practices and increased from 67% to 71% among intervention practices. These differences were not statistically significant, however (Tables 2 and 3).
Within each treatment arm, practices were heterogeneous with respect to changes in odds of screening (Breslow-Day test P = .001 and P <.001 for control and intervention practices, respectively). When screening modalities were examined, FOBT use decreased substantially among the intervention practices (Table 4).
This change was largely due to a single practice (P10) that improved their colonoscopy rates but dramatically reduced their use of FOBT.
Qualitative Findings
The quantitative analysis revealed considerable variation in screening rate changes across practices; therefore, we conducted a qualitative analysis to understand the context of practices’ QI implementation to shed light on factors contributing to this variation. Of the 12 intervention practices, 7 were high performers based on rankings of moderate to strong on all or most of the QI implementation characteristics (Table 5).
Three practices (P7, P11, and P15) were low performers based on the ranking of weak to moderate on most of the QI implementation characteristics. Overall, most had moderate to strong team structure, engagement, and intracommunication; most practices also evidenced weak intercommunication. Despite repeated attempts by study personnel to address participation challenges, 2 practices (P17, P21) failed to engage in the intervention at all. In both cases there was evidence that poor communication between practice leaders and other members led to misunderstandings about their participation. Also, practice members reported being overwhelmed with co-occurring events in the practice, such as electronic health record implementation or practice ownership changes.
One pattern was evident across the high- and low-performing practices. The high-performing practices had moderate to strong leadership (except for P22) and psychological safety for this QI intervention, whereas all 3 of the low-performing practices evidenced weak leadership and psychological safety. Although this finding does not signify a causal relationship, it suggests an association between how well leaders fostered team development and the extent to which team members felt safe to engage in the change process.
Using the qualitative 12-month follow-up data, we also found evidence suggesting that the high-performing practices improved their capacity for change more so than the low-performing practices. Three of the high-performers continued to use the team-based RAP model in an adapted form (eg, RAP meetings integrated into practice meetings), and there was evidence that 2 of these practices applied this model to other (non–CRC-focused) QI efforts. In contrast, none of the low performers continued a reflective adaptive process in any form or used the model for other improvements. Major practice changes (such as ownership changes and practice leader turnover) that had occurred by the 12-month follow-up in several practices may have affected their use of the SCOPE model after the intervention ended.
While the preceding results speak to variation across practices, we also explored variation regarding the within-practice congruity of qualitative and quantitative results. One anomaly was evident in practices that did well on the QI implementation characteristics but poorly on their CRC screening rates. The converse reflected a second anomaly. We therefore selected 3 case studies to further explicate connections between practices’ implementation process and their changes in screening rates. P2 illustrates what we hoped for in an intervention study: a practice that had excellent implementation characteristics and had positive increases in their CRC screening rates (Supplemental Appendix 1, available at http://annfammed.org/content/11/3/220/suppl/DC1). This practice had strong relationships as evidenced by a cohesive team, open discussions of proposed QI changes, and a psychologically safe environment where practice members felt comfortable critically reflecting on the current state of the practice. Data and peer stimulus proved to be powerful motivators for their improvement.
In contrast, P10 had a moderate to strong QI implementation yet experienced a dramatic decrease in their CRC screening rates, from 71% to 33% (Supplemental Appendix 2, http://annfammed.org/content/11/3/220/suppl/DC1). For most of the intervention period, the RAP team addressed practice “chaos” and communication issues, and little time was devoted to direct CRC improvement efforts. Although there are likely multiple factors contributing to this decrease, it is plausible that the intervention had an unintended effect on the practice’s screening rates, suggesting that this intervention may have had differing effects—beneficial and adverse—on different types of practices.
Lastly, P15 illustrates a practice that was ranked as weak to moderate on QI implementation but experienced an improvement in CRC screening rates, from 50% to 67% (Supplemental Appendix 3, available at http://annfammed.org/content/11/3/220/suppl/DC1). The primary physician in the practice acknowledged that being involved in this project increased his diligence to screen for CRC. Ultimately, the primary physician’s concerted efforts to screen better seemed sufficient to positively affect their screening rates.
DISCUSSION
Project SCOPE tested an intervention model that used a facilitated, team-based approach to improve CRC screening rates in primary care settings. Facilitators tailored the change efforts according to the particular culture and perceived needs of each practice. Although CRC screening rates were emphasized as the focus of the intervention, specific QI objectives and plans rested with the practice members. A central assumption was that getting multiple stakeholder buy-in through this approach would enhance motivation and commitment to the change process. An explicit goal of the study was to develop a practice change model (using CRC screening as an initial focus) that could then be replicated for ongoing change efforts.
Most SCOPE practices were successful in several QI implementation characteristics, including team structure, team member engagement, and intrateam communication. Except for 2 practices that opted not to participate in the intervention, all others formed a RAP team, sent representatives to the learning collaboratives, and worked on 1 or more QI plans, suggesting that this type of intervention model is viable in primary care settings of varying size and structure. Variation between high- and low-performing practices, however, was evident in how well leaders fostered team development and the extent to which team members felt psychologically safe to take risks during the change process. Most teams were not adept at communicating their QI plans with the rest of the practice regardless of size. Moreover, only a few practices adapted the RAP model for use as an ongoing method to identify and work on continuous QI efforts. Organizational disruptions likely affected several practices’ progression of their change capacity. Previous analyses have explored, in depth, additional aspects of QI implementation from the SCOPE trial.25,49
Despite certain successes regarding practices’ QI implementation, overall SCOPE did not yield statistically significant improvements in CRC screening rates. Importantly, the integration of qualitative methods into the study design allowed us to answer recent calls to explore the implementation context of null trials.21 Several lessons learned from SCOPE are important to consider for future interventions.
One lesson was that allowing RAP teams to choose their own QI objectives and plans meant that some practices chose issues that were not directly related to CRC screening. RAP teams that focused on poor communication or chaos in the practice viewed these issues to be of sufficient priority that they needed to be addressed before the teams could delve into concrete clinical improvements. Facilitators prompted teams to keep CRC screening in the foreground, but their discussions often maintained a broader focus on practice dynamics and operations. Although potentially beneficial for the organization in other ways, this aspect of tailoring likely diminished their CRC screening improvements in the time frame of the study.
Another lesson pertained to the notion of spread. RAP teams typically included 1 clinician, and there was variability in how well these leaders fostered a climate of change for the entire practice. Other clinicians in a practice tended not to be aware of or engaged in the CRC improvement efforts, and RAP teams tended to communicate poorly with the rest of the practice regarding QI plans. Even though facilitators emphasized the importance of practice-wide communication regarding their QI efforts, ultimately this responsibility rested with the teams. As a result, segments of a practice improved their capacity for change and CRC screening efforts, but most practices were unsuccessful in effecting organization-wide improvements.
Additionally, because RAP teams were made up of diverse practice members where differing levels of administrative power were evident, teams needed a sense of psychological safety and trust50,51 that supported critical reflection of the change process.49 Facilitators helped foster a safe environment, but it often hinged on the role of practice leaders. High-performing practices had strong to moderate leadership and psychological safety, whereas low-performing practices were weak in both of these areas. Future interventions must pay attention to the role of practice leaders given their influence on team dynamics and the change process. Interventions using a team-based approach may benefit from incorporating instructional components for practice leaders to enhance their knowledge and skills in leading QI teams.
Lastly, SCOPE employed generalist facilitators who had expertise in organizational change and group process but not in cancer screening. As such, facilitators were not relied upon for giving practices CRC solutions. Instead, practices were encouraged to develop their organizational learning capacity to identify and implement their own solutions. Although the facilitators were well-suited to concentrate on the change process—eg, prompting teams to confront and deal with barriers to change—having facilitators who also had expertise in the target condition likely could have had more direct benefits on practices’ CRC screening efforts.
We recognize several limitations to our study. Power to detect an intervention effect was limited by the small number of participating practices, which was compounded by the failure of 2 practices to participate in the intervention as prescribed. This lack of fidelity would lead to attenuation of the intervention effect and, thus, reduced power. Moreover, the higher than expected CRC screening rates on average affected our ability to detect a significant increase in CRC screening rates. We also would expect that the volunteer practices in our sample were sufficiently motivated to improve CRC screening rates, which may differ from practices in general. Lastly, based on previous work showing that the level of uncertainty associated with a disease is a critical factor for intervention design,52 we acknowledge that CRC screening may preclude the need for an intensive, team-based model of practice improvement53–55 and, therefore, may have affected the change process and, consequently, the intended goal of extrapolating the change model to other diseases or areas of improvement.
Various QI approaches and methods can be effective in achieving targeted outcomes. Yet practice transformation, such as that envisioned by the patient-centered medical home, cannot be realized through only a series of incremental QI projects. Developing greater organizational learning and change capacities is required. The SCOPE intervention sought to bridge the gap between an externally orchestrated, single-target intervention and full-scale, emergent practice transformation. The response of practices to the SCOPE intervention suggests that this QI approach (ie, MAP/RAP, including facilitated team meetings and learning collaboratives) may not be an optimal strategy for single-target interventions, particularly if short-term guideline concordant numerical gains are the goal. The MAP/RAP approach provides considerable flexibility in the improvement focus a practice can take, as well as the strategies to get there. If improving performance measures for a preselected target, such as CRC screening rates, is the focus, perhaps a more traditional targeted continuous QI approach would be more appropriate. Nevertheless, because there are so many potential disease-specific and patient-centered targets in need of improvement in primary care, relying on a series of single-target QI interventions may not be realistic.
Methodologically, the SCOPE study shows that quantitative and qualitative findings should not be seen as a way to merely confirm or disconfirm each other. In some cases, SCOPE results reveal discordance in the 2 types of data, which might tempt us to think one or the other is wrong. Rather, integrating both views into an overarching analysis of the study provides a richer understanding of the intervention. Advancing the knowledge base of QI interventions requires future reports to address how and why QI interventions work rather than simply measuring whether they work.
Footnotes
- Conflicts of interest: authors report none.
- Funding support: This research was supported by a grant from the National Cancer Institute (R01 CA112387-01). Dr Crabtree’s time was supported in part by a senior investigator grant from the National Cancer Institute (K05 CA140237).
- Received for publication February 24, 2012.
- Revision received September 24, 2012.
- Accepted for publication October 19, 2012.
- © 2013 Annals of Family Medicine, Inc.