Abstract
PURPOSE To develop and evaluate a concise measure of primary care that is grounded in the experience of patients, clinicians, and health care payers.
METHODS We asked crowd-sourced samples of 412 patients, 525 primary care clinicians, and 85 health care payers to describe what provides value in primary care, then asked 70 primary care and health services experts in a 2½ day international conference to provide additional insights. A multidisciplinary team conducted a qualitative analysis of the combined data to develop a parsimonious set of patient-reported items. We evaluated items using factor analysis, Rasch modeling, and association analyses among 2 online samples and 4 clinical samples from diverse patient populations.
RESULTS The resulting person-centered primary care measure parsimoniously represents the broad scope of primary care, with 11 domains each represented by a single item: accessibility, advocacy, community context, comprehensiveness, continuity, coordination, family context, goal-oriented care, health promotion, integration, and relationship. Principal axes factor analysis identified a single factor. Factor loadings and corrected item-total correlations were >0.6 in online samples (n = 2,229) and >0.5 in clinical samples (n = 323). Factor scores were fairly normally distributed in online patient samples, and skewed toward higher ratings in point-of-care patient samples. Rasch models showed a broad spread of person and item scores, acceptable item-fit statistics, and little item redundancy. Preliminary concurrent validity analyses supported hypothesized associations.
CONCLUSIONS The person-centered primary care measure reliably, comprehensively, and parsimoniously assesses the aspects of care thought to represent high-value primary care by patients, clinicians, and payers. The measure is ready for further validation and outcome analyses, and for use in focusing attention on what matters about primary care, while reducing measurement burden.
INTRODUCTION
Measures matter because they focus the precious commodity of attention.1 Increasingly, measures also are used to concentrate material resources and infrastructure, and even to influence the right to practice, often with unintended consequences.2,3
Ideally, measures should provide information that is understandable and actionable by key stakeholders.1,4 The growing number of patient-reported measures, including the Patient-Reported Outcomes Measurement Information System measures,12 recognizes the patient as the most knowledgeable informant about many important aspects of care.5–11
Narrowly focused measures make sense for narrowly focused care. But primary care, subjected to the largest burden of measurement,13,14 is also subjected to a measurement model that does not match the importance of much that it does.1,15–18 Adding up disease-specific measures misses and devalues the higher-level functions of integrating, personalizing, and prioritizing care for people and populations.1,13,19
A number of measures have been developed to assess different aspects of primary care.10,11,20–25 Unfortunately, they tend to be long and seldom used outside of the research setting. Clinical primary care settings often turn to patient experience surveys, such as the Clinician and Group Consumer Assessment of Healthcare Providers and Systems survey, which researchers have recently sought to shorten in order to increase its use.26 Patient experience measures focus important attention on the consumer experience of care delivery and receipt of services, but fall short of focusing attention on the broad scope of primary care.1,15
Needed is a measure, grounded in the combined experiences of patients, clinicians, and payers, that engages the most informed reporter—the patient—to assess vital functions of primary care that are lost by current reductionist measures. Such a measure must be responsive to the current clinical environment that is buckling under the weight of measures that are onerous to manage and time consuming to complete.15,17,20
Therefore, we set out to (1) identify what matters in primary care; (2) use that understanding to develop a parsimonious measure of what matters, made practical to use by assessing each domain with a single item; and (3) conduct reliability and preliminary concurrent validity analyses.
METHODS
We used a multistep approach to identify and refine a parsimonious set of items assessing the related processes that constitute high-value primary care. We then evaluated the factor structure, internal consistency reliability, and concurrent validity of a measure developed from these items. Each part of the process was determined to be exempt regarding human subjects research by the Virginia Commonwealth University Institutional Review Board.
Advance Work to Conceptualize Measurement Domains From Diverse Perspectives
As previously described, we conducted a crowd-sourced survey to identify a preliminary set of quality indicator domains of greatest use and importance to diverse stakeholders.15 We chose to use crowdsourcing—widespread distribution among a large group of Internet-based volunteers—as a method to gain participation of populations not usually included in measure development. Using this method, we fielded open-ended and structured questions to patients, clinicians, and employer-purchasers of health care plans.
We asked 525 primary care clinicians, 412 patients, and 85 employers how they know good care when they see it, and what aspects of primary care are most important to them. Most (65%) clinicians (n = 525) were nonacademically affiliated and spanned diverse settings (eg, private practice, community health centers, multi-specialty groups). Responding patients were 54% female, 20% self-identified members of a minority group, well-distributed among age groups (17% aged 18-29 years, 29% aged 30-44 years, 24% aged 45-60 years, 30% aged over 60 years), and represented all 50 states. Most reported having a usual source of primary care (83%). A snowball sampling method was used to reach purchasers of health plans for organizations employing 1 to >100,000 employees. A multidisciplinary team analyzed data using a grounded approach to identify 18 quality indicator domains for primary care, representing those areas with a majority of overlap among patients, clinicians, and payers.15
A list of the 18 quality indicator domains, circulated with conference briefs and primer reading materials, was distributed to attendees of the Starfield Summit III in Washington, DC, October 4-6, 2017. From there, we convened a diverse group of experts to consider what matters about primary care and how it can be measured.27 Participants included patients, practicing clinicians (in nursing, social work, family medicine, pediatrics, internal medicine, and occupational therapy), international primary care leaders, actuaries, insurers, employers, policy makers, and professional association leaders. During the conference, the 18 previously identified quality domains were refined, revised, and reduced to 11 domains. Subsequently, a multidisciplinary team (the authors) immersed themselves in the recorded data and notes, and extracted the most salient arguments and overlapping interests of attendees to generate the precise wording of each domain item in the first draft of the proposed measure. That draft was shared with all conference participants for member checking, and with external experts to confirm likely ease of use.
This deeply engaged process, involving over 1,000 stakeholders of varying backgrounds, diverse opinions, and competing interests, resulted in the development of a parsimonious set of patient-reported items for assessing what provides value in primary care.28
Measure Fielding, Description, and Psychometric Analyses
Evaluation Samples
In order to evaluate the psychometric properties of this new Person-Centered Primary Care Measure (PCPCM), we sampled 2 populations: people responding to the measure outside the context of a specific care visit (online sample) and those responding to the measure at the point-of-care (clinical sample).
We identified the online sample through SurveyMonkey by requesting a minimum of 1,000 participants with diversity in geography, income, age, and sex. An initial online sample was used to assess factor structure; a second online sample was used for cross-validation.
Following the online sample, we engaged a clinical sample of 323 participants with an almost even distribution among the following primary care settings: a community health center with a predominantly Medicaid population, independent and hospital-owned private family practices, and a pediatric hospital-owned practice. Surveys were offered to consecutive patients in waiting rooms by a research team member after a care episode. Participation was voluntary and uncompensated, and most patients were able to complete the PCPCM in under 2 minutes. Among consecutive patients, approximately 50% declined participation.
Analysis Process
We conducted 3 sets of psychometric analyses on the online exploratory and validation samples, and on the combined clinical sample.
The first psychometric analysis focused on construct identification. In order to identify the number of constructs represented among items identified by our process, we conducted exploratory principal axes factor analysis in the initial online sample, and repeated the analysis in a second online sample. We further confirmed the resulting single factor by examining the scree plot and the large Eigenvalue. We interpreted factor loadings of >0.4 as a good association between the item and the construct being assessed, loadings >0.6 demonstrating a strong association, and those >0.8 demonstrating a very strong association.
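The interpretive thresholds above can be sketched as a small helper. This is only an illustration of the stated cutoffs, not the analysis code used in the study; the function name, the use of absolute values for negative loadings, and the label for loadings at or below 0.4 are assumptions introduced here.

```python
def describe_loading(loading: float) -> str:
    """Label a factor loading using the cutoffs stated in the text.

    Assumption: the magnitude of the loading is what matters, so a
    negative loading is classified by its absolute value.
    """
    magnitude = abs(loading)
    if magnitude > 0.8:
        return "very strong"
    if magnitude > 0.6:
        return "strong"
    if magnitude > 0.4:
        return "good"
    # The text does not name this band; the label is hypothetical.
    return "below threshold"


# Hypothetical loadings, for illustration only (not study data).
example_loadings = {"item_a": 0.85, "item_b": 0.65, "item_c": 0.45}
labels = {name: describe_loading(value) for name, value in example_loadings.items()}
```

The comparisons are strict, matching the ">" in the text, so a loading of exactly 0.6 is labeled "good" rather than "strong".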
The second psychometric analysis examined 2 types of reliability. Corrected item-total correlations were computed for each of the 11 core items, followed by Cronbach’s α reliability coefficient for the scale as a whole.
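For readers less familiar with these two statistics, the following is a minimal sketch using the standard textbook definitions: the corrected item-total correlation is the Pearson correlation between an item and the sum of the remaining items, and Cronbach's α is k/(k-1) × (1 − sum of item variances / variance of total scores). The function names and the toy data are assumptions for illustration; the statistical software used in the study is not implied.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def corrected_item_total(rows):
    """For each item, correlate it with the sum of all *other* items."""
    items = list(zip(*rows))            # one tuple of responses per item
    totals = [sum(row) for row in rows]  # total score per respondent
    return [
        pearson(col, [t - v for t, v in zip(totals, col)])
        for col in items
    ]


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items table of scores."""
    k = len(rows[0])
    items = list(zip(*rows))
    totals = [sum(row) for row in rows]
    item_variance_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Toy data: 4 hypothetical respondents answering 3 items (not study data).
toy_rows = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 3, 4]]
toy_alpha = cronbach_alpha(toy_rows)
toy_item_total = corrected_item_total(toy_rows)
```

Removing the item from the total before correlating is what makes the correlation "corrected": it keeps an item from inflating its own association with the scale.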
Next, Rasch item fit statistics were computed for each item of the factor. In Rasch modeling, when all items in a measure are a good fit, evidence of construct validity of the measure is provided.29,30 Findings from this analysis can reveal variation in level of question (item) difficulty. Item sets of varying difficulty allow greater ability to see variation among responding populations. In addition, Rasch item reliability statistics were computed to assess the level of confidence that items would have the same level of difficulty in another sample of participants.31 Cronbach α reliability statistics >0.8 and Rasch item reliability statistics >0.9 represent excellent internal consistency reliability for both. All Rasch analyses were computed using WINSTEPS 4.10 software and based on the Rasch partial credit model.32,33 We report Eigenvalues, replicated factor loadings, corrected item-total correlations, item difficulty estimates, descriptive statistics, and Cronbach’s α reliability for each sample.
Our third set of psychometric analyses focused on concurrent validity, as a first step in assessing the construct validity of the new measure. We examined the measure’s association with participant characteristics and with other validated patient-reported measures, specifically the What Matters Index,11,34 which has demonstrated a positive association with cost and utilization of health services, and the Patient Enablement Index,10 a measure of a patient’s ability to understand and cope with their health issues as a result of the care received. For comparative analyses, we used t tests and analysis of variance for continuous variables and χ2 tests for categorical variables.
Based on prior research, and clinical and research experience, we hypothesized that a higher PCPCM score would be positively associated with greater patient age, receiving most care from a single physician, more years of knowing the physician, a higher What Matters Index score, and a higher Patient Enablement Index score. Additionally, the clinical sample was expected to have a mean and distribution that skewed toward a higher PCPCM score than the online sample. In contrast, we hypothesized a negative association with minority status, and we anticipated that the type of device used to administer the questions would be neutral.11
RESULTS
Preliminary results of the crowd-sourced analyses15 and the Starfield Summit III28 results are publicly available. The crowdsourcing revealed strong patient and clinician congruence around the importance of the relational experience of care delivery, whereas payers emphasized transactional aspects of care. The Starfield Summit III results revealed that primary care is dynamic, adaptive, and relationship-based, with domains so interrelated that they must be measured as a whole.
Synthesizing analyses of crowd-sourced and conference data resulted in the 11 succinct items listed in Supplemental Appendix 1, available at http://www.annfammed.org/content/17/3/221/suppl/DC1. These patient-reported items represent the broad scope of primary care, with each domain represented succinctly with a single item.
Table 1 shows the characteristics of participants in the combined online and combined clinical samples. Missing data reflect nonmeasure survey questions that were not asked of participants in all samples. In general, the samples show participant characteristics typical of primary care populations.
Table 2 shows the results of the factor and psychometric analyses for the cross-validation online sample and the combined clinical sample. The cross-validation online sample was used to demonstrate replication of findings from the exploratory online sample. All analyses show a single factor with a second factor Eigenvalue <1.0. The first Eigenvalue for the cross-validation online sample was 6.9, representing 63% of the variance, and 4.7 for the combined clinical sample, accounting for 43% of the variance (data presented in Supplemental Appendix 2, http://www.annfammed.org/content/17/3/221/suppl/DC1). For both samples, all 11 patient-reported items show strong positive associations with the factor, with factor loadings and item-total correlations >0.6 in the cross-validation online sample and generally >0.5 in the combined clinical sample. Moreover, for both samples, Rasch item fit statistics ranged from 0.62 to 1.44 for the cross-validation online sample and from 0.55 to 1.49 for the combined clinical sample. All items of both samples were in the acceptable range (0.5 to 1.5) for fit. Rasch item reliability was 0.99 for the cross-validation online sample, and 0.98 for the combined clinical sample. Cronbach’s scale reliability (α) was 0.95 for the cross-validation online sample and 0.91 for the combined clinical sample.
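The reported percentages of variance are consistent with dividing the first eigenvalue by the number of items. The sketch below assumes the eigenvalues come from a correlation matrix of the 11 items, so that total variance equals the number of items; that is a reading of the reported figures, not a description of the software output, and the function name is hypothetical.

```python
def proportion_of_variance(eigenvalue: float, n_items: int) -> float:
    """Share of total variance attributed to one factor.

    Assumption: items are standardized (correlation matrix), so the
    total variance equals the number of items.
    """
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return eigenvalue / n_items


N_ITEMS = 11  # the measure has 11 single-item domains

# First eigenvalues as reported in the text for each sample.
online_pct = round(100 * proportion_of_variance(6.9, N_ITEMS))
clinical_pct = round(100 * proportion_of_variance(4.7, N_ITEMS))
```

Under this assumption, 6.9/11 and 4.7/11 round to the 63% and 43% reported for the cross-validation online sample and the combined clinical sample, respectively.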
Upon completion of statistical analysis, we named the identified single factor the Person-Centered Primary Care Measure (PCPCM). The PCPCM represents a sum of the item responses divided by the number of responses. Scores can range from a low of 1.0 to a high of 4.0, with higher scores indicating patients reporting a greater frequency of experiencing the domains of primary care addressed by the items.
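The scoring rule described above can be sketched as follows. The sketch assumes that each item response is coded as an integer from 1 to 4 (implied by the stated score range) and that unanswered items, represented here as None, are excluded from both the sum and the count; the function name and the handling of a fully blank questionnaire are illustrative choices rather than part of the published scoring instructions.

```python
from typing import Optional, Sequence


def pcpcm_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Sum of item responses divided by the number of responses given.

    Assumptions: responses are coded 1-4, and None marks a skipped item.
    Returns None when no item was answered.
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        return None
    for r in answered:
        if not 1 <= r <= 4:
            raise ValueError(f"response {r!r} is outside the 1-4 range")
    return sum(answered) / len(answered)


# Hypothetical respondent who skipped one of the 11 items.
example = [4, 3, 4, None, 2, 3, 4, 4, 3, 3, 4]
example_score = pcpcm_score(example)
```

Because the denominator is the number of answered items, the score stays on the 1.0 to 4.0 scale even when some items are skipped.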
Figures 1A and 1B show the distribution of PCPCM scores in the combined online and clinical samples, respectively. The combined online scores show the maximum spread from 1.0 to 4.0, whereas the combined clinical sample is skewed toward more positive numbers, as expected.
Figures 2A and 2B show the conventional Rasch model person-item map for the cross-validation online sample and the combined clinical sample. The person-item maps display the location and distribution of both items and patient scores on the same common logit metric. These maps show that the 11 items are distributed across a range of difficulty, as depicted on the left (item) side of the map, and show person scores spread across a wide range of responses, as shown on the right (person) side of the map. This parsimonious measure of diverse primary care mechanisms also reveals minimal item overlap in the item pool, shown in Figures 2A and 2B by the spread in item difficulty values. Item difficulty estimates were statistically different from each other for 10 of the 11 items in the cross-validation online sample and for 9 of the 11 items in the clinical sample. The fact that a different redundant pair appears in each sample suggests sample fluctuation rather than replicated redundancy (item calibrations, standard errors, and fit statistics are found in Supplemental Appendix 2).
Measures of association that assess the concurrent validity of the PCPCM are shown in Table 3. These associations are in the hypothesized direction (see Analysis Process above), except for minority status, for which the association was nonsignificant. Moreover, rank-ordered associations were observed for income, whether the survey was hard to complete, and whether respondents felt that clinician awareness of their PCPCM responses would positively inform their care. No associations were observed for region, mode of administration, or sex. Concurrent validity is further demonstrated by the strong associations of PCPCM scores with the What Matters Index and the Patient Enablement Index.
Together, these analyses provide evidence of concurrent validity for the new measure.
DISCUSSION
The Person-Centered Primary Care Measure (PCPCM) described here is unusual in combining robust internal consistency with great breadth. The resulting combination of brevity (a single item representing each of the diverse domains) and conceptual coherence (analysis of a single factor that represents the breadth of primary care) is attributed to the comprehensive preparatory work with diverse stakeholders that enabled the development of meaningful measure items. The PCPCM adds to our field by empirically demonstrating that the broad focus of primary care is conceptually coherent, as seen and reported by the key stakeholder: patients. Our analysis supports what we heard from patients, clinicians, and Starfield Summit III participants, that valued aspects of primary care are not always reached using a sum-of-parts focus on clinical processes and outcomes (eg, diagnostic tests, medication management, preventive services). The most meaningful primary care measure is the one most able to assess primary care as a whole. Of note, the measure created through this process does not focus attention on the experience of care delivery as defined by clinical processes and outcomes alone. While that experience is clearly represented within the PCPCM, the measure goes further to focus attention on care aspects that contribute to patient perceptions regarding the integrating, prioritizing, and personalizing functions of primary care, a whole assessed by the most trustworthy reporter: the patient.
The approaches used in this study point the way toward a new, pragmatic, parsimonious approach to measurement that uses a single well-grounded item for each of multiple domains representing the broad, integrative, generalist field of primary care. This new measure provides a practical approach that allows the breadth of general practice to be assessed without the untenable burden of representing each construct with multiple measures.
With its combination of breadth, internal consistency, and parsimony, the PCPCM complements other existing measures of primary care. The measure’s unique exposition of specific attributes of primary care allows evaluation of the specific mechanisms by which primary care adds value, and thus complements more global assessments of primary care, such as having a usual source of care.35–40 The PCPCM also may be useful in research and quality improvement work in which reducing respondent burden is important. In its brevity, the PCPCM complements other patient-reported measures of primary care that measure fewer domains, but with multiple items per domain,10,20,22,23,41–48 or that measure aspects of primary care for specific purposes.45,47,49,50
The PCPCM shows encouraging psychometric properties, but further reliability testing in other samples is warranted. Additional assessment of concurrent and predictive validity would add to the validity assessed in this paper. The samples used for the analyses presented here are useful for assessing the psychometric properties of the new measure, but further reliability and validity assessments in diverse clinical samples are needed to assure measure robustness.
In the meantime, the PCPCM can be used in research and quality improvement efforts to understand the mechanisms by which primary care affects outcomes for patients, health care systems, and populations. It can be administered after visits or to populations of patients. Both the total score and individual items can be used to inform clinicians, patients, and health care system efforts to focus attention, energy, time, and systematic support on what matters beyond narrow measures of disease, satisfaction, or care volume. Future research should compare PCPCM to other measures used to predict health and health care use outcomes. Pending these comparisons, the conciseness of the PCPCM has potential to reduce the current large measurement burden by replacing measures that are longer or that do not as squarely represent core domains that have been identified as important by patients, clinicians, and policy makers.
Acknowledgments
The authors thank the American Board of Family Medicine Foundation for funding significant portions of this work. Salary support for Kurt Stange was provided as a Clinical Research Professor of the American Cancer Society and a Scholar of The Institute for Integrative Health. Salary support for Rebecca Etz was provided by a VCU Faculty Excellence Award. The authors thank participants at all stages of the measure development process, including the attendees of the Starfield Summit III, with particular thanks to Drs CJ Peek, Bob Phillips, Will Miller, David Loxterkamp, and Larry Green.
Footnotes
Conflicts of interest: authors report none.
To read or post commentaries in response to this article, see it online at http://www.AnnFamMed.org/content/17/3/221.
Funding support: The Starfield Summit III was made possible through funding from the Agency for Healthcare Research and Quality (1R13GS025312-01), the American Board of Family Medicine Foundation, Family Medicine for America’s Health, the North American Primary Care Research Group, and Virginia Commonwealth University.
Supplementary materials: Available at http://www.AnnFamMed.org/content/17/3/221/suppl/DC1/.
- Received for publication October 23, 2018.
- Revision received March 18, 2019.
- Accepted for publication March 21, 2019.
- © 2019 Annals of Family Medicine, Inc.