From health research to social research: Privacy, methods, approaches
Introduction
Weiner (1999) has stressed how science is shaped by ‘entry points’ that facilitate important research. One such entry point has been the development of record linkage, “the bringing together of information from two records that are believed to relate to the same individual or family. Linkage is achieved by comparing a limited subset of the total available information, using specified ‘linkage variables,’ selected for their ability to uniquely and reliably identify an individual” (Black & Roos, 2005). Record linkage is now being used routinely by a number of Canadian and Australian research centers to maintain registries, to manage multi-file databases, to create longitudinal histories, and to work across data sets. Linkage is critical for expanding population-based research beyond its historical ‘home’ with health care information to additional topics more generally connected with well being. Such linkage has proven to be efficient, relatively inexpensive, and protective of privacy, compared with approaches that would collect new information for each study question (Black & Roos, 2005; Trutwein, Holman, & Rosman, 2006).
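The definition of record linkage quoted above can be illustrated with a minimal sketch. It assumes a simple deterministic (exact-match) comparison on a small set of linkage variables. The field names, the `link_records` helper, and the sample records are all hypothetical and are not taken from any of the centres discussed here, which may rely on probabilistic matching and encrypted identifiers instead.

```python
from collections import defaultdict

# Hypothetical linkage variables; a real project would choose its own.
LINKAGE_VARS = ("surname", "birth_date", "postal_code")


def linkage_key(record):
    """Build a comparison key from the linkage variables only."""
    return tuple(str(record[v]).strip().lower() for v in LINKAGE_VARS)


def link_records(file_a, file_b):
    """Pair records from two files whose linkage keys agree exactly.

    A pair is accepted only when the key is unique in both files, so
    ambiguous matches are left unlinked rather than guessed.
    """
    index_a = defaultdict(list)
    index_b = defaultdict(list)
    for rec in file_a:
        index_a[linkage_key(rec)].append(rec)
    for rec in file_b:
        index_b[linkage_key(rec)].append(rec)

    links = []
    for key, recs_a in index_a.items():
        recs_b = index_b.get(key, [])
        if len(recs_a) == 1 and len(recs_b) == 1:
            links.append((recs_a[0]["id"], recs_b[0]["id"]))
    return links


# Made-up records for illustration only.
health = [
    {"id": "H1", "surname": "Smith", "birth_date": "1990-04-02", "postal_code": "R3E 3P5"},
    {"id": "H2", "surname": "Lee", "birth_date": "1988-11-30", "postal_code": "R2M 1A1"},
]
education = [
    {"id": "E7", "surname": " smith", "birth_date": "1990-04-02", "postal_code": "r3e 3p5"},
    {"id": "E9", "surname": "Nguyen", "birth_date": "1991-01-15", "postal_code": "R3T 2N2"},
]

print(link_records(health, education))
```

Only the linkage variables are compared, which mirrors the idea that linkage uses "a limited subset of the total available information"; the remaining content of each record is never inspected by the matching step.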
Previous work has noted the creation of “information-rich environments” built using population-based health insurance systems, longitudinal administrative data, and record linkage techniques. At least six groups working in such environments (Oxford, Scotland, Western Australia, and three Canadian centres: Manitoba, Ontario, and British Columbia) have demonstrated high research productivity while maintaining privacy and confidentiality. Characteristics of these centres—including their use of data for both government-funded policy work and investigator-initiated research—have been described elsewhere (Goldacre, Griffith, Gill, & Mackintosh, 2002; Holman, Bass, Rouse, & Hobbs, 1999; Kendrick, Douglas, Gardner, & Hucker, 1998; Mekel & Shortt, 2005). These environments provide the capacity to analyze interventions longitudinally; to draw cohorts and construct control groups; to compare regions, areas and hospitals in defined populations; to combine information on patients and physicians; and to compile expenditures for different services (Roos, Menec, & Currie, 2004).
Understanding the potential of administrative data is facilitated by comparison with longitudinal clinical and survey studies carried out in several countries (Lawlor et al., 2005; Power & Hertzman, 1999; Silva & Stanton, 1996). Perhaps the most influential, widely used survey has been the Panel Study of Income Dynamics (PSID) based at the University of Michigan's Institute for Social Research. The PSID was the only social science study noted on “the National Science Foundation's list of its 50 most significant projects in its 50 year history” (Duncan, Hofferth, & Stafford, 2004, p. 158).
New Manitoba databases have provided opportunities for using administrative data to study health, education, labor force participation, and general well being. Despite the Manitoba Centre for Health Policy's (MCHP) many years of experience, incorporating information from the Ministry of Education, Citizenship, and Youth and from the Ministry of Family Services and Housing involved a substantial learning curve. Comparisons with the PSID highlight the issues in extending population-based administrative data to facilitate social research.
Section snippets
Panel Study of Income Dynamics (PSID)
Costly primary data collection and considerable attention to design are characteristic of the PSID and other longitudinal surveys (Table 1). Beginning in 1968, the PSID has included annual or biennial waves of data collection on about 5000 nationally representative households, the inclusion of families and a second generation of respondents, corrections for immigration, the reconstruction of residential location to aid research on neighborhood effects, and the wide variety of questions asked
Expanding capabilities
Moving from health research to social research involves considerable preparatory work. Databases must be organized to:
- (a) Link files to incorporate new data sets (for example, those on education and labor force participation) while preserving privacy and confidentiality.
- (b) Measure such outcomes as educational achievement at the population level.
- (c) Use place of residence data (for any point in time) to calculate the number of moves, number of years in relatively poor neighborhoods, upward and downward
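
The residence-based measures named in item (c) can be sketched as follows. This is only an illustration under stated assumptions: the yearly `(year, area_code, income_quintile)` layout, the `summarize_residence` helper, and the convention that quintile 1 marks a relatively poor neighborhood are hypothetical and are not described in the source.

```python
def summarize_residence(history, poor_quintile=1):
    """Summarize one person's yearly residence history.

    history: list of (year, area_code, income_quintile) tuples, in any order.
    Returns the number of moves (changes of area between consecutive
    records) and the number of years spent in a poor neighborhood.
    """
    ordered = sorted(history)  # sort by year
    moves = sum(
        1 for prev, cur in zip(ordered, ordered[1:]) if prev[1] != cur[1]
    )
    years_poor = sum(1 for _, _, q in ordered if q == poor_quintile)
    return {"moves": moves, "years_in_poor_area": years_poor}


# Made-up history: two moves, three years in a lowest-quintile area.
history = [
    (2001, "A", 1),
    (2002, "A", 1),
    (2003, "B", 3),
    (2004, "B", 3),
    (2005, "C", 1),
]

print(summarize_residence(history))
```

Counting a move only when the area code changes between consecutive years keeps the measure independent of how many records exist per address; a finer-grained registry would need a different rule.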
Beyond health research
A broad range of measures (including physical growth, cognitive and behavioral development, and school performance) are needed to better understand well being in the early life course (Hertzman & Power, 2006). Measures available for the Manitoba population for one or more years include age, grade level, school attendance, and marks for Grade 3, Grade 9, and Grade 12. Specific well-identified health conditions (such as asthma and diabetes), as well as a measure of general health status, can be
Provincial and state centers
Major national surveys such as the PSID often provide representativeness, but not enough cases for local analyses. Maintaining a population registry for a single province or state is easier than trying to do so for a larger political entity. Nonetheless, the United Kingdom has begun “to build a lifelong health-records database for the 50 million patients” in the National Health Service system (Herrera, 2005, p. 73), while Scandinavian registries have facilitated a number of studies (Bjorklund,
Acknowledgments
This research was supported by the Canadian Institutes of Health Research (Manitoba Health No. 2004/2005–27), the Canadian Institute for Advanced Research, the Manitoba Centre for Health Policy, and an award from the RBC Financial Group (Manitoba Health No. 2003/2004–32). The Manitoba Ministries of Health, of Education, Citizenship, and Youth, and of Family Services and Housing have provided the data for the Population Health Research Data Repository. The results and conclusions are those of
References (89)
The significance of life events as etiologic factors in the diseases of children. I: A survey of professional workers. Journal of Psychosomatic Research (1972)
Building an inter-disciplinary science of health inequalities: The example of lifecourse research. Social Science & Medicine (2002)
et al. Population-based linkage of health records in Western Australia: Development of the health services research linked database. Australian and New Zealand Journal of Public Health (1999)
et al. Research use of linked health data-a best practice protocol. Australian and New Zealand Journal of Public Health (2002)
et al. Place effects on health: How can we conceptualise, operationalise and measure them? Social Science & Medicine (2002)
et al. Assessing ecologic proxies for household income: A comparison of household and neighbourhood-level income measures in the study of population health status. Health and Place (1999)
et al. Adapting a clinical comorbidity index for use with ICD-9-CM administrative data: Differing perspectives. Journal of Clinical Epidemiology (1993)
et al. Socioeconomic determinants of mortality in two Canadian provinces: Multilevel modelling and neighborhood context. Social Science & Medicine (2004)
et al. Policy analysis in an information-rich environment. Social Science & Medicine (2004)
et al. A research registry: Uses, development, and accuracy. Journal of Clinical Epidemiology (1999)
Health data linkage conserves privacy in a research-rich environment. Annals of Epidemiology
Integrated longitudinal employer-employee data for the United States. American Economic Review
Effects of socioeconomic status on access to invasive cardiac procedures and on mortality after acute myocardial infarction. New England Journal of Medicine
Influences of nature and nurture on earnings variation: A report on a study of various sibling types in Sweden
Linking and combining data to develop statistics for understanding the population's health
The more the merrier? The effect of family size and birth order on children's education. Quarterly Journal of Economics
From the cradle to the labor market? The effect of birth weight on adult outcomes. Quarterly Journal of Economics
Place effects for areas defined by administrative boundaries. American Journal of Public Health
Are socioeconomic gradients for children similar to those for adults? Achievement and health of children in the United States
Is the class half-empty? Socioeconomic status and educational achievement from a population-based perspective. IRPP Choices
A reanalysis of marital stability in SIME/DIME. American Journal of Sociology
Children's health and social mobility. Future Child
The rhetoric and reality of gap closing: When the “have-nots” gain but the “haves” gain even more. American Psychologist
Socioeconomic differences in children's health: How and why do these relationships change with age? Psychological Bulletin
The PSID and me
Sibling, peer, neighbor, and schoolmate correlations as indicators of the importance of context for adolescent development. Demography
Evolution and change in family income, wealth, and health: The panel study of income dynamics, 1968–2000 and beyond
Getting context right in quantitative studies of child development
Imagined worlds
Measuring population health: A review of indicators. Annual Review of Public Health
Clinical effectiveness of influenza vaccination in Manitoba in 1982–1983 and 1985–1986. Journal of the American Medical Association
An analysis of sample attrition in panel data: The Michigan panel study of income dynamics. Journal of Human Resources
Family structure and children's educational outcomes: Blended families, stylized facts, and descriptive regressions. Demography
In-hospital deaths as fraction of all deaths within 30 days of hospital admission for surgery: Analysis of routine statistics. British Medical Journal
Delivering equitable care: Comparing preventive services in Manitoba, Canada. American Journal of Public Health
Trends in children's attainments and their determinants as family income has increased
Succeeding generations: On the effects of investments in children
Matching as an econometric evaluation estimator: Evidence from evaluating a job training program. Review of Economic Studies
Chronic pain: Why we still need a strong, healthy FDA. Technology Review
A life course approach to health and human development
Economics of education. NBER Reporter Fall