NCHS Record Linkage Activities

U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES Centers for Disease Control and Prevention National Center for Health Statistics. NCHS Record Linkage Activities. Kimberly A. Lochner Christine S. Cox NCHS Data Users Conference July 11, 2006. Overview. What is record linkage?

Presentation Transcript


  1. U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES Centers for Disease Control and Prevention National Center for Health Statistics NCHS Record Linkage Activities Kimberly A. Lochner Christine S. Cox NCHS Data Users Conference July 11, 2006

  2. Overview • What is record linkage? • Why do we do it? • How does NCHS link data? • What NCHS data has been linked? • What are the limitations? • How do you access the data?

  3. What is record linkage? NCHS Surveys Administrative records Linked Data File

  4. Why do record linkage? • Scientifically valuable & cost effective • Augments information • Re-contacting survey respondents expensive • Expands analytic potential, e.g. • Provides longitudinal component to data • Allows for the evaluation of policies and programs

  5. How do we link records? • Obtain and standardize survey data • Create user submission records • Use a matching algorithm • Score and classify potential matches • Deterministic/probabilistic matching algorithms • Determine matches/review some cases • Create a final linked file
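The "score and classify potential matches" step can be illustrated with a minimal sketch. It assumes a Fellegi-Sunter-style agreement weight, a common probabilistic approach that the slides do not confirm as the NCHS method. The field names, the m/u probabilities, and the two thresholds are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not NCHS parameters):
# m = P(field agrees | same person), u = P(field agrees | different people)
FIELD_PROBS = {
    "ssn": (0.95, 0.0001),
    "last_name": (0.90, 0.01),
    "first_name": (0.90, 0.02),
    "birth_year": (0.95, 0.02),
    "sex": (0.98, 0.50),
}


def field_weight(field, agrees):
    """Log2 likelihood-ratio weight for one identifier."""
    m, u = FIELD_PROBS[field]
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def score_pair(survey_rec, admin_rec):
    """Sum the field weights for a candidate survey/administrative record pair."""
    total = 0.0
    for field in FIELD_PROBS:
        a, b = survey_rec.get(field), admin_rec.get(field)
        if a is None or b is None:
            continue  # a missing identifier contributes no evidence either way
        total += field_weight(field, a == b)
    return total


def classify(score, upper=15.0, lower=5.0):
    """Assumed cutoffs: accept above `upper`, send the middle band to review."""
    if score >= upper:
        return "match"
    if score >= lower:
        return "review"
    return "nonmatch"
```

Pairs scoring between the two cutoffs correspond to the "review some cases" bullet; a purely deterministic linkage would instead require exact agreement on a fixed set of identifiers.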

  6. Typical ID Data Used for Record Linkage • Social Security Number (SSN) • First Name • Middle Initial • Last Name (and Birth Surname) • Month, Day and Year of Birth • Sex • State of Birth • Race • State of Residence • Marital Status

  7. Summary NCHS Data Linkage • Linked sources: Mortality (NDI), Medicare (CMS), Retirement & Disability (SSA) • NHIS 1986-2000: X • NHIS 1994-1998: X X X • LSOA II: X X X • NHANES I: X X X • NHANES II: X X • NHANES III: X X X • NNHS 1985: X X

  8. NCHS Linked Data: Mortality • National Death Index (NDI) • NHIS 1986-2000 • Mortality follow-up through 2002 • Longitudinal Study of Aging II (baseline NHIS 1994) • Mortality follow-up through 2002 • NHANES I (baseline 1971-74) • NHANES II (baseline 1976-80) • NHANES III (baseline 1988-94) • All NHANES mortality follow-up through 2000

  9. Mortality: Data Elements • Public ID • Eligibility status • Assigned vital status • Date of death • Age at death • Underlying and multiple causes of death • Sample weights • Special request variables

  10. NCHS Linked Data: Medicare • NCHS survey linked to Medicare data • NHIS 1994-1998 • Includes disability supplements in 1994 and 1995 • LSOA II (baseline 1994 NHIS) • NHANES I (baseline 1971-74) • NHANES II (baseline 1976-80) • NHANES III (baseline 1988-94)

  11. NCHS Linked Data: Medicare • Medicare entitlement and health care utilization and payment data for 1991-2000 • Denominator file • MEDPAR Inpatient hospitalization • MEDPAR Skilled nursing facility • Hospital outpatient • Home Health Care • Hospice • Carrier (physician/supplier Part B file) • Durable Medical Equipment

  12. Medicare: Data Elements • Public ID • Eligibility status • Match status • For each Medicare file • For each year 1991-2000 • Linkage age (i.e., assumed age of survey participant at time of linkage - July 2001)
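As a small illustration of the linkage age element, the sketch below computes age in completed years as of the linkage date. The slide gives only "July 2001", so the day (July 1) is an assumption, and the function name `linkage_age` is hypothetical.

```python
from datetime import date

# Assumption: the slides say only "July 2001"; July 1 is chosen for illustration.
LINKAGE_DATE = date(2001, 7, 1)


def linkage_age(birth_date, as_of=LINKAGE_DATE):
    """Age in completed years on the linkage date."""
    years = as_of.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the linkage year.
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years
```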

  13. Medicare: Data Elements • Denominator file • Entitlement status • Beneficiary demographic characteristics • Monthly enrollment status • HMO enrollment • Claims files • Diagnosis codes • Service dates • Reimbursement amount

  14. NCHS Linked Data: Retirement/Disability • NCHS surveys linked to Social Security data • NHIS 1994-1998 • LSOA II (baseline 1994 NHIS) • NHANES I (baseline 1971-74) • NHANES III (baseline 1988-94) • National Nursing Home Survey (1985)

  15. NCHS Linked Data: Retirement/Disability • Social Security data from Retirement, Survivors, and Disability Insurance (RSDI) and Supplemental Security Income (SSI) programs • Master Beneficiary Record (MBR) • 1962-2003 • Payment History Update System (PHUS) • 1984-2003 • Supplemental Security Record (SSR) • 1974-2003

  16. Social Security: Data Elements • Public ID • Eligibility status • Match status • For each SSA file • Linkage age (i.e. assumed age of survey participant at time of linkage - July 2001)

  17. Social Security: Data Elements • RSDI information • Master Beneficiary Record (MBR), 1962 - 2003 • RSDI program eligibility, benefit amount, payment status, dual entitlement • Payment History Update System (PHUS), 1984-2003 • RSDI benefit payment amounts, including withholding information for Medicare Part B premiums • Actual benefit payment in a given month (Form 1099)

  18. Social Security: Data Elements • SSI information • Supplemental Security Record (SSR), 1974 to 2003 • SSI program eligibility • For those eligible for SSI • basic demographic information (sex & race) • benefit information • actual payment amounts • sources and amounts of other income information

  19. What are the limitations? • Quantity and quality of identification data • High refusal rates for key identifiers • Incomplete or inaccurate reporting/recording of identification data in the survey interview • Legitimate changes in data over time • Inability to match records leads to potential bias in linked files

  20. Limitations: some examples • Survey respondents ineligible for matching • Cannot attempt to match their records to other data sources – Why? • Refused to provide SSN • Under the age of 18 (for mortality only) • Lack key identifying information • Ineligibles MUST BE DROPPED from all analysis
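The rule that ineligible respondents must be dropped can be sketched as a filter applied before any analysis. The slides list "Eligibility status" as a data element but do not give its coding, so the field name `eligible` and its boolean values are hypothetical.

```python
def restrict_to_eligible(records, flag="eligible"):
    """Keep only respondents who were eligible for linkage.

    `flag` is a hypothetical field name; the actual coding of the
    eligibility status variable is not given in the slides.
    """
    return [r for r in records if r.get(flag) is True]


def percent_ineligible(records, flag="eligible"):
    """Share of respondents who would be dropped, as a percentage."""
    if not records:
        return 0.0
    dropped = sum(1 for r in records if r.get(flag) is not True)
    return 100.0 * dropped / len(records)
```

Reporting the dropped share alongside results helps readers judge the potential bias noted under the limitations.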

  21. Percent of ineligible and deceased adult NHIS respondents by survey year: NHIS 1986-2000

  22. Ineligible Population and Medicare Linkage Rate by Survey (65+ years)

  23. How do you access the data? • NCHS Research Data Center (RDC) provides access to the restricted linked data files • Access Methods • On-site access – NCHS, Hyattsville, MD • Off-site access - User’s remote location • Staff assisted – on-site programming for remote researchers • www.cdc.gov/nchs/r&d/rdc.htm • Email: rdca@cdc.gov • Phone: (301) 458-4732

  24. www.cdc.gov/nchs/r&d/nchs_datalinkage/data_linkage_activities.htm

  25. User Fees: Mortality Data • Estimated RDC Fees for Linked Mortality Data Access

  26. User Fees: SSA & CMS Data • Estimated RDC Fees for Linked SSA/CMS Data Access

  27. Research Potential of Linked Medicare Data • Examine risk factors for health conditions • Compare survey reported health conditions to claims records • Examine reliability of survey data • Compare survey reported Medicare enrollment to Medicare claims records • Examine survey report of disability with program participation eligibility criteria • Examine disparities in Medicare service utilization

  28. Linked Mortality Files: Number of Deaths by Survey • NHIS 1986-2000: 121,138 total deaths • LSOA II: 3,958 • NHEFS: 6,656 • NHANES II: 4,143 • NHANES III: 3,384 • NHIS and LSOA II have mortality follow-up through 12/31/2002. • NHEFS, NHANES II and III have mortality follow-up through 12/31/2000.

  29. Limitations: some examples (cont.) • Names • Nickname conversion (e.g., Beth/Elizabeth or Bill/William) • Hispanic naming conventions (e.g., Alberto Ruis De La Rosa) • Doesn’t fit into standard data fields • Clustering of Hispanic names • Date of birth misreporting (MM/DD/YYYY) • Allow matches on separate components • MM/DD or MM/YYYY • For year, allow +/- 1
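The nickname conversion and date-of-birth tolerances on this slide can be sketched as follows. The two nickname pairs come from the slide; the lookup table, function names, and the way the component flags would feed a final match decision are assumptions, since the slides do not describe the actual rules.

```python
from datetime import date

# Only the two pairs named on the slide; a real table would be far larger.
NICKNAMES = {"beth": "elizabeth", "bill": "william"}


def standardize_first_name(name):
    """Lowercase, trim, and map a known nickname to its formal form."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def dob_components(a, b):
    """Report which date-of-birth components agree between two records."""
    return {
        "exact": a == b,
        "month_day": (a.month, a.day) == (b.month, b.day),     # MM/DD agree
        "month_year": (a.month, a.year) == (b.month, b.year),  # MM/YYYY agree
        "year_within_1": abs(a.year - b.year) <= 1,            # year +/- 1
    }
```

For example, `dob_components(date(1940, 3, 15), date(1941, 3, 15))` flags agreement on MM/DD with the year off by one, a pair that an exact date comparison would reject.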
