
SHARE Data Cleaning

SHARE Data Cleaning. Stephanie Stuck, MEA. Antwerp, February 6th/7th 2008.


Presentation Transcript


  1. SHARE Data Cleaning. Stephanie Stuck, MEA. Antwerp, February 6th/7th 2008

  2. Interviewer remarks
     • Categorize problems as much as possible
     • Write programs to correct data if possible
     • Flag cases where unsure
     • Collect information on questions that caused a lot of problems or didn’t work, for working groups, country teams, and users
     • We do not need to translate all the remarks, but summarize important information
     • See also Laura’s presentation (Frankfurt)
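The "categorize, correct if possible, flag if unsure" workflow above could be sketched roughly as follows. This is only an illustration: the category names and keywords are hypothetical and are not taken from the SHARE data or from the presentation.

```python
# Hypothetical problem categories and trigger keywords (assumptions, not SHARE's).
CATEGORIES = {
    "wrong_currency": ("currency", "euro"),
    "proxy_interview": ("proxy",),
    "question_unclear": ("did not understand", "unclear"),
}


def categorize_remark(remark):
    """Return the names of all matching categories.

    A remark that matches no category is returned as ["unsure"],
    so that it can be flagged for manual review instead of being dropped.
    """
    text = remark.lower()
    hits = [
        name
        for name, keywords in CATEGORIES.items()
        if any(keyword in text for keyword in keywords)
    ]
    return hits or ["unsure"]
```

Remarks that end up in "unsure" are the ones to read by hand; the others can be summarized per category for the working groups and country teams.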

  3. General checks
     • Corrections based on checks of frequency distributions, e.g. outliers, values out of range
     • Corrections based on consistency checks
       • within and between modules and waves
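A minimal sketch of the two checks on frequency distributions, assuming each record is a dictionary of variables. The field name `yrbirth` and the plausible range used in the usage note are illustrative assumptions, not SHARE definitions.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count) pairs, most frequent value first."""
    return Counter(values).most_common()


def out_of_range(records, field, low, high):
    """Return the records whose value for `field` lies outside [low, high].

    Missing values (None) are skipped here; they need a separate check.
    """
    return [
        r for r in records
        if r[field] is not None and not (low <= r[field] <= high)
    ]
```

For example, `out_of_range(records, "yrbirth", 1900, 1958)` would list respondents whose recorded year of birth falls outside that (assumed) range, and `frequency_table` gives a quick view of which values dominate a variable.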

  4. More concrete
     • Check year of birth between coverscreen (cv_r and cv_h) and dn module, drop-offs and vignettes respectively, and possibly with the gross sample
     • Check gender CV/DN vs. drop-off/vignettes
     • Check for consistency of dates:
     • Check information on marital status:
     • Check respondent dummies
     • Check ch module against coverscreen
     • Check relation to coverscreen respondent
     • See also Omar’s presentations (Vienna, Frankfurt)
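The cross-module comparisons listed above (year of birth, gender) all follow the same pattern, which might be sketched like this. The data layout (a dictionary from respondent id to a dictionary of variables) and the field names `yrbirth` and `gender` are assumptions made for the example.

```python
def compare_field(source_a, source_b, field):
    """Compare `field` for all ids that appear in both sources.

    Each source maps a respondent id to a dict of variables.
    Returns a sorted list of (id, value_a, value_b) for ids where both
    values are present and differ.
    """
    mismatches = []
    for rid in sorted(source_a.keys() & source_b.keys()):
        a = source_a[rid].get(field)
        b = source_b[rid].get(field)
        if a is not None and b is not None and a != b:
            mismatches.append((rid, a, b))
    return mismatches
```

Calling `compare_field(coverscreen, dn, "yrbirth")` and `compare_field(coverscreen, dropoff, "gender")` would then produce the lists of cases to correct or flag.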

  5. Financial modules
     Check financial amounts for implausible values, e.g. negative or very high amounts
     • outliers
     • zero values
     • wrong currencies
     • typing errors
     • consider frequencies of payments etc.
     • See also Dimitri’s and Mario’s presentations (Frankfurt)
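One way to screen amounts for the problems listed above is sketched below. The rule for "very high" (more than `high_factor` times the median of the positive amounts) is an assumption chosen for illustration, not a SHARE convention.

```python
from statistics import median


def flag_amounts(amounts, high_factor=20):
    """Return (index, amount, reason) for values that need a closer look.

    `high_factor` is a hypothetical threshold: an amount counts as
    "very high" if it exceeds high_factor times the median positive amount.
    """
    positive = [a for a in amounts if a is not None and a > 0]
    typical = median(positive) if positive else None
    flags = []
    for i, a in enumerate(amounts):
        if a is None:
            continue  # missing values are not judged here
        if a < 0:
            flags.append((i, a, "negative"))
        elif a == 0:
            flags.append((i, a, "zero"))
        elif typical is not None and a > high_factor * typical:
            flags.append((i, a, "very high"))
    return flags
```

A flag is only a prompt for review: a "very high" value may be a genuine outlier, an amount in the wrong currency, a typing error, or an annual figure recorded where a monthly one was expected.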

  6. Wrong sampid, cvid or respid
     MEA already checks automatically for mismatches within and between waves
     • Information from survey agencies about wrong ids, mismatches etc. ‘single cases’
     • Please check if it is already corrected in the data
     • Write programs to correct remaining problem cases
     MEA will send you information about corrections done during fieldwork by CentERdata
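A rough sketch of checking ids between waves and applying a list of single-case corrections. It assumes each wave is available as a mapping from respid to sampid and that corrections arrive as an old-id to new-id mapping; both layouts are assumptions, and this is not MEA's actual checking program.

```python
def id_mismatches(wave_a, wave_b):
    """Return the respids that are linked to different sampids in two waves.

    Each wave maps respid -> sampid.
    """
    return sorted(
        rid for rid in wave_a.keys() & wave_b.keys()
        if wave_a[rid] != wave_b[rid]
    )


def apply_corrections(records, corrections):
    """Return copies of `records` with respid replaced per `corrections`.

    `corrections` maps a wrong respid to the correct one; records whose
    respid is not listed are copied unchanged.
    """
    return [
        {**r, "respid": corrections.get(r["respid"], r["respid"])}
        for r in records
    ]
```

Running `id_mismatches` again after `apply_corrections` is a simple way to check whether a reported case is already corrected in the data.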
