SHARE Data Cleaning
Stephanie Stuck, MEA
Antwerp, February 6th/7th, 2008

Interviewer remarks
• Categorize problems as much as possible
• Write programs to correct data if possible
• Flag cases where unsure
• Collect information on questions that caused a lot of problems or didn’t work, for the working groups, country teams and users
• We do not need to translate all the remarks, but summarize important information
• See also Laura’s presentation (Frankfurt)

General checks
• Corrections based on checks of frequency distributions, e.g. outliers, values out of range
• Corrections based on consistency checks
  • within and between modules and waves
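
A minimal sketch of a frequency and range check in Python, for illustration only. The variable names (`yrbirth`, `health`), the allowed ranges and the sample records are assumptions and are not taken from the SHARE data; real checks would use the codebook ranges.

```python
from collections import Counter

# Hypothetical records; field names and valid ranges are assumptions.
records = [
    {"respid": "A1", "yrbirth": 1941, "health": 3},
    {"respid": "A2", "yrbirth": 1841, "health": 2},  # implausible year
    {"respid": "A3", "yrbirth": 1950, "health": 9},  # code outside 1..5
]

# Assumed allowed range (inclusive) for each variable.
VALID_RANGES = {"yrbirth": (1900, 1990), "health": (1, 5)}


def out_of_range(records, valid_ranges):
    """Return (respid, variable, value) for every value outside its range."""
    problems = []
    for rec in records:
        for var, (low, high) in valid_ranges.items():
            value = rec.get(var)
            if value is not None and not low <= value <= high:
                problems.append((rec["respid"], var, value))
    return problems


# Cases to inspect, and a frequency table for one variable.
flags = out_of_range(records, VALID_RANGES)
health_freq = Counter(rec["health"] for rec in records)
```

The function only lists suspicious cases; whether a value is corrected or merely flagged remains a decision for the person cleaning the data.
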

More concrete
• Check year of birth in the coverscreen (cv_r and cv_h) against the dn module, the drop-offs and the vignettes, and possibly against the gross sample
• Check gender CV/DN vs. drop-off/vignettes
• Check for consistency of dates
• Check information on marital status
• Check respondent dummies
• Check ch module against coverscreen
• Check relation to coverscreen respondent
• See also Omar’s presentations (Vienna, Frankfurt)
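
The cross-source checks above could be sketched as follows. This is an illustrative example under assumptions: the dictionaries stand in for extracts of the coverscreen and the dn module keyed by respondent id, and the field names `yrbirth` and `gender` are hypothetical.

```python
# Hypothetical extracts keyed by respondent id; all values are invented.
coverscreen = {
    "R1": {"yrbirth": 1940, "gender": 1},
    "R2": {"yrbirth": 1952, "gender": 2},
    "R3": {"yrbirth": 1938, "gender": 1},
}
dn_module = {
    "R1": {"yrbirth": 1940, "gender": 1},
    "R2": {"yrbirth": 1925, "gender": 2},  # year of birth differs
    "R3": {"yrbirth": 1938, "gender": 2},  # gender differs
}


def compare_sources(first, second, variables):
    """List (respid, variable, value_first, value_second) for each disagreement.

    Only respondents present in both sources are compared.
    """
    mismatches = []
    for respid in sorted(first.keys() & second.keys()):
        for var in variables:
            a, b = first[respid][var], second[respid][var]
            if a != b:
                mismatches.append((respid, var, a, b))
    return mismatches


mismatches = compare_sources(coverscreen, dn_module, ["yrbirth", "gender"])
```

The same function can be reused for the drop-offs, the vignettes or the gross sample by passing a different second source.
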

Financial modules
Check financial amounts for implausible values, e.g. negative or very high amounts
• outliers
• zero values
• wrong currencies
• typing errors
• consider frequencies of payments etc.
• See also Dimitri’s and Mario’s presentations (Frankfurt)
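
One possible way to screen amounts, sketched in Python. The payment data, the frequency labels and the rule "more than 10 times the median of the positive amounts" are assumptions chosen for illustration, not a SHARE convention. Amounts are first converted to a yearly basis so that payments reported at different frequencies are comparable.

```python
from statistics import median

# Assumed payment frequencies and their number of periods per year.
PERIODS_PER_YEAR = {"weekly": 52, "monthly": 12, "yearly": 1}

# Hypothetical reported payments: respid -> (amount, frequency).
payments = {
    "R1": (1000, "monthly"),
    "R2": (12500, "yearly"),
    "R3": (-200, "monthly"),
    "R4": (0, "yearly"),
    "R5": (900000, "monthly"),  # possibly wrong currency or a typing error
}


def annualize(amount, frequency):
    """Convert an amount paid at the given frequency to a yearly amount."""
    return amount * PERIODS_PER_YEAR[frequency]


def flag_amounts(annual_amounts, high_factor=10):
    """Flag negative, zero and very high yearly amounts for review."""
    positive = [a for a in annual_amounts.values() if a > 0]
    typical = median(positive) if positive else None
    flags = {}
    for respid, amount in annual_amounts.items():
        if amount < 0:
            flags[respid] = "negative"
        elif amount == 0:
            flags[respid] = "zero"
        elif typical is not None and amount > high_factor * typical:
            flags[respid] = "very high"
    return flags


annual = {rid: annualize(amt, freq) for rid, (amt, freq) in payments.items()}
flags = flag_amounts(annual)
```

A flag is a prompt to look at the case, not a correction: a zero may be a genuine answer, and a very high value may point to a currency problem rather than a typing error.
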

Wrong sampid, cvid or respid
• MEA already checks automatically for mismatches within and between waves
• Information from survey agencies about wrong ids, mismatches etc. (‘single cases’)
  • Please check if it is already corrected in the data
  • Write programs to correct remaining problem cases
• MEA will send you information about corrections done during fieldwork by CentERdata
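
A correction program for remaining problem cases might look like the sketch below. The id format, the correction table and the wave lists are invented for illustration; in practice the corrections would come from the survey agency reports. Ids that still do not match after correction are only listed for review, since they may belong to legitimately new respondents.

```python
# Hypothetical correction table: wrong respid -> correct respid.
corrections = {"NL-0012-B": "NL-0012-A"}

# Hypothetical ids: a set from the earlier wave, a list from the later wave.
wave1_ids = {"NL-0012-A", "NL-0013-A"}
wave2_ids = ["NL-0012-B", "NL-0013-A", "NL-0099-A"]


def apply_corrections(ids, corrections):
    """Replace each id found in the correction table, keep the others."""
    return [corrections.get(respid, respid) for respid in ids]


def unmatched(ids, reference):
    """Return the ids that do not appear in the reference set."""
    return [respid for respid in ids if respid not in reference]


before = unmatched(wave2_ids, wave1_ids)   # mismatches before correction
fixed_ids = apply_corrections(wave2_ids, corrections)
after = unmatched(fixed_ids, wave1_ids)    # cases still to be reviewed
```

Comparing `before` and `after` shows which reported cases the correction table resolves and which remain open.
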