
UNITED NATIONS STATISTICS DIVISION Trade Statistics Branch Distributive Trade Statistics Section

Data sources and data compilation methods. Workshop for African countries on the Implementation of International Recommendations for Distributive Trade Statistics, 27-30 May 2008, Addis Ababa, Ethiopia.


Presentation Transcript


  1. Data sources and data compilation methods. Workshop for African countries on the Implementation of International Recommendations for Distributive Trade Statistics, 27-30 May 2008, Addis Ababa, Ethiopia. UNITED NATIONS STATISTICS DIVISION Trade Statistics Branch Distributive Trade Statistics Section

  2. Outline of the presentation • Data sources for DTS – statistical surveys, administrative data sources and frames • Data compilation methods • Data collection strategy

  3. Data sources for compilation of DTS • Generation of DTS - based on data collected from numerous sources • Statistical data sources – data are collected specifically for statistical purposes • Administrative data sources - provide data created originally for purposes other than the production of statistical data

  4. Statistical data sources • Statistical surveys • Economic censuses – enumeration of all units in the population; basis for the establishment of BR; sampling frame for surveys • Sample surveys – collect responses from a few representative units scientifically selected from the population • Advantages of statistical surveys vs. administrative data sources • Planning, execution, data collection and processing procedures are under the control of the statistical office itself • Respondents have less reason to deliberately misreport the data as the NSO guarantees confidentiality • Disadvantages • Resource intensity (both financial and manpower) • Increased respondent burden • High non-response rates • Sampling errors

  5. Census of trade units (1) • Types • Part of an economy-wide census, including all economic activities • Independent census for distributive trade sector/activities only • Advantages • Tend to provide a complete enumeration of units engaged in trade activity, including units of the informal sector, at a particular point in time • Allow collection of DTS in great detail, as required at longer intervals of time • Disadvantages • Limited in terms of data content • Census planning and organization and the subsequent transformation of basic census data into DTS • Time consuming and resource intensive exercise • Costly, imposes a high burden on respondents • Response rates may be reduced, thus affecting the quality of collected information

  6. Census of trade units (2) • Recommendations • Conduct of a complete census of trade units is recommended when: • A particular country does not maintain an up-to-date business register • There is a significant user interest in detailed statistical data by geographical area • Censuses should be followed as closely as possible by periodic (annual, quarterly or monthly) sample surveys • Censuses of trade units should not be conducted if there are other ways of collecting and producing distributive trade statistics of sufficiently high quality

  7. Sample surveys of trade units (1) • Technique for obtaining data about a large population of statistical units by selecting and measuring a limited number of units (sample) from that population • Conclusions about the total population of units are made on the basis of the estimates obtained from the sample • Scientific sample designs should be applied in order to reduce the risk of a distorted view of the population • Sample survey technique is a less costly way of data collection as compared to the census • It may be used in conjunction with a cut-off point or not

  8. Sample surveys of trade units (2) • Wholesale and retail trade sample surveys • Rarely restricted to one standard form • Tend to be a combination of forms, differentiated by periodicity and major characteristics of trade units • activity, size, legal form, type of operation and the type of variables • occasionally an extra characteristic, such as the geographical location of the unit, may influence the contents of a sample survey

  9. Sample surveys of trade units (3) • Size thresholds • Size of units plays an important role in determining the target population and, where relevant, the sample population of units • Most of the sample surveys are conducted for units above a certain size threshold • Reasons for using the threshold • Desire to limit the size of the survey • Reduce the response burden on businesses • Take account of the problems of maintaining registers for smaller units • Appropriate size threshold • No international recommendation • Decision is left to the judgment of each NSO • May vary between surveys for different trade activities and periodicity • Countries are encouraged to: • Make periodic assessments of the undercoverage of the surveys due to the thresholds • Include a description of such thresholds in the country’s metadata
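
The size-threshold idea can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration only: the function names, the use of a single size measure, the `cutoff` and `take_all` values and simple random sampling of the middle group. The recommendations leave the choice of threshold to each NSO. The second function shows one possible way to assess the undercoverage caused by the threshold.

```python
import random

def select_sample(units, cutoff, take_all, n_sample, seed=0):
    """units: list of (unit_id, size) pairs, e.g. size = number of employees.

    Hypothetical design: units below `cutoff` are excluded, units at or above
    `take_all` are completely enumerated, the rest are sampled at random.
    Returns (excluded, certainty, sampled, weight_of_sampled_units).
    """
    excluded = [u for u in units if u[1] < cutoff]
    certainty = [u for u in units if u[1] >= take_all]
    middle = [u for u in units if cutoff <= u[1] < take_all]
    rng = random.Random(seed)
    n = min(n_sample, len(middle))
    sampled = rng.sample(middle, n)
    # Each sampled unit represents len(middle) / n units of the middle group.
    weight = len(middle) / n if n else 0.0
    return excluded, certainty, sampled, weight

def undercoverage_share(units, cutoff):
    """Share of the total size measure held by units below the cut-off."""
    total = sum(size for _, size in units)
    below = sum(size for _, size in units if size < cutoff)
    return below / total if total else 0.0

units = [("A", 2), ("B", 4), ("C", 10), ("D", 20), ("E", 30), ("F", 40), ("G", 500)]
excluded, certainty, sampled, weight = select_sample(units, cutoff=5, take_all=100, n_sample=2)
print(len(excluded), len(certainty), len(sampled), weight)
print(round(undercoverage_share(units, cutoff=5), 4))
```

In practice the size measure for the undercoverage check would have to come from a source that still covers the excluded units, such as a census or administrative data.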

  10. Types of DTS surveys (1) • Enterprise surveys • Sampling units comprise trade enterprises (or statistical units belonging to these enterprises) • Assume availability of a sampling frame of trade units • List-based frame – BR or census list • Area-based frame – a sample of areas is selected first, then the enterprises in it are enumerated • Recommendations • For surveys of distributive trade enterprises, the list-based enterprise surveys should be generally preferred to area-based surveys • List-based survey is more efficient from a sampling perspective in terms of sample size and maintenance of the list • Area-based sampling is inappropriate for large or medium sized enterprises that operate in several areas • Area-based enterprise survey approach to be used for collection of data from small trade enterprises operating in informal or unorganized segment of the economy

  11. Types of DTS surveys (2) • Household surveys • Households are the sampling, reporting and observation units – ensures coverage of production by household enterprises that are too small to be recorded • Disadvantages of household surveys • Sample is designed around the distribution of households, not to provide representative coverage of trade activities • Distributions of households and trade activities are different, as trade activities tend to be concentrated in commercial and market zones • Recommendations • For coverage of unincorporated household enterprises which are not recognized as legal entities separately from their owners

  12. Types of DTS surveys (3) • Mixed household-enterprise surveys • A sample of households is selected and each household is asked whether any of its members own and operate an unincorporated enterprise • The list of enterprises thus compiled is used as the basis for selecting the enterprises from which desired data are finally collected • In contrast to household surveys they collect information about enterprises per se, not about the persons in a household, including their contribution to the enterprises • Disadvantages • Inefficiency of the sample design • Difficulties of handling enterprises with production units in more than one location • Recommendations • Preferred to household surveys or area-based enterprise surveys approach for collecting data and estimating the output of small trade units that are excluded from list-based enterprise surveys

  13. Administrative data sources (ADS) (1) • Set up in response to legislation and/or regulation • Each regulation results in a register of the units • Countries should use ADS for statistical purposes with caution • Privately controlled ADS • Data obtained from private sector data suppliers • Transfer of data from them to NSOs takes the form of a contract with a payment of a fee • Recommendations • Compilers of DTS should identify and review the available ADS in their countries and use the most appropriate of them for compiling DTS

  14. Administrative data sources (2) • Advantages • Complete coverage of units and what is perceived as low non-response • Avoidance of response burden • Cheaper for NSOs to acquire data from an ADS than to conduct a survey • Suitable for covering the smallest segment of the unit population, which contributes relatively little to the estimates but makes up a substantial percentage of the number of units in the population • Smaller sampling errors than in surveys, better accuracy • Disadvantages • Discrepancy between administrative and statistical concepts • Poor integration with other data of the statistical system • Risks with respect to stability • The level of scrutiny applied to variables that are of statistical interest may not be satisfactory • Data may become available with unacceptable delay • Legal constraints with respect to access and confidentiality

  15. Business register • Business register (BR) - recommended as the most appropriate source for deriving sample frame for distributive trade surveys • Organization and conduct of any enterprise survey of distributive trade units assumes availability of an adequate sampling frame • Sampling frame - set of units subject to sampling together with the details about them that will be used for stratification, sampling and contact purposes • Statistical business register • Comprehensive list of all enterprises and other units together with their characteristics that are active in a national economy • A tool for the conduct of statistical surveys as well as a source for statistics in its own right • Operationalises the selected model of statistical units and facilitates classification of units according to the agreed conceptual standards for all surveys

  16. Statistical business register (1) • Establishment • Available administrative registers - starting point for the establishment of Statistical BR • If only one administrative register is used, the resulting Statistical BR would likely be deficient in terms of coverage and content and would not provide an adequate sampling frame for subsequent statistical surveys • Countries are encouraged to work towards improvement of the coverage and content of their Statistical BR by incorporating data from several administrative sources • Need for a single business number for all enterprises • Maintenance • Should be up-to-date and of satisfactory quality • Should be regularly maintained and updated to take note of the changes in the enterprise dynamics

  17. Statistical business register (2) • Sources for the establishment and maintenance of Statistical BR • Economic census - provides the most comprehensive list of units and links between them in a given country • Administrative data sources - VAT tax and payroll tax systems, records maintained by the governments for the administration of unemployment insurance, social security or other programmes • Feedback from enterprise surveys - provides new information on contact address changes, closure of business, change in the economic activity of the unit, etc. • Business register surveys - profiling of enterprises • Other potential sources - information from trade associations, telephone directories or special listings prepared by telephone companies, etc.

  18. Profiling of enterprises

  19. Data compilation methods • Process of data compilation • Comprises more than just aggregating the questionnaire items • Statistical offices perform a number of checks, validation and statistical procedures to bring the collected data to the level of the intended statistical output • DTS respondents - prone to commit errors while completing a statistical questionnaire • DTS data collected through statistical surveys - affected by response and non-response errors of different kinds

  20. Data validation and editing (1) • Integral part of all types of statistical surveys data processing operations • Needed to solve problems of missing, invalid or inconsistent responses • Editing • Systematic examination of collected data for the purpose of identifying and eventually modifying the inadmissible, inconsistent and highly questionable or improbable values, according to predetermined rules • Essential process for assuring quality of the collected information • Types of editing • Micro editing (input editing) - focuses on the editing of an individual record or a questionnaire • Macro editing (output editing) – checks are performed on aggregated data

  21. Data validation and editing (2) • Selective editing • Approach for prioritizing and further reducing costs of editing • Targets only those of the micro data items or records that would have a significant impact on the distributive trade surveys results • Recommended for editing of distributive trade data • Influential observations • Particular data item responses that have most significant impact upon the main estimates • Editing efforts should be focused on them
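
A minimal sketch of how selective editing might rank records by their potential impact on an estimate. The score used here (design weight times the absolute deviation of the reported value from an anticipated value, relative to the estimated total) is one common choice and is an assumption, as are the function names, the example figures and the review threshold. The presentation does not prescribe a particular score function.

```python
def selective_editing_scores(records, expected, total_estimate):
    """records: {unit_id: (design_weight, reported_value)}
    expected: {unit_id: anticipated value}, e.g. the previous period's value.

    Hypothetical score: weighted absolute deviation relative to the total.
    """
    return {
        uid: weight * abs(value - expected[uid]) / total_estimate
        for uid, (weight, value) in records.items()
    }

def units_to_review(scores, threshold):
    """Unit ids whose score exceeds the threshold, most influential first."""
    flagged = [uid for uid, score in scores.items() if score > threshold]
    return sorted(flagged, key=lambda uid: -scores[uid])

records = {"R1": (1.0, 1000.0), "R2": (10.0, 55.0), "R3": (10.0, 500.0)}
expected = {"R1": 980.0, "R2": 50.0, "R3": 50.0}
scores = selective_editing_scores(records, expected, total_estimate=2500.0)
print(units_to_review(scores, threshold=0.01))
```

Records below the threshold would pass with automatic treatment only, so that manual editing effort is concentrated on the influential observations.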

  22. Data validation and editing (3) • Edit checks for detecting errors in distributive trade data • Routine checks - test whether all questions have been answered • Validation checks - test whether answers are permissible • Rational checks - set of checks based on the statistical analysis of respondent data • Plausibility checks – used to pick up large random errors
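
Three of the edit checks listed above can be sketched in Python as follows. The field names, the non-negativity rule and the factor-of-ten comparison with the previous period are illustrative assumptions, not rules taken from the recommendations. Rational checks are left out because they depend on a statistical analysis of the respondent data.

```python
REQUIRED_FIELDS = ("turnover", "employees", "purchases")  # hypothetical questionnaire items

def edit_checks(record, previous_turnover=None, max_ratio=10.0):
    """Return a list of (check_type, detail) tuples for one questionnaire record."""
    failures = []

    # Routine check: have all questions been answered?
    missing = [f for f in REQUIRED_FIELDS if record.get(f) is None]
    if missing:
        failures.append(("routine", missing))

    # Validation check: are the answers permissible (here: non-negative)?
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is not None and value < 0:
            failures.append(("validation", field))

    # Plausibility check: a very large jump against the previous period
    # suggests a gross error such as reporting in the wrong unit.
    turnover = record.get("turnover")
    if turnover is not None and turnover > 0 and previous_turnover and previous_turnover > 0:
        ratio = turnover / previous_turnover
        if ratio > max_ratio or ratio < 1.0 / max_ratio:
            failures.append(("plausibility", "turnover"))

    return failures

clean = {"turnover": 100.0, "employees": 5, "purchases": 60.0}
suspect = {"turnover": 100000.0, "employees": None, "purchases": -1.0}
print(edit_checks(clean, previous_turnover=95.0))
print(edit_checks(suspect, previous_turnover=95.0))
```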

  23. Imputations (1) • Missing data • Encountered in most of the trade surveys • Create problems for data editing • Types of missing data • Item non-response - data for a particular data item of the questionnaire is missing • Unit non-response - selected unit has not returned the filled-in questionnaire • Techniques for dealing with missing data • Imputations • Re-weighting

  24. Imputations (2) • Replace one or more erroneous responses or non-responses in a record with plausible and internally consistent values • Process of filling gaps and eliminating inconsistencies • Means of producing a complete and consistent file containing imputed data • Used mainly for estimating missing data in case of item non-response • Substitution - used in the case of unit non-response • Data from previous available periods of that unit • Data available for that unit from administrative information

  25. Imputations (3) • Commonly used imputation methods • Subjective treatment • Mean/modal value imputation • Post stratification • Substitution • Cold deck - makes use of a fixed set of values, which covers all of the data items • Hot deck - replaces each missing value by the available value from a 'donor', i.e. a similar participant in the same survey • Nearest-neighbour imputation or distance function matching • Sequential hot deck imputation • Regression (model based) imputation
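
Two of the listed methods, mean value imputation and nearest-neighbour hot deck imputation, can be sketched as follows. The record layout, the field names and the use of employment as the matching variable are assumptions made for illustration. In practice imputation would normally be carried out within homogeneous imputation classes, and both functions assume at least one observed value is available.

```python
from statistics import mean

def mean_impute(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def nearest_neighbour_impute(records, target, aux):
    """Hot deck with distance matching: each record missing `target` takes
    the value of the donor closest to it on the auxiliary variable `aux`."""
    donors = [r for r in records if r[target] is not None]
    completed = []
    for r in records:
        if r[target] is None:
            donor = min(donors, key=lambda d: abs(d[aux] - r[aux]))
            # Flag the imputed value so that it can be identified later.
            r = {**r, target: donor[target], "imputed": True}
        completed.append(r)
    return completed

records = [
    {"id": 1, "employees": 3, "turnover": 120.0},
    {"id": 2, "employees": 40, "turnover": 2100.0},
    {"id": 3, "employees": 4, "turnover": None},
]
print(mean_impute([10.0, None, 30.0]))
print(nearest_neighbour_impute(records, target="turnover", aux="employees")[2])
```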

  26. Item non-response • Strategies for dealing with item non-response • The analysis is confined to the fully completed forms only as all forms with missing values are ignored • Not recommended because even the valid data contained in the partially complete forms are discarded • Missing data are imputed so that the data matrix is complete

  27. Unit non-response • Non-response may occur due to: • Non-existence of the unit included in the survey • Lack of appreciation of the importance of the data on the part of the respondents • Refusal to respond • Lack of knowledge of how to respond • Lack of resources • Non-availability of the desired information • Ways to minimize the non-response • Increase the awareness among respondents about the importance of surveys • Appeal to the respondents to cooperate with the statistical authorities • Reminders to the non-respondents and resorting to the enforcement measures laid down in the national legislation • Strategies for dealing with unit non-response • Re-weighting - the sample is re-weighted so as to include only the responding sample units • Various forms of imputations – similar to those used for item non-response
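
Re-weighting for unit non-response can be sketched as a weighting-class adjustment: the design weights of the responding units are inflated so that they also represent the non-respondents in the same class. The record layout and the single weighting class are assumptions for illustration. The adjustment implicitly assumes that respondents and non-respondents in the class are similar.

```python
def reweight(sample):
    """sample: list of dicts with keys 'weight', 'responded' and 'value'.

    Returns the responding units only, with weights multiplied by
    (sum of all design weights) / (sum of respondents' design weights).
    """
    total_weight = sum(u["weight"] for u in sample)
    responding_weight = sum(u["weight"] for u in sample if u["responded"])
    if responding_weight == 0:
        raise ValueError("no responding units in this weighting class")
    factor = total_weight / responding_weight
    return [{**u, "weight": u["weight"] * factor} for u in sample if u["responded"]]

def estimate_total(units):
    """Weighted total of the reported values."""
    return sum(u["weight"] * u["value"] for u in units)

sample = [
    {"weight": 5.0, "responded": True, "value": 10.0},
    {"weight": 5.0, "responded": True, "value": 20.0},
    {"weight": 5.0, "responded": True, "value": 30.0},
    {"weight": 5.0, "responded": False, "value": None},
]
adjusted = reweight(sample)
print(round(estimate_total(adjusted), 6))
```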

  28. Data collection strategy (1) • DTS surveys and/or administrative data sources should cover all units in the economy engaged in economic activities within the scope of the distributive trade sector (Section G of ISIC, Rev.4) • Units of all sizes and types including corporations and unincorporated (household) units • Data collection strategy • NSOs should develop their own data collection strategy • Ensures a complete coverage of distributive trade activity • Based on an integrated approach covering in principle all trade units across all size classes of enterprises • Commensurate with their specific statistical and organizational circumstances

  29. Data collection strategy (2) • Public incorporated enterprises • A directory of such units is available in most of the cases • To be covered on a complete enumeration basis • Private and foreign controlled incorporated enterprises • Large-scale units • To be covered on a complete enumeration basis if possible • Others • Tend to be significant in numbers but relatively homogenous • To be covered through a sample survey • Small enterprises • Sample surveys - if these are on the statistical BR or through the use of administrative data (tax returns of small enterprises) • Fully Integrated Rational Survey Technique (FIRST) - if the register of unincorporated enterprises is not available

  30. Data collection strategy (3)

  31. FIRST (1) • Survey programme that efficiently captures comprehensive statistical information from all distributive trade enterprises operating in an economy • Application • Requires two basic statistical sets of information • Census enumeration, preferably an economic census - to establish the complete statistical population of units for construction of sampling frame and sample selection • Population census – alternative in the absence of economic census • Supporting documentation on sample areas/enumeration blocks for the benchmark enumeration • Divides the units into two segments • List-frame segment – comprises a relatively small number of large units • Non-list-frame segment – includes all remaining units that can be covered only by a (geographical) area frame approach

  32. FIRST (2) • List-frame segment • Population of units tends to be very heterogeneous in its size and characteristics • Surveys are drawn from a BR or a directory of units • Non-list-frame segment • First stage - a sample of area units is selected • Second stage - a list of all establishments operating in each area unit selected in the first stage is compiled • Establishments falling in the scope of DTS are classified by kind-of-activity • Sample of units is drawn from the listed establishments • Mobile units • All identifiable establishments outside the owners’ home located in the selected area unit as well as household-based enterprises located within home - listed by a house-to-house visit
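
The two-stage selection for the non-list-frame segment can be sketched as follows. This is a simplified illustration, not the FIRST methodology itself: it assumes equal-probability selection of area units, simple random sampling of establishments within each selected area, and four-digit ISIC Rev.4 codes whose first two digits (45, 46 or 47) identify Section G. All function names and example data are hypothetical.

```python
import random

def in_scope_section_g(isic_code):
    """True if a four-digit ISIC Rev.4 class belongs to divisions 45-47 (Section G)."""
    return isic_code[:2] in {"45", "46", "47"}

def two_stage_sample(areas, n_areas, n_per_area, in_scope, seed=0):
    """areas: {area_id: [(establishment_id, isic_code), ...]} from the benchmark listing.

    Returns a list of (area_id, establishment_id, weight) tuples.
    """
    rng = random.Random(seed)
    # First stage: select a sample of area units.
    chosen_areas = rng.sample(sorted(areas), n_areas)
    selected = []
    for area_id in chosen_areas:
        # Second stage: list the establishments in the area, keep those in scope.
        listed = [e for e in areas[area_id] if in_scope(e[1])]
        m = min(n_per_area, len(listed))
        if m == 0:
            continue
        # Weight = inverse of the overall selection probability.
        weight = (len(areas) / n_areas) * (len(listed) / m)
        for est_id, _code in rng.sample(listed, m):
            selected.append((area_id, est_id, weight))
    return selected

areas = {
    "A1": [("e1", "4711"), ("e2", "4620"), ("e3", "1071")],
    "A2": [("e4", "4520"), ("e5", "4719"), ("e6", "5610")],
    "A3": [("e7", "4690"), ("e8", "4773"), ("e9", "8510")],
}
for row in two_stage_sample(areas, n_areas=2, n_per_area=1, in_scope=in_scope_section_g):
    print(row)
```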

  33. Distributive trade surveys • Annual surveys • Should provide estimates that cover all wholesale and retail trade establishments • Comprehensive surveys are not always necessary • All establishments above a given cut-off point may be completely enumerated, while the others may be sampled • All units may receive a survey form, but an abbreviated version may be used for the small establishments • Estimates for the small establishments may be made from administrative data or from other statistical inquiries such as mixed household-enterprise surveys • Infra-annual surveys (quarterly or monthly) • More restricted coverage than annual surveys • Small establishments - coverage is subject to their significance and availability of reliable administrative data source • Infrequent surveys (5-10 years) • Used for collection of data on specialised topics or in greater detail • Not appropriate for the purpose of collecting and compiling structural type DTS

  34. Reference period • Reference period for annual surveys • Data compiled in annual surveys should relate to a 12-month period • 12-month period should preferably be the calendar year • Other options • Data are more readily available on a different fiscal-year basis for some establishments • Some data items (wages and salaries) have to be collected on both a fiscal-year and calendar-year basis to facilitate building up calendar year aggregates • Data for all establishments are available on a fiscal year basis which becomes the normal accounting period • Reference period for infra-annual surveys • Corresponding calendar month/quarter - recommended as the reference period for infra-annual surveys • Other options • Some establishments work in quarterly periods of four, four and five weeks • NSO should make every effort to standardize the information provided in the monthly returns by some estimation procedures
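
One simple way to standardize returns that follow a four, four and five week pattern is to allocate each reported value to calendar months in proportion to the number of days falling in each month. The sketch below assumes activity is spread evenly over the days of each reporting period, which is a strong simplification. The presentation does not specify an estimation procedure, so the method, the function name and the figures are illustrative only.

```python
from collections import defaultdict
from datetime import date, timedelta

def allocate_to_months(reports):
    """reports: list of (start_date, end_date, value) with end_date inclusive.

    Returns {(year, month): value} with each report spread evenly over its days.
    """
    totals = defaultdict(float)
    for start, end, value in reports:
        n_days = (end - start).days + 1
        daily = value / n_days
        day = start
        while day <= end:
            totals[(day.year, day.month)] += daily
            day += timedelta(days=1)
    return dict(totals)

reports = [
    (date(2008, 1, 1), date(2008, 1, 28), 280.0),   # first four-week period
    (date(2008, 1, 29), date(2008, 2, 25), 280.0),  # second four-week period
]
print(allocate_to_months(reports))
```

A refinement could replace the even daily spread with trading-day weights where those are known.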

  35. Thank You
