
Promoting Good Statistical Practices

Promoting Good Statistical Practices. Roger Stern - SSC, Reading WMO/FAO training workshop - November 2005. Contents. Understanding the present situation : The need for (basic) training in statistics Past training in statistics Developments in statistical computing

Presentation Transcript


  1. Promoting Good Statistical Practices Roger Stern - SSC, Reading WMO/FAO training workshop - November 2005

  2. Contents • Understanding the present situation: • The need for (basic) training in statistics • Past training in statistics • Developments in statistical computing • And in statistical analyses • Possibilities for the future • Resources • statistical software (freely available in Africa) • materials to promote good statistical practices • training materials • Spatial analysis • In conclusion • These are exciting times - let’s look forwards not backwards PROMOTING GOOD STATISTICAL PRACTICE

  3. Training in statistics • It is difficult to practice good statistics • unless we have had appropriate training • For example seasonal forecasting • Uses PCA • Spatial methods mentioned in this workshop include: • Kriging, and co-kriging • PCA and clustering • When many staff find more basic concepts difficult • Percentiles and return periods – (show CAST as preview) • Standard errors, etc • So they have to accept (advanced) methods in an unquestioning way
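
The "percentiles and return periods" item above can be illustrated with a short Python sketch. It is not taken from the presentation or from CAST: the rainfall totals are invented, the function names are hypothetical, and the percentile uses simple linear interpolation, which is only one of several common conventions. The return period is taken as the reciprocal of the annual exceedance probability.

```python
def percentile(values, q):
    """Empirical q-th percentile (q in 0-100) by linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0   # fractional position in the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def return_period(exceedance_probability):
    """Average number of years between exceedances of a threshold."""
    return 1.0 / exceedance_probability


# Hypothetical annual rainfall totals (mm), for illustration only.
annual_totals = [612, 745, 530, 890, 701, 655, 480, 820, 760, 590]

p80 = percentile(annual_totals, 80)        # exceeded in about 20% of years
years_between = return_period(1 - 0.80)    # so roughly a 1-in-5-year event

print(p80, years_between)
```

With these invented data the 80th percentile and the 5-year return level are the same number seen from two directions, which is the link many users find hard to make.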

  4. Past training in statistics • Training for (non-statistician) users in the past has been problematical • consequently they fear statistics • and hence also statisticians • Similarly, insufficient soft training for statisticians • consequently they sometimes lack communication skills • and marketing skills • and are often side-lined in important development and research projects • just like Met staff perhaps???

  5. Common training problems for non-statisticians • Training is dominated by analysis • with little on data management • or on design • A recipe-book approach is used • hence e.g. overuse of irrelevant significance tests • little understanding of principles • Training emphasises hand computation • for understanding (which they don’t get!) • but not needed later • and little experience of computers for statistical work • Presentation is too mathematical • not conceptual AND often taught by someone who has little interest in the student’s main subject areas

  6. RESULT! • Users with near universal dislike of statistics • and statisticians? • strong demand for relevant in-service training in statistics • Most of these past weaknesses in training • are the same for statisticians • who can be too pedantic and inflexible in their advice • and are then feared and ignored, where possible, by potential clients • We see later how this can now easily change • for both statisticians • and for others who need to generate and use statistics

  7. Advances in statistical computing • History • 1960’s SAS and SPSS started • A long way back in computer terms • By early 1980’s • Statistics packages well established • Micro-computers appeared – too small for these packages • So lots of other statistics packages • that made the same mistakes as SAS and SPSS a generation earlier • it is easy to write statistical software, but difficult to write good software

  8. Statistics packages : THEN • In the 1990’s • Standard statistics packages dominant again • compare other types of software • With some additions e.g. Stata • All command-driven • So you had to learn the language (for SPSS, or SAS) • So people and training courses used just one package • Data transfer between packages was difficult • Training courses often confused • learning the package with learning statistics • c.f. data management – learning concepts or learning Access

  9. A big advance... Windows appeared and Excel ruled the world, for better or for worse!

  10. Statistics packages : NOW • All common packages are in Windows • Very similar interface • Like other Windows software • So very easy to learn • And to add to Excel • so you can still keep your “security blanket” • And easy to add another package • hence not so critical what package is used for statistics training • Data transfer has also become easy • Hardly need a training course • for the software • so can concentrate on training in statistics again!

  11. Advances in statistical analysis • The “estuary model” • ever-increasing unity to the methods • this makes training much easier • if we build a solid foundation • special methods are then seen as such

  12. Start in 1960’s • In the mountains there were little streams • Regression and • Analysis of variance • These were for normally distributed data • In another valley • parameter estimation was for other distributions, like Poisson and binomial • And leading to another valley • the chi square analysis for categorical data

  13. Then • In the late 1960’s • Chi-square tests joined with other ways of looking at multidimensional contingency tables • to become log-linear models • In the early 1970’s • log-linear models • joined probit analysis • into the general stream of generalized linear models • that also included ANOVA and regression • for normal and non-normal data

  14. And finally for us here • In the 1980’s • REML started • and is for data at multiple levels • By the 1990’s it had joined the mainstream • and included powerful methods for spatial modelling • So now • same modelling ideas used for a wide range of problems • Making both training and analysis • simpler and more coherent • as long as the trainers know. BUT some are still up in the mountains!

  15. So where are we now? • Statistical software has developed • and so have users’ computing skills • Statistical methods have developed • and are easier to use • And the resources to bring the two together • are now being made available • and are becoming accessible throughout • We describe some of these resources • First generally • And then look briefly at methods for spatial modelling

  16. Software includes: • SSC-Stat • add-in for Excel to encourage good use • with a tutorial guide • and guides for good tables and good graphs • for example it provides boxplots • Instat+ • first simple statistics package for ‘Excel-lers’ • supports good teaching of statistics • stepping stone to other statistics packages • tutorial guide, introductory guide • and climatic guide, now updated for Instat Version 3 • for example for data summary or training • Genstat • One of the major statistics packages (like SPSS, Systat) • For modern statistical modelling, like GLMs and REML • And good facilities for spatial modelling

  17. Genstat • Specially for agricultural applications • And now with added climatic features • Like extremes, and circular plots • Plus a climatic guide

  18. Resources for good statistical practice • Good practice guides • Mini-guides for statistical sceptics • designed originally to promote good statistical practice in DFID projects • covering design, data management, analysis and presentation • a book is now available • And so much more: • Participatory (QQA) stuff, important for Met services • Now a book is available, based on Malawi’s “starter pack” • Data management – where Met services can support other groups

  19. Training resources include • Statistical games to help teach statistics • Reading and BUCS • For example PADDY, the rice survey game • Materials for distance learning • Now CAST in general • But can now be adapted for African needs • With support from the Rockefeller Foundation

  20. Interesting ways of learning: Training software • Statistics concepts through CAST

  21. Interesting ways of learning: Statistical Games • Simulating a survey based on a real crop cutting survey in Sri Lanka

  22. And in climatology • Providing the basic statistical skills • Now through a facilitated e-learning course • Tested in 2005, and provided from 2006 • For staff in HQ and (hopefully) in outstation offices • Because decentralisation is important • Using a specially adapted version of CAST • That can be provided to African Services • You have seen this earlier • Also software (Instat) plus Genstat • Each with their special climatic guide

  23. Spatial ideas • More to spatial analysis than just maps • Remember the data – when will you map? • Daily – many “layers” • Annually (e.g. date of start of the season) • Averages – take care of different years at different stations • Example where map does not give the full answer • Southern Zambia – risky for maize • Suggest strategy – say farmers overall have 20% (1 year in 5) risk of replanting • How much seed should be stocked? • Map – very simple 20% everywhere – does it answer the question? • Need spatial correlations – why?
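
One way to see why the seed-stock question needs spatial correlations is to compare two extreme cases with the same 20% risk everywhere. The Python sketch below is an illustration only, not the analysis from the workshop: the number of areas (10), the 95% planning level, and the function names are all assumptions. In one case the areas replant independently of each other; in the other they all replant in exactly the same years.

```python
from math import comb


def binomial_quantile(n, p, level):
    """Smallest k with P(X <= k) >= level, for X ~ Binomial(n, p)."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p) ** (n - k)
        if cumulative >= level:
            return k
    return n


def all_or_nothing_quantile(n, p, level):
    """Same quantile when every area replants in exactly the same years."""
    # Either no area replants (probability 1 - p) or all n do (probability p).
    return 0 if (1 - p) >= level else n


n_areas = 10   # hypothetical number of areas, each needing one unit of seed
risk = 0.2     # the "20% everywhere" shown on the map
level = 0.95   # stock enough seed to cover 95% of years (an assumption)

independent = binomial_quantile(n_areas, risk, level)
correlated = all_or_nothing_quantile(n_areas, risk, level)
average_demand = n_areas * risk   # identical in both cases

print(average_demand, independent, correlated)
```

Both cases give the same map and the same average demand of 2 units, yet covering 95% of years takes 4 units when the areas are independent and all 10 when they move together. The map alone cannot distinguish the two.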

  24. GIS and mapping • Many problems can be mapped effectively • Then much “spatial analysis” is descriptive statistics • Selection of subsets, • Transformations to provide new layers • Logical calculations • Etc • This is non-controversial • Simple smoothing to provide contours is the same • As long as the spatial “averaging” e.g. splines, inverse distance is recognised as such • But kriging, etc is moving into inferential ideas • And statistical packages could also be used for such operations
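
The point that inverse distance is just spatial "averaging" can be made concrete with a minimal Python sketch. It is not the routine used by any particular GIS or by Genstat: the function name, the station coordinates and values, and the default power of 2 are assumptions for illustration. The estimate at a target point is a weighted mean of the station values, with weights that fall off with distance.

```python
import math


def idw(stations, target, power=2.0):
    """Inverse-distance-weighted mean of station values at a target point.

    stations: list of (x, y, value) tuples; target: (x, y).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value               # target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations: (x, y, seasonal rainfall in mm).
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]

midpoint = idw(stations, (1.0, 0.0))    # equidistant, so a plain average
at_station = idw(stations, (0.0, 0.0))  # reproduces the station value
print(midpoint, at_station)
```

The result is a descriptive summary: it carries no standard error and makes no statement about uncertainty, which is where kriging starts to differ.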

  25. Spatial statistics with statistical software • Many statistical packages, e.g. Genstat • Provide some facilities for spatial analysis • For example kriging • And REML – for the future

  26. Demonstration • Show two examples of Genstat • First is a simple contour plot • Shows the value of a log file of commands • Second is an example of kriging • Shows more facilities in fitting and plotting • Other facilities include • Co-kriging • REML for “proper” spatial modelling • Within which kriging is a special case • More “research” and case studies are needed

  27. In conclusion • The time is right: • Statistics has changed • Training methods can change • The resources are here • And in Africa: • Evidence-based decision making is (more) encouraged • Met Services are key organisations • Because climatic data are needed in so many applications Challenge: How will you proceed??

  28. Thank you
