Practical statistics for Neuroscience miniprojects




  1. Practical statistics for Neuroscience miniprojects Steven Kiddle Slides & data : http://bit.ly/1Jaor2r

  2. We are unlikely to finish all the slides. Keep them; they may be helpful for your miniproject.

  3. Lecture outline • Taught component • How to present statistics • Hypothesis testing • Normal distribution • Practical component • Plots • Statistical tests • Multiple testing corrections

  4. Taught component

  5. Why are statistics important? • Help to make science more repeatable and objective • Help you to interpret your results • Help you to assess the level of evidence you have supporting a hypothesis • A vital skill for a scientific career!

  6. How to report statistics • Always report: • Statistical software you used • Statistical tests you used • Significance level you used • Sample size • I checked these in 17 randomly chosen neuroscience project posters

  7. How not to report statistics! • I found that: • 16/17 didn’t report the statistical software used • 11/17 didn’t report the statistical tests used • 9/17 didn’t report the significance level used • 2/17 didn’t report the sample size!

  8. Commonly used analysis methods • Plotting: • Box plots • Line plots • Hypothesis testing • T-test • ANOVA • Chi-squared

  9. Hypothesis testing • Two types of hypothesis • Null hypothesis (H0) • Usually that there are no differences between groups or that two variables are unrelated • Example: (H0) Smoking and lung cancer are unrelated • Alternative hypothesis (H1) • There are differences between groups, or two variables are related • Example: (H1) Smoking and lung cancer are associated
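The smoking example can be turned into a concrete test. The sketch below is one possible illustration, not part of the original slides: it runs a Pearson chi-squared test (without continuity correction) on a 2x2 table. The counts are invented purely for demonstration, and the function name `chi_squared_2x2` is hypothetical.

```python
from math import erfc, sqrt

def chi_squared_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts.

    No continuity (Yates) correction is applied.
    Returns (statistic, p_value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated (H0).
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value

# Invented counts: rows = smoker / non-smoker, columns = cancer / no cancer.
counts = [[30, 70], [10, 90]]
chi2, p = chi_squared_2x2(counts)
alpha = 0.05
print(f"chi-squared = {chi2:.2f}, p = {p:.4f}, reject H0: {p < alpha}")
```

With these made-up counts the p-value falls below α = 0.05, so H0 (no association) would be rejected. Real data will rarely be this clear cut.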

  10. Significance levels • You reject the null hypothesis in favour of the alternative hypothesis if the probability of obtaining data at least as extreme as yours, assuming the null hypothesis is true (the ‘p-value’), is below a pre-specified significance level α • Typically α = 0.05 • You should state the significance threshold you use in your report

  11. Multiple hypothesis testing I • Suppose you have a significance threshold of α = 0.05 • Suppose that you measure 100 variables that are NOT related to a disease • You perform 100 hypothesis tests to compare your variables to disease state • H0 : Variable is not affected by disease state • H1 : Variable is affected by disease state • For how many variables do you expect to reject the null hypothesis (H0) even though it is true?

  12. Multiple hypothesis testing II • α = 0.05 means that if the null hypothesis (H0) is true, we would expect to reject it 5% of the time • So if H0 is true for all 100 tests, we would expect to reject H0 about 5 times by chance alone • That is bad: these findings will not replicate • How do we stop it? • Multiple testing corrections
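The "5 in 100" expectation can be checked by simulation. The following is a minimal sketch under stated assumptions, not the course practical: both groups are drawn from the same normal distribution with a known standard deviation of 1, so a z-test is used instead of a t-test. The group size, seed, and number of repeats are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def null_p_value(rng, n_per_group=20):
    """Two-sided z-test p-value for two groups drawn from the SAME N(0, 1)."""
    cases = [rng.gauss(0, 1) for _ in range(n_per_group)]
    controls = [rng.gauss(0, 1) for _ in range(n_per_group)]
    # The standard deviation is known to be 1 here, so a z-test is appropriate.
    z = (fmean(cases) - fmean(controls)) / sqrt(2 / n_per_group)
    return 2 * (1 - NormalDist().cdf(abs(z)))

rng = random.Random(1)          # fixed seed so the run is repeatable
alpha, n_tests, n_repeats = 0.05, 100, 200

# For each repeat, test 100 unrelated variables and count false positives.
false_positives = [
    sum(null_p_value(rng) < alpha for _ in range(n_tests))
    for _ in range(n_repeats)
]
average_false_positives = fmean(false_positives)
print(f"Average false positives per {n_tests} tests: {average_false_positives:.2f}")
```

The average should come out close to 5, even though none of the simulated variables is related to group membership.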

  13. Multiple testing corrections • Bonferroni correction • If we want α = 0.05 overall, instead use α = 0.05/n for each test, where n is the number of tests you perform • So for 100 tests, we would use α = 0.0005, and would have at most a 5% chance of any test falsely rejecting the null hypothesis • Benjamini-Hochberg correction • Popular alternative that controls the false discovery rate
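Both corrections are short enough to write out by hand. The sketch below is illustrative only: the p-values are invented, and the function names `bonferroni_reject` and `benjamini_hochberg` are hypothetical, not taken from any particular package.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is below alpha / n."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices, smallest p first
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the smallest
    # value of p * n / rank seen at its own rank or any later rank.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q_values[i] = running_min
    return q_values

# Invented p-values from 8 hypothetical tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
alpha = 0.05

uncorrected = sum(p < alpha for p in p_values)
bonferroni = sum(bonferroni_reject(p_values, alpha))
q_values = benjamini_hochberg(p_values)
bh = sum(q < alpha for q in q_values)
print(f"Significant: uncorrected={uncorrected}, Bonferroni={bonferroni}, BH={bh}")
```

On this example, five tests pass the uncorrected threshold, one survives Bonferroni, and two survive Benjamini-Hochberg, which shows how Bonferroni is the stricter of the two corrections.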

  14. Normal distribution

  15. Tests that rely on assumptions of normality • T-tests • ANOVA / linear models

  16. How to check if your data is normally distributed • Histograms • Statistical tests • Can apply to data • But better to apply to residuals of the models • For a t-test, that means looking at the groups separately • For ANOVA, that means extracting residuals from the model
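The difference between testing raw data and testing residuals can be sketched as follows. This is an illustration under assumptions, not the method used in the practical: it uses the Jarque-Bera test because it needs only the standard library, and statistical packages may report other normality tests. The two simulated groups (means 10 and 16, standard deviation 1) are invented.

```python
import random
from math import exp
from statistics import fmean

def jarque_bera(values):
    """Jarque-Bera normality test. Returns (statistic, approximate p-value)."""
    n = len(values)
    centre = fmean(values)
    m2 = fmean([(v - centre) ** 2 for v in values])
    m3 = fmean([(v - centre) ** 3 for v in values])
    m4 = fmean([(v - centre) ** 4 for v in values])
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    statistic = n / 6 * (skewness ** 2 + excess_kurtosis ** 2 / 4)
    # Large-sample approximation: chi-squared with 2 degrees of freedom,
    # whose upper tail probability is exp(-x / 2).
    return statistic, exp(-statistic / 2)

rng = random.Random(2)
group0 = [rng.gauss(10, 1) for _ in range(100)]   # invented data
group1 = [rng.gauss(16, 1) for _ in range(100)]   # invented data

# Raw data: both groups pooled together gives a two-peaked distribution.
pooled = group0 + group1
# Residuals: each value minus its own group mean (what a t-test assumes is normal).
residuals = ([v - fmean(group0) for v in group0]
             + [v - fmean(group1) for v in group1])

_, p_pooled = jarque_bera(pooled)
_, p_residuals = jarque_bera(residuals)
print(f"pooled data: p = {p_pooled:.2g}; residuals: p = {p_residuals:.2g}")
```

The pooled data fail the normality check because of the two peaks, whereas the residuals are what the t-test actually requires to be normal. The chi-squared approximation used here is rough for small samples.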

  17. What do you do if your data is not normally distributed? • If sample size is really small • Nothing you can do – use test anyway • If data is skewed • Transform data (e.g. log? square root?) • Use non-parametric tests • Mann-Whitney U instead of T-test • Spearman’s Rank Correlation • Last resort - Remove outliers? • Systematically and preferably only if you know what causes them
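As a rough illustration of a non-parametric test, the sketch below computes the Mann-Whitney U statistic directly from its definition. It is a simplified version under stated assumptions: the p-value uses the normal approximation with no tie or continuity correction, so statistical packages may give slightly different values. The data, including the single extreme value, are invented.

```python
from math import sqrt
from statistics import NormalDist

def mann_whitney_u(x, y):
    """Mann-Whitney U for sample x versus y, with a normal-approximation p-value.

    U counts the (x, y) pairs in which the x value is larger (ties count 0.5).
    No tie or continuity correction is applied.
    """
    u = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p_value

# Invented data: group_a is mostly lower than group_b but has one extreme value.
group_a = [1.1, 2.3, 2.9, 3.8, 4.0, 5.2, 6.1, 120.0]
group_b = [7.5, 8.1, 9.4, 10.2, 11.0, 12.3, 13.8, 15.1]

u, p = mann_whitney_u(group_a, group_b)
print(f"U = {u}, p = {p:.4f}")
```

Because the test depends only on the ordering of values, the extreme value of 120 counts no more than any other value that is larger than all of `group_b`.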

  18. How to present plots • Label both axes • Large enough to read • Show units • If using stars (*) for significance levels, explain what *, **, *** mean • Figure source: Lunnon et al. (2012), Journal of Alzheimer’s Disease

  19. How to present statistics I • Say what statistical software you used, e.g. • SPSS, STATA, R, MATLAB, etc • Say what the sample size is • Say what statistical test is being performed • T-test, ANOVA, chi-squared, etc • Say what significance level you are using for the study • Think: is it appropriate given my sample size and number of hypotheses being tested?

  20. How to present statistics II • Report p-value • And/or multiple testing corrected p-value • E.g. Q-values for Benjamini-Hochberg • Report coefficient (β), and ideally its standard error, for each reported statistic • This can be more informative than a p-value, especially for small datasets
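To make "coefficient and standard error" concrete, here is a minimal sketch for the simplest case, a straight-line fit of one continuous variable on another. The data and the function name `slope_with_standard_error` are invented for illustration. In practice your statistical software reports these values in its model output.

```python
from math import sqrt
from statistics import fmean

def slope_with_standard_error(x, y):
    """Least-squares slope (beta) of y on x and its standard error."""
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y))
    residual_variance = residual_ss / (n - 2)   # two fitted parameters
    return slope, sqrt(residual_variance / sxx)

# Invented measurements.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

beta, se = slope_with_standard_error(x, y)
print(f"n = {len(x)}, beta = {beta:.2f}, SE = {se:.3f}")
```

Reporting β with its standard error and the sample size tells the reader how large the effect is and how precisely it was estimated, which a p-value alone does not.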

  21. How to present statistics III • A more complete guide, tailored to SPSS and specific tests, is given at: http://statistics-help-for-students.com/

  22. Be cautious in your interpretations • Correlation does not equal causation! • Can you hypothesise a mechanism by which causation could occur?

  23. Why does correlation not equal causation? • It can look like the variables are correlated when they are not • How does this happen? • By chance, especially when multiple testing is performed but not corrected for • The variables are truly correlated but there is either: • Reverse causation • Confounding by other variables

  24. Confounding
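Confounding can be demonstrated with a small simulation. The sketch below is an illustration under invented assumptions: a confounder `z` drives both `x` and `y`, which have no direct effect on each other. Adjusting for `z` is done here by correlating the residuals left after a straight-line fit on `z`, which is one of several possible approaches.

```python
import random
from math import sqrt
from statistics import fmean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def residuals_after(confounder, values):
    """Residuals of values after a straight-line fit on the confounder."""
    mc, mv = fmean(confounder), fmean(values)
    slope = (sum((c - mc) * (v - mv) for c, v in zip(confounder, values))
             / sum((c - mc) ** 2 for c in confounder))
    return [v - (mv + slope * (c - mc)) for c, v in zip(confounder, values)]

rng = random.Random(3)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]      # confounder
x = [zi + rng.gauss(0, 1) for zi in z]       # x depends only on z
y = [zi + rng.gauss(0, 1) for zi in z]       # y depends only on z

r_raw = pearson_r(x, y)
r_adjusted = pearson_r(residuals_after(z, x), residuals_after(z, y))
print(f"correlation of x and y: {r_raw:.2f}; after adjusting for z: {r_adjusted:.2f}")
```

The raw correlation is clearly positive, yet it largely disappears once the confounder is accounted for, even though neither variable causes the other.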

  25. Statistical software • Excel • Point and click, quite limited • SPSS • Point and click, a little limited • STATA • Command line • R, MATLAB, etc • Command line, very useful, steep learning curve

  26. http://www.r-project.org/ • http://www.rstudio.com/

  27. R introduction

  28. Practical component - SPSS • Data is faked to show large differences; real data will not be so clear cut

  29. Outline • Data • Tests for normality • Plots • T-test • ANOVA • Chi-squared • Non-parametric tests

  30. Data • Create folder in ‘My Documents’ • Download data and save in your new folder: Slides & data : http://bit.ly/1Jaor2r • Unzip ‘neuroscience_example.sav_.zip’ • Double click on ‘neuroscience_example.sav’ to open SPSS

  31. Introduction to the data • 5 variables • a , b , c , d , e • 2 are binary • a , b • 3 are continuous • c , d , e

  32. Normality checks I • Need to check data is normally distributed when we want to apply • T-test • ANOVA • Linear regression

  33. Normality checks II • Let’s see if the variable ‘d’ is normally distributed

  34. Normality checks III • The test rejects the null hypothesis that the data is normally distributed • We can see that the data has two peaks

  35. Normality checks IV • When we take variable ‘a’ into account, we find that ‘d’ is normally distributed within each group of ‘a’

  36. Plots • Histograms (shown in normality check) • Show distribution of a continuous variable • Boxplots • Show the distribution of a continuous variable between groups • Line plot/scatter plot • Shows the relationship between two continuous variables
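A boxplot is a drawing of a few summary numbers per group, so it can help to compute them directly. The sketch below is illustrative only: the values are invented rather than taken from the course data, and quartile conventions differ between software packages, so the boxes drawn by SPSS may not match these numbers exactly.

```python
from statistics import quantiles

def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Invented data: a continuous variable split by a binary grouping variable.
groups = {0: [1, 2, 3, 4, 5], 1: [4, 6, 7, 8, 10]}

summaries = {g: five_number_summary(v) for g, v in groups.items()}
for g, (low, q1, median, q3, high) in summaries.items():
    print(f"group {g}: min={low}, Q1={q1}, median={median}, Q3={q3}, max={high}")
```

The box spans Q1 to Q3 with a line at the median. Comparing these numbers between groups is the same comparison you make by eye on the plot.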

  37. Generating a boxplot I

  38. Generating a boxplot II

  39. Generating a boxplot III

  40. Generating a boxplot IV • Double click on the plot to label the axes

  41. Labelling a plot I • Double click to change the label

  42. How to present plots • Label both axes • Large enough to read • Show units • If using stars (*) for significance levels, explain what *, **, *** mean • Figure source: Lunnon et al. (2012), Journal of Alzheimer’s Disease

  43. Labelling a plot II

  44. Saving a plot – TO DO

  45. Boxplot exercises • Make a few more boxplots comparing binary variables to continuous variables • Try adding labels • Try saving • Try to interpret the boxplot • Do you see differences between the groups?
