
The ERP Boot Camp


russ

Presentation Transcript


  1. The ERP Boot Camp: Plotting, Measurement, & Statistics

  2. Plotting: The Right Way • To-be-compared waveforms overlaid • Legend in figure • Time zero • Time ticks on baseline for every waveform • Electrode site • Voltage calibration aligned with waveform • Calibration size and polarity • Baseline shows 0 µV

  3. Plotting: Basic Principles • You must show the waveforms (SPR rule) • You need to show enough sites so that experts can figure out the underlying component structure • I often show just one site for a cognitive audience when the component can be isolated (N2pc or LRP) • In most cases, don’t show more than 6-8 sites (use a topo map instead) • A prestimulus baseline must be shown • Usually 200 ms (minimum of 100 ms for most experiments) • If you don’t see a baseline, the study is probably C.R.A.P. (Carelessly Reviewed Awful Publication) • Overlay the key waveforms • In most cases, show both original waveforms and difference waves

  4. Measuring ERP Amplitudes • Basic options • Peak amplitude • Or average around peak • Or local peak amplitude • Mean/area amplitude

  5. Why Mean is Better than Peak • Rule #1: “Peaks and components are not the same thing. There is nothing special about the point at which the voltage reaches a local maximum.” • Mean amplitude better characterizes a component as being extended over time • Peak amplitude encourages misleading view of components • Peak may find rising edge of adjacent component • Can be solved by local peak measure • Peak is sensitive to high-frequency noise • Can be mitigated by low-pass filter or “mean around peak” • Time of peak depends on overlapping components • The peak may be nowhere near the center of the experimental effect

  6. Why Mean is Better than Peak • Peak amplitude is biased by the noise level • More noise means greater peak amplitude • Mean amplitude is unbiased by noise level • Example • Do 1000 simulation runs at two noise levels • Take mean amplitude and peak amplitude on each run • Average of 1000 mean amplitudes will be approximately the same for high-noise and low-noise data • Average of 1000 peak amplitudes will be greater for high-noise data than for low-noise data
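
The simulation described on this slide can be sketched in a few lines. This is a minimal illustration, not the original demonstration: the half-sine "component", the Gaussian noise, the 5 µV peak, and the two noise levels are all assumptions chosen for the example.

```python
import math
import random

def simulate(noise_sd, n_runs=1000, n_points=100, seed=0):
    """Return (average mean amplitude, average peak amplitude) over n_runs."""
    rng = random.Random(seed)
    # Hypothetical noise-free component: a half-sine with a 5 µV peak.
    clean = [5.0 * math.sin(math.pi * i / (n_points - 1)) for i in range(n_points)]
    mean_total = 0.0
    peak_total = 0.0
    for _ in range(n_runs):
        noisy = [v + rng.gauss(0.0, noise_sd) for v in clean]
        mean_total += sum(noisy) / n_points   # mean amplitude over the window
        peak_total += max(noisy)              # simple (global) peak amplitude
    return mean_total / n_runs, peak_total / n_runs

low_mean, low_peak = simulate(noise_sd=0.5)
high_mean, high_peak = simulate(noise_sd=2.0)
print(f"low noise:  mean={low_mean:.2f}  peak={low_peak:.2f}")
print(f"high noise: mean={high_mean:.2f}  peak={high_peak:.2f}")
```

Under these assumptions the two average mean amplitudes should come out nearly identical, while the average peak amplitude should be clearly larger at the higher noise level.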

  7. Peak Amplitude and Noise [Figure: clean waveform vs. waveform + 60-Hz noise]

  8. Why Mean is Better than Peak • Peak at different time points for different electrodes • A real effect cannot do this • A narrower measurement window can be used for mean amplitude • Mean amplitude is linear; peak amplitude is not • Mean of peak amplitudes ≠ peak amplitude of grand average • Mean of mean amplitudes = mean amplitude of grand average • Same applies to single-trial data vs. averaged waveform
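
The linearity claim can be checked with a toy data set. The three "subjects" below are made-up waveforms with the same shape but different peak latencies; the values are illustrative only.

```python
def mean_amplitude(wave):
    """Mean voltage across the measurement window."""
    return sum(wave) / len(wave)

# Hypothetical single-subject waveforms (µV); same shape, shifted in time.
subjects = [
    [0, 2, 6, 2, 0, 0, 0],
    [0, 0, 2, 6, 2, 0, 0],
    [0, 0, 0, 2, 6, 2, 0],
]

# Grand average: point-by-point mean across subjects.
grand = [sum(col) / len(subjects) for col in zip(*subjects)]

mean_of_means = sum(mean_amplitude(s) for s in subjects) / len(subjects)
mean_of_grand = mean_amplitude(grand)

mean_of_peaks = sum(max(s) for s in subjects) / len(subjects)
peak_of_grand = max(grand)

print(mean_of_means, mean_of_grand)   # equal (up to rounding)
print(mean_of_peaks, peak_of_grand)   # not equal: latency jitter flattens the grand average
```

The same reasoning applies when the "subjects" are single trials and the "grand average" is one subject's averaged waveform.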

  9. Shortcomings of Mean Amplitude • You will still pick up overlapping components • A narrower window reduces this, but increases noise level • Different measurement windows might be appropriate for different subjects • This could be a source of measurement noise • Patients and controls might have different latencies, leading to a systematic distortion of the results • This is a case where peak might be better • How do you pick the measurement window? • Using the time course of an effect biases you to find a significant effect • Reality: People often look at the data first • Alternative 1: Select window based on prior results • Alternative 2: “Functional localizer” condition to find “ROI” • Alternative 3: Resampling/randomization approaches

  10. The Baseline (reminder) • Baseline correction is equivalent to subtracting baseline voltage from your amplitude measures • Any noise in baseline contributes to amplitude measure • Short baselines are noisy • Usual recommendation: 200 ms • Need to look at 200+ ms to evaluate overlap and preparatory activity • Baseline can be significant confound • Baselines may differ across conditions due to overlap or preparatory activity, and this activity may fade over time • A poststimulus amplitude measure may therefore vary across conditions due to differential baselines • Fading prestimulus differences can also distort scalp distributions • Distribution of prestimulus period contributes to distribution
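
The first bullet can be verified numerically. The epoch below is a made-up example (4 prestimulus samples followed by 6 poststimulus samples); the measurement window is likewise an assumption.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical epoch in µV: 4 prestimulus samples, then 6 poststimulus samples.
epoch = [1.0, 1.5, 0.5, 1.0, 2.0, 4.0, 6.0, 5.0, 3.0, 2.0]
n_baseline = 4
window = slice(5, 8)   # assumed measurement window (sample indices)

baseline = mean(epoch[:n_baseline])

# Route 1: baseline-correct the waveform, then measure.
corrected = [v - baseline for v in epoch]
amp_from_corrected = mean(corrected[window])

# Route 2: measure the raw waveform, then subtract the baseline voltage.
amp_minus_baseline = mean(epoch[window]) - baseline

print(amp_from_corrected, amp_minus_baseline)
```

Because the two routes are identical, any noise in the baseline estimate carries straight into the amplitude measure.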

  11. Measuring Midpoint Latency • Basic options • Peak latency • Or local peak latency • 50% area latency

  12. Better Example of 50% Area • Rare minus frequent

  13. Shortcomings of Peak Latency • Peak may find rising edge of adjacent component • Can be solved by local peak measure • Peak is sensitive to high-frequency noise • Can be mitigated by low-pass filter • Time of peak depends on overlapping components • Terrible for broad components with no real peak • Biased by the noise level • More noise => nearer to center of measurement window • Not linear • Difficult to relate to reaction time

  14. 50% Area Latency • Uses entire waveform in determining latency • Robust to noise • Not biased by the noise level • Works fine for broad waveforms with no real peak • Linear • Easier to relate to RT • Almost the same as median • Shortcomings • Measurement window must include entire component • Strongly influenced by overlapping components • Requires monophasic waveforms • Works best on big components and/or difference waves
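
A minimal sketch of a 50% area latency measure, assuming a monophasic, positive-going waveform inside the measurement window. The function name, the trapezoidal area, the linear interpolation within a segment, and the example waveforms are choices made for this illustration, not a description of any particular toolbox.

```python
def fractional_area_latency(times, wave, fraction=0.5):
    """Time at which the cumulative area under `wave` reaches `fraction` of the total.

    Assumes a monophasic, positive-going waveform within the window.
    """
    # Trapezoidal area of each segment between adjacent samples.
    areas = [
        0.5 * (wave[i] + wave[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(wave) - 1)
    ]
    target = fraction * sum(areas)
    running = 0.0
    for i, area in enumerate(areas):
        if area > 0 and running + area >= target:
            # Linear interpolation inside the segment (an approximation).
            within = (target - running) / area
            return times[i] + within * (times[i + 1] - times[i])
        running += area
    return times[-1]

times = [200, 250, 300, 350, 400, 450, 500]   # ms
symmetric = [0, 1, 3, 5, 3, 1, 0]             # µV, peak at 350 ms
skewed = [0, 4, 6, 3, 2, 1, 0]                # µV, peak at 300 ms, long tail

lat_symmetric = fractional_area_latency(times, symmetric)
lat_skewed = fractional_area_latency(times, skewed)
print(lat_symmetric, lat_skewed)
```

For the symmetric waveform the 50% area point coincides with the peak; for the skewed waveform it falls later than the peak because the long tail contributes area.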

  15. Relating Midpoint Latency to RT • Probability distribution of RT [Figure axes: probability of reaction time vs. time] • 7% of RTs at 300 ms • 17% of RTs at 350 ms • 25% of RTs at 400 ms

  16. Relating Midpoint Latency to RT • Peak latency is related to the mode of the RT distribution, not the mean or median [Figure axes: ERP amplitude vs. time]

  17. Relating Midpoint Latency to RT • Typical RT probability distributions across different conditions • P3 peak latency usually differs less across conditions than mean RT

  18. 50% Area Latency Example Luck & Hillyard (1990)

  19. 50% Area Latency Example Luck & Hillyard (1990)

  20. Measuring Onset Latency • Basic options for onset of component • 20% area latency • 50% peak latency • Statistical threshold • First of N consecutive p<.05 points [Figure labels: peak amplitude; 50% of peak amplitude; latency at 50% of peak amplitude]
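
The "50% peak latency" option can be sketched as follows. This is one plausible implementation under stated assumptions: a positive-going component, a backward search from the peak, and linear interpolation between samples. The function name and the example waveform are hypothetical.

```python
def fractional_peak_latency(times, wave, fraction=0.5):
    """Latency at which the waveform crosses `fraction` of its peak amplitude,
    searching backward from the peak and interpolating between samples.

    Assumes a positive-going component with its peak inside the window.
    """
    peak_index = max(range(len(wave)), key=lambda i: wave[i])
    threshold = fraction * wave[peak_index]
    i = peak_index
    # Walk backward while the preceding sample is still above threshold.
    while i > 0 and wave[i - 1] > threshold:
        i -= 1
    if i == 0:
        return times[0]   # never dropped to threshold inside the window
    lo, hi = wave[i - 1], wave[i]          # lo <= threshold < hi
    within = (threshold - lo) / (hi - lo)
    return times[i - 1] + within * (times[i] - times[i - 1])

times = [0, 50, 100, 150, 200, 250, 300]   # ms
wave = [0, 0, 1, 2, 6, 4, 1]               # µV, peak of 6 µV at 200 ms

onset = fractional_peak_latency(times, wave)
print(onset)
```

Here the threshold is 3 µV, which lies between the samples at 150 ms (2 µV) and 200 ms (6 µV), so the estimate lands between sample points.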

  21. Jackknife Approach • Miller, Patterson, & Ulrich (1998) • Hard to measure onset latency (and other nonlinear parameters) from noisy single-subject waveforms • Much easier to measure from grand average • Measure from grand average of N-1 subjects N times (once excluding each subject) • Variance will be artificially low but can be corrected • $F_{\text{corrected}} = F_{\text{uncorrected}} / (N-1)^2$ [N per condition] • Between, within, main effects, interactions • Jackknife can also be used with Pearson r • So precise that you may need to use interpolation to measure latencies between sample points
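
A minimal sketch of the leave-one-out procedure for a two-condition within-subjects comparison. With only two conditions F = t², so dividing F by (N-1)² is the same as dividing t by (N-1); the sketch uses the t form. The ramp-shaped waveforms, the onset values, and all function names are hypothetical and serve only to make the example runnable.

```python
import math

def grand_average(waves):
    """Point-by-point average across subjects."""
    return [sum(col) / len(waves) for col in zip(*waves)]

def onset_latency(wave, fraction=0.5):
    """Interpolated sample index of the `fraction`-of-peak crossing nearest the peak."""
    peak = max(range(len(wave)), key=lambda i: wave[i])
    threshold = fraction * wave[peak]
    i = peak
    while i > 0 and wave[i - 1] > threshold:
        i -= 1
    if i == 0:
        return 0.0
    lo, hi = wave[i - 1], wave[i]
    return (i - 1) + (threshold - lo) / (hi - lo)

def leave_one_out_scores(waves, measure):
    """Apply `measure` to the grand average of N-1 subjects, once per excluded subject."""
    return [measure(grand_average(waves[:i] + waves[i + 1:])) for i in range(len(waves))]

def jackknife_paired_t(waves_a, waves_b, measure):
    """Return (uncorrected t, corrected t) for condition A minus condition B."""
    n = len(waves_a)
    scores_a = leave_one_out_scores(waves_a, measure)
    scores_b = leave_one_out_scores(waves_b, measure)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = sum(diffs) / n
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t_raw = mean_diff / math.sqrt(variance / n)
    return t_raw, t_raw / (n - 1)   # variance is artificially low, so shrink t

def ramp(onset, n_points=60):
    """Hypothetical waveform: flat, then a linear rise to 1.0 over 20 samples."""
    return [min(max((i - onset) / 20.0, 0.0), 1.0) for i in range(n_points)]

onsets_a = [18, 20, 22, 24, 26]     # made-up onsets (samples), condition A
delays = [3, 5, 4, 6, 2]            # made-up per-subject delay in condition B
waves_a = [ramp(o) for o in onsets_a]
waves_b = [ramp(o + d) for o, d in zip(onsets_a, delays)]

t_uncorrected, t_corrected = jackknife_paired_t(waves_a, waves_b, onset_latency)
print(t_uncorrected, t_corrected)
```

The leave-one-out scores differ only by fractions of a sample, which is why the measure interpolates between sample points.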

  22. Jackknife Approach • 50% fractional peak latency [Figure: waveforms for Subjects 1-3, each paired with the grand average without that subject]

  23. Jackknife Approach • Conventional ANOVA on LRP onset latency • F(1, 20) = 1.315, p = 0.258 • Jackknife ANOVA on LRP onset latency • F(1, 20) = 5221.625, Fc = 13.05, p = .0017 • Limitations • Doesn’t help with linear measures • Easier to have equal Ns for between-subjects ANOVAs • Is sometimes worse than conventional approach • Testing a slightly different null hypothesis

  24. Jackknife Approach • Conventional null hypothesis • If you measure from every individual in the population, the average of these measures does not differ across conditions • Jackknife null hypothesis • If you make grand averages across every individual in the population, and measure from these grand averages, these measures do not differ across conditions • Making a grand average leads to the same problems as averaging across trials • Greater latency variability across subjects in one group will lead to lower peak amplitude in this group’s grand average • The onset time in the grand average will reflect the onset times of the subjects with the earliest onset times • Think about it, and make sure you get the same general pattern with conventional statistics

  25. Jackknife Approach [Figure: Condition A and Condition B panels, each showing Sub1-Sub4 relative to stimulus onset (Stim) and the mean of single-subject values]

  26. Jackknife Approach [Figure: Condition A and Condition B panels, each showing Sub1-Sub4 relative to stimulus onset (Stim), the mean of single-subject values, and the value from the grand average] • A difference in timing variability is misconstrued as a difference in mean onset time

  27. Statistical Analysis • Replication is the best statistic • The .05 threshold is arbitrary • What would happen if we decided the threshold should be .06? • We regularly violate the assumptions of statistical tests, so the computed p-values are not correct estimates of probability of a Type I error • The real question is whether the effects are real or noise • If they are real (and large enough), they will be replicable • General advice • Collect clean data with big effects • Run follow-up experiments that contain replications • Use a vanilla statistical approach (with jackknife approach for nonlinear measures, when appropriate) or • Find a really good statistician who can do the most appropriate statistical tests

  28. Standard Approach • First, collapse across irrelevant factors • If target and standard are counterbalanced, collapse to avoid physical stimulus differences • This reduces number of ANOVA factors • Fewer p-values • Fewer spurious interactions • Smaller experimentwise error • Do a separate ANOVA for each component • Don’t use component as a repeated-measures factor • Separate ANOVAs for amplitude and latency • You could do a gigantic MANOVA, but it would have a zillion p-values

  29. Standard Approach • Use electrodes at which component is present • Otherwise your effect may get swamped by noise at other electrodes • Interaction with electrode site has low power • Electrode site is usually two factors • Anterior-posterior • Left-middle-right • Or clusters (averages across nearby electrodes) • Usually bad to do a separate ANOVA for each site • More p-values means greater chance of Type I error • Less power means greater chance of Type II error • Overall advice: Use stats in a way that most directly tests your main hypotheses

  30. Choosing Electrode Sites • Imagine you are comparing Condition A and Condition B at 128 electrode sites, and the conditions do not actually differ (zero difference with infinite power) • If the noise is independent at each site, you would expect p < .05 for 6-7 sites (.05 x 128 = 6.4) • If noise is correlated among nearby sites, you would expect p < .05 for at least one cluster of sites • Therefore, if you choose which sites to measure by seeing which sites (or clusters) show a difference, you will have many false positives (actual p >> .05) • Solution 1: All sites in an omnibus ANOVA (low power) • Solution 2: Bonferroni correction (even lower power) • Solution 3: Use false discovery rate correction (not quite as bad) • Solution 4: Use a priori region of interest • Solution 5: Use “functional localizer” condition • Solution 6: Use resampling/randomization approaches
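
The arithmetic on this slide, plus the two correction schemes, can be sketched as follows. The slide does not say which false discovery rate procedure is meant; the Benjamini-Hochberg step-up procedure is assumed here, and the p-values are made up.

```python
n_sites = 128
alpha = 0.05

# Expected number of sites with p < .05 when there is no true effect.
expected_false_positives = alpha * n_sites

# Chance of at least one false positive if noise is independent across sites.
p_at_least_one = 1 - (1 - alpha) ** n_sites

# Bonferroni: test each site at alpha / number of tests.
bonferroni_alpha = alpha / n_sites

def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which tests are rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank          # keep the largest rank that passes
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        rejected[i] = rank <= cutoff_rank
    return rejected

example_p = [0.001, 0.6, 0.039, 0.008, 0.041]   # hypothetical per-site p-values
fdr_rejected = benjamini_hochberg(example_p)
bonferroni_rejected = [p <= alpha / len(example_p) for p in example_p]

print(expected_false_positives, round(p_at_least_one, 4), bonferroni_alpha)
print(fdr_rejected, bonferroni_rejected)
```

In this toy set the FDR procedure rejects two tests and Bonferroni also rejects two; with many sites and several moderately small p-values, FDR typically rejects more, which is the sense in which it is "not quite as bad".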

  31. Example: Fishing for N2ac • 2 simultaneous stimuli on each trial, selected from: A) Pure sine wave B) FM sweep C) White noise burst D) Click train • Duration = 750, SOA = 1500±150 • One stimulus defined as target for each trial block (e.g., FM sweep) • Task: Press one button for target-present, another for target-absent • Each stimulus equally likely to be combined with each other stimulus • Locations are randomized from trial to trial • Target is present on 25% of trials • Look at contra vs. ipsi with respect to target

  32. Example: Fishing for N2ac

  33. Example: Fishing for N2ac

  34. Example: Fishing for N2ac • Separate ANOVAs for anterior and posterior electrode clusters • Factors: Contra/Ipsi, Hemisphere, Within-Hemisphere Site, Time

  35. Example: Fishing for N2ac • Key Effects: Contra/Ipsi: Significant; Contra/Ipsi x Time: Significant; Contra/Ipsi x Electrode: ns; Contra/Ipsi x Hemisphere: ns • Key Effects: Contra/Ipsi: ns; Contra/Ipsi x Time: Significant; Contra/Ipsi x Electrode: ns; Contra/Ipsi x Hemisphere: Significant

  36. Example: Fishing for N2ac • Contra/Ipsi @ Each Time Interval: 200-300: Significant; 300-400: Significant; 400-500: Significant; 500-600: ns • Contra/Ipsi @ Each Time Interval: 200-300: ns; 300-400: ns; 400-500: Significant; 500-600: Significant

  37. Example: Fishing for N2ac • Follow-Up Experiment: • Same basic paradigm to demonstrate replicability • Slightly different stimuli to demonstrate generality • Additional anterior electrode sites to better map scalp distribution • Also included unilateral stimuli to determine whether the N2ac requires competition between simultaneous stimuli • Replicated basic anterior and posterior patterns • These effects were not present for unilateral stimuli

  38. Electrode Interactions • Amplitudes are multiplicative across electrodes • Fz amplitude might go from 1.0 µV to 1.5 µV, and Pz amplitude might go from 2 µV to 3 µV [Figure panels: Multiplicative, Additive] • This produces a condition x electrode site interaction • Even without a change in neural generators

  39. Electrode Interactions • McCarthy & Wood (1985): Normalize the Data • Divide by vector length • Now the conditions have the same overall amplitude • Main effects are eliminated; they are assessed prior to normalization
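
The vector-length normalization can be illustrated with the Fz/Pz values from the multiplicative example (1.0 and 2.0 µV in one condition, 1.5 and 3.0 µV in the other). This is a sketch of the scaling step only; the function name is hypothetical.

```python
import math

def vector_scale(amplitudes):
    """Divide each electrode's amplitude by the vector length across electrodes."""
    length = math.sqrt(sum(a * a for a in amplitudes))
    return [a / length for a in amplitudes]

# Amplitudes at [Fz, Pz] in µV; condition B is condition A multiplied by 1.5.
cond_a = [1.0, 2.0]
cond_b = [1.5, 3.0]

# Before normalization the condition effect differs across sites (0.5 vs 1.0 µV),
# which shows up as a condition x electrode interaction.
raw_effect = [b - a for a, b in zip(cond_a, cond_b)]

# After normalization both conditions have the same distribution across sites.
norm_a = vector_scale(cond_a)
norm_b = vector_scale(cond_b)
norm_effect = [b - a for a, b in zip(norm_a, norm_b)]

print(raw_effect)
print(norm_effect)   # approximately zero at both sites
```

A purely multiplicative change therefore disappears after scaling, which is the logic behind testing the interaction on normalized data.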

  40. Electrode Interactions • Technical Problem: Urbach & Kutas (2002) demonstrated that this does not actually work under many realistic conditions • Many of these problems disappear if you measure from difference waves • Conceptual problem: The conclusions that can be drawn from an electrode site interaction are extremely weak • Could be same generators, but change in relative amplitudes • Could be same generators, but a change in relative latencies • General advice: Don’t worry about electrode interactions • You can’t draw very strong conclusions from them, so just report them

  41. Heterogeneity of Covariance • Within-subjects ANOVA assumes homogeneity of variance and covariance (sphericity) • Modest heterogeneity of variance not a big problem • Heterogeneity of covariance inflates Type I error rate • What is homogeneity of covariance? • 3 or more levels of a within-subjects factor • Each level must be equally correlated with the other levels

  42. Heterogeneity of Covariance • Within-Subjects ANOVA assumes: Covariance(A, B) = Covariance(B, C) = Covariance(A, C) [Table: Subjects 1-3 (rows) by Cond A, Cond B, Cond C (columns)]

  43. Heterogeneity of Covariance • Why is this a special problem for ERPs? • Covariance is lower for more distant electrode pairs than for nearby electrode pairs • Whenever 3 or more electrodes are used, heterogeneity of covariance is likely • SPR mandates that papers deal with this problem • Greenhouse-Geisser epsilon adjustment • Degree of nonsphericity is computed • An adjustment factor, epsilon, is computed • New df computed by multiplying epsilon by original df • New df used for computing p-values • Greenhouse-Geisser epsilon is overly conservative • Can use Huynh-Feldt epsilon instead • Everyone should use epsilon adjustment for all studies, not just ERP studies
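
The epsilon computation can be sketched for a single within-subjects factor. The sketch uses the standard formula based on the double-centered covariance matrix, epsilon = trace(S*)² / ((k-1) · sum of squared entries of S*); the data (5 subjects by 3 electrode sites) are made up, and the function names are hypothetical.

```python
def covariance_matrix(data):
    """Sample covariance between levels; rows = subjects, columns = levels."""
    n, k = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]
    return [
        [sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
         for j in range(k)]
        for i in range(k)
    ]

def greenhouse_geisser_epsilon(data):
    """Epsilon from the double-centered covariance matrix (1/(k-1) <= eps <= 1)."""
    s = covariance_matrix(data)
    k = len(s)
    row_means = [sum(row) / k for row in s]
    grand_mean = sum(row_means) / k
    # Double centering: remove row and column means (S is symmetric).
    centered = [
        [s[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(v * v for row in centered for v in row)
    return trace ** 2 / ((k - 1) * sum_sq)

# Hypothetical amplitudes (µV) for 5 subjects at 3 electrode sites.
data = [
    [2.0, 2.5, 4.0],
    [3.0, 3.2, 3.1],
    [1.0, 1.4, 5.0],
    [4.0, 4.1, 2.0],
    [2.5, 2.9, 3.5],
]
n_subjects, n_levels = len(data), len(data[0])
epsilon = greenhouse_geisser_epsilon(data)

# Adjusted degrees of freedom for the electrode main effect.
df_effect = epsilon * (n_levels - 1)
df_error = epsilon * (n_levels - 1) * (n_subjects - 1)
print(epsilon, df_effect, df_error)
```

With only two levels the formula always returns 1, which matches the point that heterogeneity of covariance only arises with 3 or more levels.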
