Quantifying statistical interdependence of point processes: Application to spike data and EEG • Analyzing Brain Signals by Combinatorial Optimization • Justin Dauwels (LIDS, MIT; Amari Research Unit, Brain Science Institute, RIKEN) • December 1, 2008
Topics • Mathematical problem: Similarity of Multiple Point Processes • Motivation/Application: Early diagnosis of Alzheimer’s disease from EEG signals • Along the way: Spike synchrony • Collaborators: François Vialatte*, Theophane Weber+, and Andrzej Cichocki* (*RIKEN, +MIT) • Financial Support
Alzheimer's disease: one disease, many symptoms • Symptoms: memory, language, executive functions, apraxia, apathy, agnosia, etc. • Prevalence: 2% to 5% of people over 65 years old, up to 20% of people over 80 (Jeong 2004, Nature) • Evolution of the disease (stages) • 2 to 5 years before (EEG data): mild cognitive impairment (MCI); 6 to 25% progress to Alzheimer's • Mild (early stage): becomes less energetic or spontaneous; noticeable cognitive deficits; still independent (able to compensate) • Moderate (middle stage): mental abilities decline; personality changes; becomes dependent on caregivers • Severe (late stage): complete deterioration of the personality; loss of control over bodily functions; total dependence on caregivers • Videos: Memory (forgetting relatives), Apathy (video sources: Alzheimer society) • GOAL: Diagnosis of MCI based on EEG • EEG is relatively simple and inexpensive technology • Early diagnosis: medication more effective, more time to prepare future care of patient, etc.
Overview • Alzheimer’s Disease (AD): decrease in EEG synchrony • Similarity of Point Processes • Two 1-dim point processes • Two multi-dim point processes • Multiple multi-dim point processes • Numerical Results • Conclusion
Alzheimer's disease: inside glimpse, abnormal EEG • EEG system: inexpensive, mobile, useful for screening • Brain “slow-down”: more power in slow rhythms (0.5-8 Hz), less power in fast rhythms (8-30 Hz) (Babiloni et al., 2004; Besthorn et al., 1997; Jelic et al., 1996; Jeong, 2004; Dierks et al., 1993) • Decrease of synchrony (focus of this project) • AD vs. MCI (Hogan et al., 2003; Jiang et al., 2005) • AD vs. Control (Herrmann and Demiralp, 2005; Yagyu et al., 1997; Stam et al., 2002; Babiloni et al., 2006) • MCI vs. mild AD (Babiloni et al., 2006) • Images: www.cerebromente.org.br
Spontaneous (scalp) EEG • EEG x(t): amplitude vs. t (sec) • Fourier power |X(f)|^2 • Time-frequency |X(t,f)|^2 (wavelet transform): f (Hz) vs. t (sec) • Time-frequency patterns (“bumps”)
Sparse representation: bump model • Time-frequency map, f (Hz) vs. t (sec): 10^4 to 10^5 coefficients • Bumps (sparse representation): about 10^2 parameters • Assumptions: • time-frequency map is suitable representation • oscillatory bursts (“bumps”) convey key information • F. Vialatte et al. “A machine learning approach to the analysis of time-frequency maps and its application to neural dynamics”, Neural Networks (2007).
Similarity of bump models • How “similar” are n ≥ 2 bump models? • = Similarity of multiple multi-dimensional point processes • each bump is a “point” / “event”
Two one-dimensional point processes • [Figure: point processes x and x’ on a time axis t] • How synchronous/similar? • Classical methods for continuous time series (e.g., cross-correlation) fail
Two aspects of synchrony • Analogy: waiting for a train • Train may not arrive (e.g., mechanical problem) • = Event reliability • Train may or may not be on time • = Timing precision
Two 1-dim point processes • Review of Spike Synchrony Measures • Surrogate Spike Data • Spike Trains from Morris-Lecar Neuron • Conclusion
Spike Synchrony Measures • Van Rossum distance (mixed) • Schreiber et al. similarity measure (mixed) • Hunter-Milton similarity measure (mixed) • Victor-Purpura distance metric (event reliability) • Event synchronization (mixed) • Stochastic event synchrony (timing precision and event reliability)
Van Rossum distance measure • Spikes convolved with exponential or Gaussian function (time constant τR) • → spike trains converted into time series s(t) and s’(t) • Squared distance DR between s(t) and s’(t) • If x = x’, we have DR = 0 • van Rossum M.C.W., 2001. A novel spike distance. Neural Computation 13, 751–63.
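A minimal sketch of the van Rossum distance, assuming the causal exponential kernel and the 1/τR normalization of the squared difference. Under those assumptions the integral has a closed form, so no time grid is needed. The function name `van_rossum_distance` and the argument `tau` (standing for τR) are illustrative, not taken from the slides.

```python
import math


def van_rossum_distance(x, y, tau):
    """Squared van Rossum distance between two lists of spike times.

    Assumes each spike is convolved with exp(-t/tau) for t >= 0 and that
    the squared difference of the filtered trains is integrated and
    divided by tau. The integral of the product of two such kernels
    centered at a and b equals (tau / 2) * exp(-|a - b| / tau).
    """
    def cross(a, b):
        # Sum of pairwise kernel overlaps, without the tau/2 factor.
        return sum(math.exp(-abs(s - t) / tau) for s in a for t in b)

    return 0.5 * (cross(x, x) + cross(y, y) - 2.0 * cross(x, y))
```

With this normalization, adding a single isolated spike to one train changes the squared distance by 1/2, and two identical trains give 0.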
Schreiber et al. similarity measure • Spikes convolved with exponential or Gaussian function • → spike trains converted into time series s(t) and s’(t) • Correlation between s(t) and s’(t) • If x = x’, we have SS = 1 • Time constant τS Schreiber S., Fellous J.M., Whitmer J.H., Tiesinga P.H.E., and Sejnowski T.J., 2003. A new correlation-based measure of spike timing reliability. Neurocomputing 52, 925–931.
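The correlation step can be sketched as follows, assuming a Gaussian kernel of width `sigma` (playing the role of τS), the normalized inner product of the two filtered trains as the correlation, and integration over the whole time axis so that boundary effects are ignored. The function name is hypothetical.

```python
import math


def schreiber_similarity(x, y, sigma):
    """Normalized inner product of two Gaussian-filtered spike trains.

    x and y are non-empty lists of spike times. The overlap of two
    Gaussians of width sigma centered at a and b is proportional to
    exp(-(a - b)**2 / (4 * sigma**2)); the common factor cancels in
    the ratio below.
    """
    def inner(a, b):
        return sum(math.exp(-(s - t) ** 2 / (4.0 * sigma ** 2))
                   for s in a for t in b)

    return inner(x, y) / math.sqrt(inner(x, x) * inner(y, y))
```

Identical trains give a similarity of 1, and trains whose spikes are far apart relative to `sigma` give a value close to 0.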
Victor-Purpura distance measure • Minimal cost DV of transforming x into x’ • Basic operations • event insertion/deletion: cost = 1 • event movement: cost proportional to distance (constant CV) • If x = x’, we have DV = 0 • Time constant τV = 1/CV • [Figure: x and x’ with one DELETION and one INSERTION marked] • Victor J. D. and Purpura K. P., 1997. Metric-space analysis of spike trains: theory, algorithms, and application. Network: Comput. Neural Systems 8(17), 127–164.
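The minimal transformation cost can be computed with an edit-distance style dynamic program. The sketch below assumes both spike trains are sorted in time; `cost_per_time` stands for the constant CV, and the function name is illustrative.

```python
def victor_purpura_distance(x, y, cost_per_time):
    """Minimal cost of turning sorted spike train x into sorted train y.

    Allowed operations: insert or delete a spike (cost 1 each) and move
    a spike by dt (cost cost_per_time * |dt|).
    """
    n, m = len(x), len(y)
    # d[i][j] = minimal cost of transforming x[:i] into y[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)  # delete the first i spikes of x
    for j in range(1, m + 1):
        d[0][j] = float(j)  # insert the first j spikes of y
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            move = d[i - 1][j - 1] + cost_per_time * abs(x[i - 1] - y[j - 1])
            d[i][j] = min(d[i - 1][j] + 1.0, d[i][j - 1] + 1.0, move)
    return d[n][m]
```

Moving a spike is cheaper than deleting and re-inserting it only when the shift is smaller than 2/CV, which is why τV = 1/CV acts as the time scale of the measure.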
Stochastic Event Synchrony • x and x’ synchronous if identical apart from • delay • little timing jitter • few deletions/insertions • based on generative statistical model (hidden point process v, observed as x and x’) • Dauwels J., Vialatte F., Rutkowski T., and Cichocki A., 2007. Measuring neural synchrony by message passing, NIPS 20, in press.
Stochastic Event Synchrony • [Figure: hidden process v on [0, T0]; x is offset by -δt/2 and x’ by δt/2; unmatched events in x and x’ are marked non-coincident] • Stochastic event synchrony (SES): delay δt, jitter st, non-coincidence ρ • Dauwels J., Vialatte F., Rutkowski T., and Cichocki A., 2007. Measuring neural synchrony by message passing, NIPS 20, in press.
Stochastic Event Synchrony: generative model • Hidden process v: geometric prior for length, events i.u.d. in [0, T0] • x: Gaussian offsets with mean -δt/2 and variance st/2, i.i.d. deletions with prob pd • x’: Gaussian offsets with mean δt/2 and variance st/2, i.i.d. deletions with prob pd • Events without a counterpart in the other process are non-coincident • Marginalizing over v: • Dauwels J., Vialatte F., Rutkowski T., and Cichocki A., 2007. Measuring neural synchrony by message passing, NIPS 20, in press.
Probabilistic inference • PROBLEM: Given 2 point processes x and x’, compute ρ and θ = (δt, st) • APPROACH: $(j^*, j'^*, \theta^*) = \arg\max_{j, j', \theta} \log p(x, x', j, j', \theta)$ • SOLUTION: Coordinate descent • DYNAMIC PROGRAMMING: $(j^{(i+1)}, j'^{(i+1)}) = \arg\max_{j, j'} \log p(x, x', j, j', \theta^{(i)})$ • PARAMETER ESTIMATION: $\theta^{(i+1)} = \arg\max_{\theta} \log p(x, x', j^{(i+1)}, j'^{(i+1)}, \theta)$ • [Figure: alignment grid of x1, …, x6 against x’1, …, x’6; each step marks xk as non-coincident, x’k’ as non-coincident, or (xk, x’k’) as a coincident pair]
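The alternation between dynamic programming and parameter estimation can be sketched as below. This is a simplified illustration, not the authors' exact objective: a coincident pair is scored by the negative log of a Gaussian in the offset x’k’ - xk with mean δt and variance st, and every non-coincident event pays a constant penalty -log(beta). The knob `beta`, the variance floor `s_min`, the initial values, and the definition of ρ as the fraction of unmatched events are all assumptions made for this sketch.

```python
import math


def align(x, y, delta, s, beta):
    """Dynamic programming: minimum-cost monotone matching of x and y."""
    n, m = len(x), len(y)
    skip = -math.log(beta)  # cost of one non-coincident event
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], back[i][0] = i * skip, "x"
    for j in range(1, m + 1):
        cost[0][j], back[0][j] = j * skip, "y"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = y[j - 1] - x[i - 1] - delta
            pair = d * d / (2.0 * s) + 0.5 * math.log(2.0 * math.pi * s)
            cost[i][j], back[i][j] = min(
                (cost[i - 1][j - 1] + pair, "m"),  # coincident pair
                (cost[i - 1][j] + skip, "x"),      # x event unmatched
                (cost[i][j - 1] + skip, "y"),      # y event unmatched
            )
    pairs, i, j = [], n, m
    while i > 0 or j > 0:  # trace back the optimal path
        step = back[i][j]
        if step == "m":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == "x":
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


def ses(x, y, beta=0.01, delta=0.0, s=1e-3, s_min=1e-6, n_iter=20):
    """Alternate matching and parameter updates; returns (rho, delta, s)."""
    pairs = []
    for _ in range(n_iter):
        pairs = align(x, y, delta, s, beta)
        if not pairs:
            break
        offsets = [y[j] - x[i] for i, j in pairs]
        delta = sum(offsets) / len(offsets)
        s = max(sum((o - delta) ** 2 for o in offsets) / len(offsets), s_min)
    rho = 1.0 - 2.0 * len(pairs) / (len(x) + len(y))
    return rho, delta, s
```

Both inputs are expected to be sorted lists of event times in the same units as `delta`, with `s` in squared units. The matching step is an edit-distance style recursion, so one iteration costs on the order of len(x) * len(y) operations.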
Surrogate Data • pd = 0, 0.1, …, 0.4 (deletion probability) • δt = 0, 25, and 50 ms (delay) • σt = 10, 30, and 50 ms (timing jitter) • length of hidden sequence = 40/(1-pd) • expected length of x and x’ = 40 • E{S} computed over 10,000 pairs
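One way to draw such a surrogate pair, following the generative model of the earlier slides, is sketched below. The observation window `t0` (in ms) is not given in the slides, so its default is a placeholder; the function and argument names are hypothetical, and st is taken to equal σt squared.

```python
import math
import random


def surrogate_pair(p_d, delta_t, sigma_t, n_target=40, t0=10000.0, rng=None):
    """Draw one pair (x, x') of surrogate spike trains (times in ms).

    A hidden sequence of n_target / (1 - p_d) events is placed uniformly
    in [0, t0]. Each observed train shifts every hidden event by a
    Gaussian offset with mean -delta_t/2 (for x) or +delta_t/2 (for x')
    and variance sigma_t**2 / 2, and deletes it with probability p_d.
    """
    rng = rng or random.Random()
    n_hidden = round(n_target / (1.0 - p_d))
    hidden = [rng.uniform(0.0, t0) for _ in range(n_hidden)]
    std = sigma_t / math.sqrt(2.0)

    def observe(sign):
        # Keep an event with probability 1 - p_d, then jitter it.
        return sorted(v + rng.gauss(sign * delta_t / 2.0, std)
                      for v in hidden if rng.random() >= p_d)

    return observe(-1.0), observe(+1.0)
```

Splitting the variance evenly between the two observations makes the offset between matched events Gaussian with mean δt and variance σt squared, and each train keeps 40 events on average.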
Surrogate Data: Results (δt = 0) • Plots: Van Rossum measure DR (similar for SS, SH, SQ) and Victor-Purpura measure DV • E{DR} increases with pd and σt • → DR cannot distinguish timing dispersion from event reliability • (likewise all measures except SES and DV) • E{DV} increases with pd, practically independent of σt • → DV measure for event reliability • Curves shown ONLY for δt = 0 ms; measures strongly depend on lag
Surrogate Data: Results for SES • E{σt} increases with σt, practically independent of pd • → σt measure for timing dispersion • E{ρ} increases with pd, practically independent of σt • → ρ measure for event reliability • Curves for δt = 0, 25, and 50 ms practically coincident
Morris-Lecar Neurons • Simple neuron model • Exhibits behavior of Type I & II neurons (saddle-node/Hopf bifurcation) • Input current: baseline + sinusoid + Gaussian noise • [Figure: membrane potential and spiking threshold for Type II and Type I neurons, 5 trials]
Morris-Lecar Neurons (2) • [Figure: spike rasters for Type II and Type I neurons, 50 trials] • Type II: low reliability, small timing dispersion; jitter st = (3 ms)^2, non-coincidence ρ = 27% • Type I: high reliability, large timing dispersion; jitter st = (15 ms)^2, non-coincidence ρ = 3%
Morris-Lecar Neurons: Results • Small τ: Type II has larger similarity than Type I (dispersion in Type I) • Large τ: Type I has larger similarity than Type II (drop-outs in Type II) • Observation: • Similarity depends on time constant τ → similarity FUNCTION S(τ) • SES AUTOMATICALLY selects st
Conclusion • Similarity of pairs of spike trains: timing precision and reliability • Comparison of various spike synchrony measures • Most measures not able to separate the two aspects of synchrony • Exception: Victor-Purpura and Stochastic Event Synchrony • Victor-Purpura: event reliability • SES: both timing precision and event reliability • Most measures depend on time constant, to be chosen by user • Exception: Event Synchronization and SES • Most measures sensitive to lags between the two spike trains • Exception: SES • Future work: application to neurophysiological recordings
... by matching bumps • Bumps in one model, but NOT in other • → fraction of “non-coincident” bumps ρ • Bumps in both models, but with offset • → Average time offset δt (delay) • → Timing jitter with variance st • → Average frequency offset δf • → Frequency jitter with variance sf • Stochastic Event Synchrony (SES) = (ρ, δt, st, δf, sf) • PROBLEM: Given two bump models, compute (ρ, δt, st, δf, sf)
Generative model • Generate bump model (hidden) y_hidden • geometric prior for number of bumps • bumps are uniformly distributed in rectangle • amplitude, width (in t and f) all i.i.d. • Generate two “noisy” observations y and y’ • offset between hidden and observed bump • = Gaussian random vector with • mean (-δt/2, -δf/2) for y and (δt/2, δf/2) for y’ • covariance diag(st/2, sf/2) • amplitude, width (in t and f) all i.i.d. • “deletion” with probability pd • Dauwels J., Vialatte F., Rutkowski T., and Cichocki A., 2007. Measuring neural synchrony by message passing, NIPS 20, in press.
Summary • PROBLEM: Given two bump models, compute ρ and θ = (δt, st, δf, sf) • APPROACH: $(c^*, \theta^*) = \arg\max_{c, \theta} \log p(y, y', c, \theta)$ • SOLUTION: Coordinate descent • MATCHING → max-product: $c^{(i+1)} = \arg\max_{c} \log p(y, y', c, \theta^{(i)})$ • ESTIMATION → closed-form: $\theta^{(i+1)} = \arg\max_{\theta} \log p(y, y', c^{(i+1)}, \theta)$ • Dauwels J., Vialatte F., Rutkowski T., and Cichocki A., 2007. Measuring neural synchrony by message passing, NIPS 20, in press.
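For a fixed matching, the estimation step reduces to Gaussian sample statistics of the offsets between matched bumps. The sketch below assumes plain maximum-likelihood estimates computed from raw bump centers (t, f), with no prior and no normalization by bump width; the authors' estimator may differ in those details. The function name and the input layout are illustrative.

```python
def estimate_offsets(pairs):
    """Closed-form update of (delta_t, s_t, delta_f, s_f).

    pairs is a non-empty list of matched bump centers ((t, f), (t2, f2)),
    the first taken from y and the second from y'.
    """
    n = len(pairs)
    dt = [b[0] - a[0] for a, b in pairs]  # time offsets
    df = [b[1] - a[1] for a, b in pairs]  # frequency offsets
    delta_t = sum(dt) / n
    delta_f = sum(df) / n
    s_t = sum((d - delta_t) ** 2 for d in dt) / n
    s_f = sum((d - delta_f) ** 2 for d in df) / n
    return delta_t, s_t, delta_f, s_f
```

Alternating this update with the matching step yields the coordinate scheme of the slide: the matching is recomputed under the new parameters until neither changes.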
Average synchrony • 1. Group electrodes in regions • 2. Bump model for each region • 3. SES for each pair of models • 4. Average the SES parameters
Beyond pairwise interactions • Multi-variate similarity • Pairwise similarity
Similarity of multiple bump models • [Figure: bump models y1, …, y5 and their clusters] • Models similar if • few deletions/large clusters • little jitter • Constraint: in each cluster at most one bump from each signal • Dauwels J., Vialatte F., Weber T., and Cichocki A. Analyzing Brain Signals by Combinatorial Optimization, Allerton 2008.
Generative model • Generate bump model (hidden) y_hidden • geometric prior for number n of bumps • bumps are uniformly distributed in rectangle • amplitude, width (in t and f) all i.i.d. • Generate M “noisy” observations (y1, …, y5 in the figure) • offset between hidden and observed bump • = Gaussian random vector with • mean (δt,m/2, δf,m/2) • covariance diag(st,m/2, sf,m/2) • amplitude, width (in t and f) all i.i.d. • “deletion” with probability pd • pc(i) = p(cluster size = i | y) (i = 1, 2, …, M) • Parameters: θ = (δt,m, δf,m, st,m, sf,m, pc) • Dauwels J., Vialatte F., Weber T., and Cichocki A. Analyzing Brain Signals by Combinatorial Optimization, Allerton 2008.
Probabilistic inference • PROBLEM: Given M bump models, compute θ = (δt,m, δf,m, st,m, sf,m, pc) • APPROACH: $(b^*, \theta^*) = \arg\max_{b, \theta} \log p(y, y', b, \theta)$ • SOLUTION: Coordinate descent • CLUSTERING (Integer Program): $b^{(i+1)} = \arg\max_{b} \log p(y, y', b, \theta^{(i)})$ • ESTIMATION OF PARAMETERS: $\theta^{(i+1)} = \arg\max_{\theta} \log p(y, y', b^{(i+1)}, \theta)$ • Integer programming methods (e.g., LP relaxation) • IP with 10,000 variables solved in about 1 s • CPLEX: commercial toolbox for solving IPs (combines several algorithms) • Dauwels J., Vialatte F., Weber T., and Cichocki A. Analyzing Brain Signals by Combinatorial Optimization, Allerton 2008.
EEG Data • EEG of 22 Mild Cognitive Impairment (MCI) patients and 38 age-matched control subjects (CTR), recorded at rest with eyes closed • → spontaneous EEG • All 22 MCI patients later developed Alzheimer’s disease (AD) • Electrodes located on 21 sites according to the international 10-20 system • Electrodes grouped into 5 zones (reduces number of pairs) • 1 bump model per zone • Band-pass filtered between 4 and 30 Hz • EEG data provided by Prof. T. Musha
Similarity measures • Correlation and coherence • Granger causality (linear system): DTF, ffDTF, dDTF, PDC, PC, ... • Phase Synchrony: compare instantaneous phases (wavelet/Hilbert transform) • State space based measures • sync likelihood, S-estimator, S-H-N-indices, ... • Information-theoretic measures • KL divergence, Jensen-Shannon divergence, ... • [Figure: time-frequency illustration of “No Phase Locking” vs. “Phase Locking”]
Sensitivity (average synchrony) • Families of measures compared: Corr/Coh, Granger, Info. Theor., State Space, Phase, SES • Significant differences for ffDTF and SES (more unmatched bumps, but same amount of jitter) • Mann-Whitney test: small p value suggests large difference in statistics of both groups
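For reference, the Mann-Whitney test can be sketched in a few lines. The version below uses the normal approximation for the two-sided p value, without tie or continuity corrections, which is an assumption of this sketch; a statistics package would normally be used instead. The function name is illustrative.

```python
import math


def mann_whitney_u(a, b):
    """Mann-Whitney U statistic and approximate two-sided p value.

    a and b are the values of one synchrony measure in two groups.
    """
    pooled = sorted((v, g) for g, grp in enumerate((a, b)) for v in grp)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_a += avg_rank * sum(1 for k in range(i, j)
                                     if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u1 = rank_sum_a - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    z = (u - n1 * n2 / 2.0) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p
```

Applied to one synchrony measure, `a` would hold the per-subject values of the MCI group and `b` those of the control group; a small `p` indicates that the two groups differ in that measure.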
Classification (bi-SES) • About 85% correctly classified • [Figure axis label: ffDTF] • Clear separation, but not yet useful as diagnostic tool • Additional indicators needed (fMRI, MEG, DTI, ...) • Can be used for screening population (inexpensive, simple, fast)
Correlations • Strong (anti-)correlations • “families” of sync measures
Conclusions • Measure for similarity of point processes • Key idea: matching of events • Applications • Spiking synchrony (surrogate data/Morris-Lecar neuron) • EEG synchrony of MCI patients • SES makes it possible to distinguish event reliability from timing precision • About 85-90% correctly classified MCI vs. healthy subjects → perhaps useful for screening a large population • Future work: • Combination with other modalities (MEG, fMRI, ...) • Integration of biophysical models • Alternative inference techniques (variations on max-product, Monte-Carlo)