400 likes | 518 Views
Physics 411 Time Series Analysis Nov 7, 2013. Physics 411 Time Series Analysis Dr. Richard Dewey SEOS & Ocean Networks Canada rdewey@uvic.ca ftp:/canuck.seos.uvic.ca/Phys411/. Physics 411 Time Series Analysis Nov 7, 2013.
E N D
Physics 411 Time Series Analysis Nov 7, 2013 Physics 411 Time Series Analysis Dr. Richard Dewey SEOS & Ocean Networks Canada rdewey@uvic.ca ftp:/canuck.seos.uvic.ca/Phys411/
Physics 411 Time Series Analysis Nov 7, 2013 Rotary Spectra, Harmonic Analysis, and Intro. to Wavelets First, a quick review of Fourier and Power Spectrum techniques and issues. Parseval’s Theorem: The variance or average “power” of a signal can be estimated from the sum of the squared Fourier coefficients: where Xm is the complex amplitude at frequency fm=m/T=m∆f, ∆f=1/T, recalling For a discrete time series x(tn) at times tn, and
Physics 411 Time Series Analysis Nov 7, 2013 Then we can define a Fourier Power Spectrum as: And for the discrete time series, the discrete Power Spectrum, where fm= m∆f = m/T, for –N/2 ≤ m ≤ N/2-1. For some general time series routines: see ftp://canuck.seos.uvic.ca/matlab/rkd/ i.e.: ps.m, psplt.m, chisqp.m, cleanx.m, rotspec.m
Physics 411 Time Series Analysis Nov 7, 2013 Some key assumptions and issues in Fourier analysis: Assumes that for each frequency, the Fourier component amplitude is constant in time (at least over the segment) Assumes for each discrete frequency that the phase is fixed (These together relate to the assumption of stationarity, also that the “variance”, or generally “the characteristics” are constant in time) 3) The frequency limits are set by the sample rate and the duration of the time series, (or segments), they may have little significance 4) Leakage and stationarity issues are reduced/improved by windowing, segmenting, and averaging over-lapping segments 5) Can improve confidence by averaging in frequency domain 6) For FFT, data must be uniformly sampled (fixed sample rate), and N=2k (DFT can handle irregular, any-length data) So ∆t should be short, but limited by data acquisition, T (and/or segment length) should be long, but limited by stationarity/ temporal resolution of frequency content. Let’s look at an example time series…
Physics 411 Time Series Analysis Nov 7, 2013 function [pspec,f,coh,extra]=ps(x,dt,nfft,y) % [pspec,f,coh,extra]=ps(x,dt,nfft,y) calculates power (cross) spectral density % for the time series x (and optionally y), vectors only(!) % divided into 50% overlap segments, Hanning windowed. % dt passed for correct scaling and setting frequencies % in time units (seconds, hours, days: freq = Hz, cph, cpd) % Optional input parameters are: % nfft: length of fft segments (i.e. 512), % or pass nfft=1 if you want to routine to estimate nfft % y: (then output is cross spectra) % Output: pspec is the power spectrum vector % f is the frequency vector % Optional output parameters are: % coh = |Pxy|./sqrt(Pxx.*Pyy); % the corherencyfuntion if y is passed % structure: extra.psrms .M .W .nps .I1 .I2 .NENBW .ENBW % extra stuff % % Equal to MATLAB's psd*2*dt=psd/f_N [units^s/freq], now pwelch(x,hanning(M)); % Version 1.0 RKD 5/96, v2.0 2/97, v3.0 4/97, v3.1 4/12 % Good references are Bendat and Peirsol, and % the piece by Heinzel, Rudiger, and Schilling (GH_FFT.pdf) x=x(:)'; iy=0; if exist('y'), iy=1; y=y(:)'; else y=[]; end % only do cross stuff if y exists. if exist('nfft'), % check to see if user has set fft length M=nfft; % yes... else M=fix(length(x)/4.5); % no... then use 8 overlapping segments % this next block is old code, finding nearest 2^n length to optimize fft % for really long (1e8) time series, choose a nfft=2^n % n2=2.^(5:16); % maximum auto spectral length is 65536 % M=n2(max(find(n2<fix(2*T/3)))); % choose the longest M with at least 3 segments % if isempty(M), % if that didn't work, choose less segments % M=n2(max(find(n2<fix(T)))); % choose the longest M with 1 segments % end end if iy==1 & (length(y)/1.9) < M, disp(' You must segment dataset in order to get none-one/meaningful coherencies.'); end % chunk and interpolate NaN values in both real and imaginary ts first. if sum(isnan(x)) > 0, x=cleanx(x); end % cleanx is an RKD routine if iy==1, if sum(isnan(y)) > 0, y=cleanx(y); end; % uses cleanx.m routine end % now find new time series length T=length(x); if iy == 1, Ty=length(y); if Ty ~= T, disp(['Cleaned X and Y vectors not same length.']); return end end % calculate various spectral parameters nps=fix(2*T/M) - 1; % number of 50% overlapping sections/spectra if nps < 1, nps = 1; end if M > length(x) | nps == 1, M = length(x); end if rem(M,2), n2=(M+1)/2; % M is odd else n2=M/2; % M is even end window=hanning(M)'; % or pick some other window shape I1=sum(window); % the sum of the window weights (also S1) I2=(sum(window.^2)); % determine the power of this window (also S2) W=M/I2; % then this is the variance lost by windowing, aka coherent power gain NENBW=M*I2/(I1^2); % the normalized equivalent noise bandwidth ENBW=I2/(dt*I1^2); % the effective noise bandwidth pspec=(0+sqrt(-1)*0)*ones(1,n2); % initialize vector for +f only coh=pspec; PXX=pspec; PYY=pspec; % initialize coherency vectors % for jj=1:nps % loop for segment, fft, sum and average nst=fix(1+(jj-1)*M/2); nen=nst+M-1; indx=[nst:nen]; X=fft((window.*(detrend(x(indx)))),M); % this is complex FFT/DFT if iy==1, % this loop only if cross spectra/coh Y=fft((window.*(detrend(y(indx)))),M); % this is complex PS=Y.*conj(X); % calculate the cross spectra [see B&P (9.3)] else PS=X.*conj(X); % or just the auto-spectra, this is not complex end if iy == 1, % in addition, calculate Pxx and Pyy for coherency Pxx=X.*conj(X); Pyy=Y.*conj(Y); end pspec=pspec + PS(2:n2+1); % sum spectra not including f=0 if iy==1, PXX=PXX + Pxx(2:n2+1); % Must sum the spectra before calculating PYY=PYY + Pyy(2:n2+1); % the coherency function, otherwise coh=1.0 end end % if iy==1, coh = abs(pspec)./sqrt(PXX.*PYY); % calculate coherency function else coh = []; end % psrms=pspec*(2/(nps*I1^2)); % in case this is of interest pspec=pspec*(2*dt*W/(M*nps)); % scaled power spectra, also psrms/ENBW % if you want c.i.pspec*[plow,phi]=chisqp(95,nps) => a Dewey routine f=[(1:length(pspec))*(1/(M*dt))]'; % set frequency vector pspec=pspec(:);f=f(:); if nargout > 3, extra=struct('psrms',psrms(:),'M',M,'W',W,'nps',nps,'I1',I1,'I2',I2,'NENBW',NENBW,'ENBW',ENBW); end % fini
Physics 411 Time Series Analysis Nov 7, 2013 Rotary (vector) and Complex Spectra Recall for the discrete time series, the discrete Power Spectrum, where fm= m∆f = m/T, for –N/2 ≤ m ≤ N/2-1. If xn are real, then the spectrum is even about fo and X-m = Xm, and half the energy (or power) is in the negative frequency part of the spectrum. If we have N pieces of information going into the spectrum, we should expect N pieces of information coming out, but it looks like N/2. Once we take the absolute value to get the power spectrum, we loose the “phase” information of the Fourier Transform, and given only the power spectrum, we cannot get it back. (Therefore we cannot take the inverse Fourier transform of a PS to get the original time series back. If we keep the whole spectrum, +/-f, then we can.) If xn is “complex” (or is a vector), then there is both amplitude and phase information in the negative frequency part of the spectrum (it’s not symmetric).
Physics 411 Time Series Analysis Nov 7, 2013 Example: Lets write a vector in complex notation, z(tn) = un + ivn = r cosθ + i r sinθ = re-iθ Then Which is also complex But now the positive and negative frequencies refer to “vectors” rotating in either the anticlockwise (ACW) or clockwise (CW) directions respectively. If the amplitudes are different, then the sum of the counter rotating vectors traces an ellipse.
Physics 411 Time Series Analysis Nov 7, 2013 Harmonic Analysis: Tides as the example
Physics 411 Time Series Analysis Nov 7, 2013 What if we know something about the underlying processes driving/forcing the signal? We might know the frequencies, the phasing, and ratio of amplitudes, etc. Can we put some of the known characteristics into the Fourier analysis and ask it to solve for only the un-known parts? Let’s then assume that the problem is “over-determined”, such that we have more data values (information) than we are searching for. Then the solution is not unique, but we can find an optimal solution using least squares: optimal solution plus a residual. For tides, the “frequencies” are associated with known astronomical features. There are literally hundreds of primary, secondary, compound, etc. interactions that give rise to tidal forcing functions, each at a unique frequency. If we have thousands of measurements, the problem is still solvable in an optimal (least squares) sense.
Physics 411 Time Series Analysis Nov 7, 2013 For M possible harmonic components, the time series x(tn) can be expanded as where is the mean, is the residual portion unexplained by the harmonic analysis, and and are the harmonic constituent amplitudes, frequencies, and phases, where for tides, the frequencies are known. This can be re-written as where the unknown coefficients and relate to the amplitudes and phases by for q=1…M. Like Fourier analysis, the mean and linear trend, say , should be removed before the least squares analysis to reduce round-off errors. It can be added back later.
Physics 411 Time Series Analysis Nov 7, 2013 Wavelet Analysis Can there be further improvements to the classic Fourier Analysis? Most interesting time series violate some of the core assumptions of FA: they have changing frequency, phase, and/or variance with time. Can we “localize” in time the frequency information? Well, there is no free lunch, but there is no reason we can’t choose alternate base (but orthogonal) functions wavelets. Problem is, everyone has their favourite wavelet set.