In the name of God

An Introduction to Multi-way Analysis. Mohsen Kompany-Zareh, IASBS, Nov 1-3, 2010. Session one.

  1. In the name of God

  2. An Introduction to Multi-way Analysis. Mohsen Kompany-Zareh, IASBS, Nov 1-3, 2010. Session one.

  3. The main source:

  4. Kronecker product • Khatri-Rao product • Multi-way data • Matricizing the data • Interaction triad • G • PARAFAC • Panel performance • Matricizing and subarray • Rank • Dimensionality vector • Rank-deficiency in three-way arrays • Tucker3 rotational freedom • Unique solution • Tucker2 model • Tucker1 model

  5. Kronecker product (A ⊗ B)
     >> A=[2 3 4; 2 3 4]
     >> B=[3 4; 3 5]
     >> krnAB=[A(1,1)*B A(1,2)*B A(1,3)*B ; A(2,1)*B A(2,2)*B A(2,3)*B]
     krnAB =
          6    8    9   12   12   16
          6   10    9   15   12   20
          6    8    9   12   12   16
          6   10    9   15   12   20

  6. Kronecker product
     >> A=[2 3 4; 2 3 4]
     >> B=[3 4; 3 5]
     >> p=kron(A,B)
     p =
          6    8    9   12   12   16
          6   10    9   15   12   20
          6    8    9   12   12   16
          6   10    9   15   12   20
     All columns in A see all columns in B.

  7. Kronecker product, column by column
     >> A=[2 3 4; 2 3 4]
     >> C=[3 4 5; 3 5 2]
     >> krnAC=[kron(A(:,1),C(:,1))... column 1 (pair 1-1)
               kron(A(:,1),C(:,2))... column 2
               kron(A(:,1),C(:,3))...
               kron(A(:,2),C(:,1))...
               kron(A(:,2),C(:,2))... column 5 (pair 2-2)
               kron(A(:,2),C(:,3))...
               kron(A(:,3),C(:,1))...
               kron(A(:,3),C(:,2))...
               kron(A(:,3),C(:,3))]   % column 9 (pair 3-3)
     krnAC =
          6    8   10    9   12   15   12   16   20
          6   10    4    9   15    6   12   20    8
          6    8   10    9   12   15   12   16   20
          6   10    4    9   15    6   12   20    8
     Columns 1, 5 and 9 (pairs 1-1, 2-2 and 3-3) form the Khatri-Rao product.

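  8. Kronecker product: interaction terms
     >> A=[2 3 4; 2 3 4]
     >> C=[3 4 5; 3 5 2]
     krnAC =
          6    8   10    9   12   15   12   16   20
          6   10    4    9   15    6   12   20    8
          6    8   10    9   12   15   12   16   20
          6   10    4    9   15    6   12   20    8
     Column labels, left to right (b denotes a column of the second matrix, C here):
     vec(a1 b1) vec(a1 b2) vec(a1 b3) vec(a2 b1) vec(a2 b2) vec(a2 b3) vec(a3 b1) vec(a3 b2) vec(a3 b3)
     Interaction terms: the columns with unequal indices (a1 b2, a1 b3, a2 b1, a2 b3, a3 b1, a3 b2).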

  9. Khatri-Rao product
     >> A=[2 3 4; 2 3 4]
     >> B=[3 4 5; 3 5 2]
     khtrAB =
          6   12   20
          6   15    8
          6   12   20
          6   15    8
     The number of columns in A must equal the number of columns in B.
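The relation between the two products can be checked numerically. The sketch below is only an illustration: it builds the Khatri-Rao product as a column-wise Kronecker product and compares it with columns 1, 5 and 9 of the full Kronecker product. The variable names `khtrAB`, `K` and `idx` are chosen here; base MATLAB has no built-in Khatri-Rao function, so a loop over `kron` is used.

```matlab
A = [2 3 4; 2 3 4];
B = [3 4 5; 3 5 2];
R = size(A, 2);                       % number of columns, must match size(B,2)

% Khatri-Rao product: Kronecker product of matching columns only
khtrAB = zeros(size(A,1) * size(B,1), R);
for r = 1:R
    khtrAB(:, r) = kron(A(:, r), B(:, r));
end

% The same columns sit inside the full Kronecker product:
% column (i-1)*R + j of kron(A,B) equals kron(A(:,i), B(:,j))
K   = kron(A, B);
idx = (0:R-1) * R + (1:R);            % columns 1, 5, 9 for R = 3
same = isequal(khtrAB, K(:, idx));
disp(khtrAB)
disp(same)
```

The remaining six columns of `K` are the interaction terms that the Khatri-Rao product leaves out.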

  10. Kronecker product • Khatri-Rao product • Multi-way data • Matricizing the data • Interaction triad • G • PARAFAC • Panel performance • Matricizing and subarray • Rank • Dimensionality vector • Rank-deficiency in three-way arrays • Tucker3 rotational freedom • Unique solution • Tucker2 model • Tucker1 model

  11. Multi-way data (generalization of matrix algebra)
      • A zero-order tensor: a scalar.
      • A first-order tensor: a vector.
      • A second-order tensor: a matrix. One matrix per sample gives three-way data for analysis.
      • A third-order tensor: a three-way array. One three-way array per sample gives four-way data for analysis.
      • A fourth-order tensor: a four-way array, and so on.

  12. One component, HPLC-DAD (figure labels: a1, b1)

  13. One component, HPLC-DAD, different concentrations (elution profile). Only the intensities are changed. These 9 matrices form a TRIAD, the simplest trilinear data.

  14. A triad: X
      A cube of data, 12x7x7: third-order data for one sample.
      >> a1'
          0.0033 0.0971 0.8131 1.9506 1.3406 0.2640 0.0149
      >> b1'
          0.0222 1.7650 0.4060 0.8826 0.0111 0.0000 0.0000
      >> c1'
          1 2 3 4 5 6 7 8 9 10 11 12
      Obtained from the tensor product of three vectors: a1, b1 and c1.

  15. % A triad by outer product
      % X111 = a1 ⊗ b1 ⊗ c1
      ...
      for l=1:length(a1)
        for m=1:length(b1)
          for n=1:length(c1)
            disp([l m n])
            Xtriad(l,m,n)=a1(l)*b1(m)*c1(n);
          end
        end
      end
      X=Xtriad;
      ....

  16. Matricizing the data: X111 = Unfold3D(X111, 1) (in three directions). The first chemical component.
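The slide's `Unfold3D` function is not shown, so the following is only a sketch of what matricizing in three directions can look like. The anonymous function `unfold` is a hypothetical stand-in built from `permute` and `reshape`; its column ordering is an assumption and may differ from the author's routine. The vectors are the a1, b1 and c1 listed on slide 14.

```matlab
a1 = [0.0033 0.0971 0.8131 1.9506 1.3406 0.2640 0.0149]';
b1 = [0.0222 1.7650 0.4060 0.8826 0.0111 0.0000 0.0000]';
c1 = (1:12)';
I = numel(a1); J = numel(b1); K = numel(c1);

% Triad X(i,j,k) = a1(i)*b1(j)*c1(k), built without loops
X = reshape(kron(c1, kron(b1, a1)), I, J, K);

% Hypothetical unfolding helper: mode n becomes the row mode
orders = {[1 2 3], [2 1 3], [3 1 2]};
unfold = @(T, n) reshape(permute(T, orders{n}), size(T, n), []);

X1 = unfold(X, 1);    % I x JK
X2 = unfold(X, 2);    % J x IK
X3 = unfold(X, 3);    % K x IJ

% Each unfolding of a single triad is a rank-one matrix
ranks = [rank(X1) rank(X2) rank(X3)];
err1 = norm(X1 - a1 * kron(c1, b1)', 'fro');
err2 = norm(X2 - b1 * kron(c1, a1)', 'fro');
err3 = norm(X3 - c1 * kron(b1, a1)', 'fro');
disp(ranks)
```

With this ordering, every unfolding of a triad is one loading vector times the transposed Kronecker product of the other two.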

  17. ...and for the second and further chemical components:
      X111 = a1 ⊗ b1 ⊗ c1
      X222 = a2 ⊗ b2 ⊗ c2
      X333 = a3 ⊗ b3 ⊗ c3
      Each component is in a separate triad (no interaction).
      X = X111 + X222 + X333
      Trilinear: PARAFAC

  18. Interaction triad
      In the presence of interaction:
      X111 = a1 ⊗ b1 ⊗ c1
      X222 = a2 ⊗ b2 ⊗ c2
      X121 = a1 ⊗ b2 ⊗ c1
      X = X111 + X222 + X121
      Non-trilinear: Tucker

  19. G: how many interaction triads?
      For two components in three modes:
      X111 = a1 ⊗ b1 ⊗ c1    G(111) =  2
      X112 = a1 ⊗ b1 ⊗ c2    G(112) =  0
      X121 = a1 ⊗ b2 ⊗ c1    G(121) =  1
      X122 = a1 ⊗ b2 ⊗ c2    G(122) =  0
      X211 = a2 ⊗ b1 ⊗ c1    G(211) =  0
      X212 = a2 ⊗ b1 ⊗ c2    G(212) =  0
      X221 = a2 ⊗ b2 ⊗ c1    G(221) =  0
      X222 = a2 ⊗ b2 ⊗ c2    G(222) = -3
      6 possible interaction triads; 1 interaction triad with a nonzero G element (X121).

  20. Loading matrices A(11×2), B(100×2), C(3×2) and core array G(2×2×2), with G(111) = 2, G(121) = 1, G(222) = -3.
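As a minimal sketch of how G weights the triads, the code below builds the matricized array in one step and compares it with the explicit sum of the three weighted triads. The loadings are random placeholders with the sizes given on the slide, and the matricized formula X(I×JK) = A G(P×QR) (C ⊗ B)' assumes the column ordering of MATLAB's `reshape`; the helper `triad` is defined here for illustration.

```matlab
A = rand(11, 2);  B = rand(100, 2);  C = rand(3, 2);   % placeholder loadings

G = zeros(2, 2, 2);
G(1,1,1) = 2;      % weight of triad a1, b1, c1
G(1,2,1) = 1;      % weight of the interaction triad a1, b2, c1
G(2,2,2) = -3;     % weight of triad a2, b2, c2

% Matricized Tucker3 model, X is 11 x 300
X = A * reshape(G, 2, 4) * kron(C, B)';

% The same array as an explicit sum of weighted, matricized triads
triad = @(a, b, c) a * kron(c, b)';
Xsum = 2 * triad(A(:,1), B(:,1), C(:,1)) ...
     + 1 * triad(A(:,1), B(:,2), C(:,1)) ...
     - 3 * triad(A(:,2), B(:,2), C(:,2));

err = max(abs(X(:) - Xsum(:)));
nTriads = nnz(G);              % number of triads actually used
disp([err nTriads])
```

Setting G(1,2,1) to zero removes the interaction and leaves a trilinear model with two weighted components.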

  21. For three components in three modes: (3 × 3 × 3) - 3 = 24 possible interactions.

  22. G(?×?×?), B(100×3), C(20×2), A(15×4). How many G elements?

  23. % Tucker3 outer product
      % A (I x 4), B (J x 3) and C (K x 2) are the loading matrices
      G=rand(4,3,2);
      X=zeros(size(A,1),size(B,1),size(C,1));
      for p=1:size(G,1)
        for q=1:size(G,2)
          for r=1:size(G,3)
            % one triad, weighted by G(p,q,r)
            for i=1:size(A,1)
              for j=1:size(B,1)
                for k=1:size(C,1)
                  disp([p q r i j k])
                  Xtriad(i,j,k)=A(i,p)*B(j,q)*C(k,r)*G(p,q,r);
                end
              end
            end
            X=X+Xtriad;
          end
        end
      end

  24. What about Tucker4?

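  25. % PARAFAC outer product
      % A (I x 3), B (J x 3) and C (K x 3) are the loading matrices
      G=zeros(3,3,3);
      G(1,1,1)=1;G(2,2,2)=1;G(3,3,3)=1;
      X=zeros(size(A,1),size(B,1),size(C,1));
      for p=1:size(G,1)
        for q=1:size(G,2)
          for r=1:size(G,3)
            % one triad, weighted by G(p,q,r); only p=q=r contributes
            for i=1:size(A,1)
              for j=1:size(B,1)
                for k=1:size(C,1)
                  disp([p q r i j k])
                  Xtriad(i,j,k)=A(i,p)*B(j,q)*C(k,r)*G(p,q,r);
                end
              end
            end
            X=X+Xtriad;
          end
        end
      end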

  26. B(100×3), C(20×3), A(15×3): PARAFAC. Simple interpretation.
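One way to see why PARAFAC is the simpler model is to write it in matricized form, where the superdiagonal G reduces the Kronecker product to a Khatri-Rao product. The sketch below uses random placeholder loadings with the sizes from the slide; the formula X(I×JK) = A (C ⊙ B)' and the names `KR`, `Xpar` and `Xtuck` are assumptions of this illustration, not code from the presentation.

```matlab
A = rand(15, 3);  B = rand(100, 3);  C = rand(20, 3);   % placeholder loadings
R = size(A, 2);

% Khatri-Rao product of C and B (column-wise Kronecker product)
KR = zeros(size(C,1) * size(B,1), R);
for r = 1:R
    KR(:, r) = kron(C(:, r), B(:, r));
end
Xpar = A * KR';                          % PARAFAC, 15 x 2000

% Tucker3 with a superdiagonal core gives the same array
G = zeros(R, R, R);
for r = 1:R
    G(r, r, r) = 1;
end
Xtuck = A * reshape(G, R, R*R) * kron(C, B)';

err = max(abs(Xpar(:) - Xtuck(:)));
disp(err)
```

Only R of the R^3 possible triads are used, so each component can be read from one column of A, B and C.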

  27. Monitoring panel performance within and between experiments by multi-way models. Rosaria Romano and Mohsen Kompany-Zareh, Copenhagen Univ, 2007

  28. Organic milk of high quality. Sensory studies, 2007, University of Copenhagen.
      Two different experiments were conducted in 2007:
      • Spring experiment (May, weeks 21 and 22)
      • Autumn experiment (September, weeks 36 and 37)
      The objective is to establish knowledge about the production of high-quality organic milk with a composition and flavour different from conventionally produced milk.

  29. Spring experiment data
      Data description:
      • 7 varieties of milk with respect to:
        - 2 cow breeds: Holstein-Friesian (HF), Jersey (JE);
        - 7 farms: WB, EMC, UGJ, JP, HM, OA, KI.
      • Panel: 9 assessors, 2 sessions (focus on the second!), 3 replicates for each session.
      • 12 descriptors: odor (green), appearance (yellow), flavor (creamy, boiled-milk, sweet, bitter, metallic, sourness, stald-feed), aftertaste (astringent0, fatness, astringent20).
      • Measurement scale: continuous scale anchored at 0 and 15.

  30. PARAFAC on the spring experiment (1)
      Model: PARAFAC with two components (27.9% ExpVar), on data averaged across the samples mode. Groups in the plot: JE, HF.
      • High reproducibility of the replicates in both groups.
      • Big variation in the JE group:
        - WB is the least yellow JE milk;
        - UGJ seems to have something in common with the HF group.

  31. Parafac on the spring experiment(2) Model: Parafac with two components (27.9% ExpVar), on data averaged across the samples mode Best Reliability on Multi-way Assessment (Bro and Romano, 2008)

  32. Kronecker product • Khatri-Rao product • Multi-way data • Matricizing the data • Interaction triad • G • PARAFAC • Panel performance • Matricizing and subarray • Rank • Dimensionality vector • Rank-deficiency in three-way arrays • Tucker3 rotational freedom • Unique solution • Tucker2 model • Tucker1 model

  33. Rank
      • A (I × J) has full rank if and only if r(A) = min(I, J).
      • If r(A) = R, then [Schott 1997] A = t1p1' + ··· + tRpR', a sum of R rank-one matrices (the trpr' are the components).
      • Bases are not unique: rotational freedom, intensity (or scale) indeterminacy, sign indeterminacy.

  34. • If X (I × J) is generated with I × J random numbers, the probability that X has less than full rank is 0.
      • Measured data sets in chemistry therefore always have full (mathematical) rank, because of measurement noise.
      • Example: UV spectra (100 wavelengths) of ten different samples, each containing the same absorbing species at a different concentration, give X (10 × 100).
      • If the Lambert-Beer law holds, X has rank one; with measurement errors, the mathematical rank is ten.

  35. X = cs' + E = Xhat + E (model of X)
      • c: vector of concentrations
      • s: pure UV spectrum of the absorbing species
      • E: noise part
      • Two parts: 1. systematic variation (Xhat); 2. noise (undesirable)
      • pseudo-rank = mathematical rank of Xhat = one < mathematical rank of X
      • 'chemical rank': the number of chemical sources of variation in the data
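The difference between pseudo-rank and mathematical rank can be sketched with simulated data. The concentrations, the Gaussian-shaped spectrum and the noise level below are all made up for illustration; only the sizes (10 samples, 100 wavelengths) follow the example.

```matlab
c  = (1:10)';                               % assumed concentrations, 10 samples
wl = (1:100)';                              % wavelength index
s  = exp(-((wl - 50).^2) / (2 * 10^2));     % assumed pure spectrum (one band)

Xhat = c * s';                              % noise-free part, Lambert-Beer
E    = 1e-3 * randn(10, 100);               % assumed measurement noise
X    = Xhat + E;

pseudoRank = rank(Xhat);                    % rank of the systematic part
mathRank   = rank(X);                       % rank of the measured data
sv = svd(X);                                % one large value, nine noise-level values
disp([pseudoRank mathRank])
disp(sv(1:3)')
```

The singular values show the structure: one value carries the systematic variation and the remaining nine sit at the noise level, which is why the pseudo-rank is usually judged from the size of the singular values rather than from `rank(X)`.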

  36. Rank deficiency: pseudo-rank < chemical rank (linear relations in, or restrictions on, the data).
      Example: X = c1s1' + c2s2' + c3s3' + E, with s1 = s2 (linear relation)
      => X = (c1 + c2)s1' + c3s3' + E
      Chemical rank (X) = 3, pseudo-rank (X) = 2: rank deficient.
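A small numerical sketch of this example, with random placeholder profiles: three species are simulated, two of which share the same spectrum, and the rank of the noise-free part is computed. All names and sizes are assumptions for illustration.

```matlab
Cm = rand(10, 3);                 % assumed concentrations of three species
s1 = rand(100, 1);                % assumed spectrum of species 1 (and 2)
s3 = rand(100, 1);                % assumed spectrum of species 3
S  = [s1 s1 s3];                  % s2 = s1: the linear relation

Xhat = Cm * S';                   % noise-free data, 10 x 100

chemicalRank = size(Cm, 2);       % three chemical sources of variation
pseudoRank   = rank(Xhat);        % only two independent spectra

% Equivalent two-term model: (c1 + c2) s1' + c3 s3'
Xhat2 = (Cm(:,1) + Cm(:,2)) * s1' + Cm(:,3) * s3';
err = norm(Xhat - Xhat2, 'fro');
disp([chemicalRank pseudoRank])
```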

  37. A randomly generated 2 × 2 × 2 array has a positive probability of having a rank lower than three [Kruskal 1989]:
      • probability 0.79 of obtaining a rank-two array;
      • probability 0.21 of obtaining a rank-three array;
      • probability zero of obtaining rank one or lower.
      This has been generalized to 2 × n × n arrays [Ten Berge 1991].
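These probabilities can be approximated by simulation. The sketch below rests on two assumptions not stated on the slide: the entries are drawn from a standard normal distribution, and the rank is decided with the known criterion that a generic 2 × 2 × 2 array with frontal slices X1 and X2 has rank two when the eigenvalues of inv(X1)*X2 are real and rank three when they are complex. The sign of the discriminant of the characteristic polynomial is used instead of calling `eig`.

```matlab
N = 20000;                         % number of random arrays (assumed)
nRank2 = 0;
for t = 1:N
    X = randn(2, 2, 2);            % assumed: standard normal entries
    M = X(:,:,1) \ X(:,:,2);       % inv(X1) * X2
    % eigenvalues of a 2 x 2 matrix are real when trace^2 - 4*det > 0
    disc = trace(M)^2 - 4 * det(M);
    if disc > 0
        nRank2 = nRank2 + 1;
    end
end
p2 = nRank2 / N;                   % estimated probability of rank two
p3 = 1 - p2;                       % estimated probability of rank three
disp([p2 p3])
```

Under these assumptions the estimates should land close to the 0.79 and 0.21 quoted on the slide; other distributions for the entries can give different proportions.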

  38. 2 × 2 × 2 array:
      • maximum rank: three;
      • typical rank: {2, 3} (almost all arrays);
      • individual rank: very hard to establish.
      Three-way rank is important in second-order calibration and curve resolution, and for degrees of freedom (??) in significance testing.

  39. Matricizing and sub-arrays: X(4 × 3 × 2). Boldface entries are in the foremost frontal slice.

  40. sub-arrays

  41. Dimensionality vector: row-rank, column-rank, tube-rank
      • For a two-way X: rank(X) = rank(X'), so column rank = row rank.
      • This does not hold for three-way arrays.
      • A three-way array X(I × J × K) can be matricized in three different ways:
        (i) row-wise, giving X(J×IK), a two-way array;
        (ii) column-wise, giving X(I×JK);
        (iii) tube-wise, giving X(K×IJ).
      • Three more matricizations exist with the same ranks; they are not listed here.
      • The ranks of the arrays X(J×IK), X(I×JK) and X(K×IJ) are (P, Q, R): the dimensionality vector of X.

  42. P, Q and R are not necessarily equal, in contrast with the two-way case, where P = Q = r(X).
      The dimensionality vector (P, Q, R) of a three-way array X with rank S obeys certain inequalities [Kruskal 1989]:
      (i) P ≤ QR; Q ≤ PR; R ≤ PQ
      (ii) max(P, Q, R) ≤ S ≤ min(PQ, QR, PR)

  43. Three matricized forms: these arrays have rank 4, 3 and 2. The dimensionality vector is [4 3 2]. P, Q and R can be unequal.
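The dimensionality vector of a 4 × 3 × 2 array can be computed as a quick check. The sketch below uses a random array rather than the one shown on the slide, and it reports the ranks in the order of the modes I, J, K, which is an assumption about how the slide orders its three matricized forms.

```matlab
X = rand(4, 3, 2);                            % random stand-in array
[I, J, K] = size(X);

X1 = reshape(X, I, J*K);                      % X(I x JK), 4 x 6
X2 = reshape(permute(X, [2 1 3]), J, I*K);    % X(J x IK), 3 x 8
X3 = reshape(permute(X, [3 1 2]), K, I*J);    % X(K x IJ), 2 x 12

dimvec = [rank(X1) rank(X2) rank(X3)];
disp(dimvec)

% The three ranks satisfy: each one is at most the product of the other two
d = dimvec;
okI = (d(1) <= d(2)*d(3)) && (d(2) <= d(1)*d(3)) && (d(3) <= d(1)*d(2));
disp(okI)
```

For a random array each unfolding has full row rank with probability one, so the result is [4 3 2], the same vector as on the slide.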

  44. Pseudo-rank, rank deficiency and chemical sources of variation pseudo-rank of three-way arrays: straight generalization of the two-way definit. X = Xhat + E E : array of residuals. pseudo-rank of X= minimum # PARAFAC components necessary to exactly fit Xhat.

  45. Rank-deficiency in three-way arrays
      Spectrophotometric acid-base titration of mixtures of three weak mono-protic acids (or flow injection analysis with a pH gradient):
      HA2 ⇌ H+ + A2-
      HA3 ⇌ H+ + A3-
      HA4 ⇌ H+ + A4-
      Six components. Models of the separate titrations of the three analytes (HA2, HA3, HA4):
      X_HA2 = c_{a,2} s_{a,2}' + c_{b,2} s_{b,2}' + E_HA2
      X_HA3 = c_{a,3} s_{a,3}' + c_{b,3} s_{b,3}' + E_HA3
      X_HA4 = c_{a,4} s_{a,4}' + c_{b,4} s_{b,4}' + E_HA4
      10 samples, 15 titration points and 20 wavelengths => X(10×15×20).

  46. X = Xhat + E
      • c_{a,2} + c_{b,2} = α(c_{a,3} + c_{b,3}) = β(c_{a,4} + c_{b,4})
      • Only four independently varying concentration profiles.
      • Pseudo-rank (X(IJK)) = four.
      • Pseudo-rank (X(3 × JK)) = three.
      • Six different ultraviolet spectra: pseudo-rank (X(6 × KI)) = six.
      • ==>> A Tucker3 (6,4,3) model is needed to fit X.

  47. 3 × 6 × 4 = 72 nonzero elements!!
      Inequality laws:
      (i) P ≤ QR; Q ≤ PR; R ≤ PQ
      (ii) max(3, 6, 4) ≤ S ≤ min(PQ, QR, PR)
      6 ≤ S ≤ 12
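The arithmetic behind these bounds, written out as a short check with P = 3, Q = 6 and R = 4 taken from the slide (the variable names are chosen here):

```matlab
P = 3;  Q = 6;  R = 4;

coreElements = P * Q * R;                           % 72 elements in G

% (i) each rank is at most the product of the other two
condI = (P <= Q*R) && (Q <= P*R) && (R <= P*Q);     % 3<=24, 6<=12, 4<=18

% (ii) bounds on the three-way rank S
lowerS = max([P Q R]);                              % 6
upperS = min([P*Q, Q*R, P*R]);                      % min(18, 24, 12) = 12

disp([coreElements lowerS upperS])
disp(condI)
```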

  48. The three-way rank of X is ≥ 6 (six PARAFAC components fit the data). The pseudo-rank (S = 6) is not less than the chemical rank (6) => no three-way rank deficiency. Rank deficiencies in one loading matrix of a three-way array are not the same as a three-way rank deficiency.
