
Finding Approximate Repeating Patterns from Sequence Data

Finding Approximate Repeating Patterns from Sequence Data. Jia-Lien Hsu, Arbee L.P. Chen, Hung-Chen Chen. Proceedings: ISMIR 2004. Speaker: Pei-Min Chou. Date: 2005/09/30. Introduction: discover universal properties; repetitions and trends; application in music retrieval.


Presentation Transcript


  1. Finding Approximate Repeating Patterns from Sequence Data • Jia-Lien Hsu, Arbee L.P. Chen, Hung-Chen Chen • Proceedings: ISMIR 2004 • Speaker: Pei-Min Chou • Date: 2005/09/30

  2. Introduction • Discover universal properties • Repetitions and trends • Application in music retrieval

  3. Approximate type • Longer_length • Shorter_length • Equal_length • A pattern may repeat with some variance • Example: pattern ABC matches ABKC (the longer_length type, which is the type studied in this paper)

  4. Def-- Longer_length_match(P,LL) • P=(p_1,p_2,…,p_m) • LL=(s_1,s_2,…,s_n) • Longer_length_match(P,LL) = 1 if p_i = s_{b_i} for i=1,2,…,m, for some positions b_1 < b_2 < … < b_m in LL; 0 otherwise • r=n-m: approximation degree • Example: P=(A,B,C,D), LL=(A,B,K,C,M,D) • Longer_length_match(P,LL)=1, r=2
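A minimal Python sketch of the matching test defined on this slide. It assumes that the positions b_1 < … < b_m are increasing and that the first and last symbols of LL are matched (b_1 = 1, b_m = n); this anchoring is consistent with the slide's examples but is not stated on the slide. The function name is taken from the slide; everything else is illustrative.

```python
def longer_length_match(p, ll):
    """Return 1 if pattern p occurs in ll as a subsequence that uses
    ll's first and last symbols (assumed anchoring), else 0."""
    m, n = len(p), len(ll)
    if m == 0 or n < m or p[0] != ll[0] or p[-1] != ll[-1]:
        return 0
    if m == 1:
        # A single symbol cannot be both first and last of a longer window.
        return 1 if n == 1 else 0
    # The interior of p must appear, in order, inside the interior of ll.
    interior = iter(ll[1:-1])
    return 1 if all(ch in interior for ch in p[1:-1]) else 0


# Slide example: P=(A,B,C,D), LL=(A,B,K,C,M,D), approximation degree r = 6 - 4 = 2
match = longer_length_match("ABCD", "ABKCMD")
degree = len("ABKCMD") - len("ABCD")
```

Here `match` is 1 and `degree` is 2, as on the slide.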

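  5. Def--Freq(P,S,r,longer_length) • Freq(P,S,r,longer_length) = Σ longer_length_match(P,LL_i) • LL_i: substring of S • |LL_i|=|P|+r • For any two counted substrings LL_i=S[a…b] and LL_j=S[c…d] with longer_length_match(P,LL_i)=1 and longer_length_match(P,LL_j)=1, either b<c or d<a, i.e. counted substrings do not overlap • Example (figure): P=ABC with S=AABCDEA (marked X) and S=ABCCABCD (marked V); the figure labels the spans a…b and c…d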

  6. Example1 • P=(ABC) • S=(AKBCDEABLCF), consider r=1 • |LL_i|=|P|+r=3+1=4 • LL_1=AKBC, LL_2=ABLC (S = AKBC DE ABLC F) • Freq(ABC,S,1,longer_length)=2
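The frequency count in Example1 can be reproduced with a short Python sketch. It assumes a left-to-right greedy scan over windows of length |P|+r, jumping past each matching window so that counted windows never overlap, and it assumes the match is anchored at the window's first and last symbols. The scanning strategy is an illustration, not necessarily how the authors count.

```python
def longer_length_match(p, ll):
    """1 if p is a subsequence of ll using ll's first and last symbols."""
    if len(ll) < len(p) or p[0] != ll[0] or p[-1] != ll[-1]:
        return 0
    if len(p) == 1:
        return 1 if len(ll) == 1 else 0
    interior = iter(ll[1:-1])
    return 1 if all(ch in interior for ch in p[1:-1]) else 0


def freq(p, s, r):
    """Count non-overlapping windows of length len(p)+r in s that match p."""
    w = len(p) + r
    count, i = 0, 0
    while i + w <= len(s):
        if longer_length_match(p, s[i:i + w]):
            count += 1
            i += w      # skip the whole window: the next one cannot overlap it
        else:
            i += 1
    return count


example1 = freq("ABC", "AKBCDEABLCF", 1)   # windows AKBC and ABLC
```

For the slide's input, `example1` is 2.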

  7. Definition • Pa_i: range of pattern length • Pa_r: range of approximation degree, Pa_r={0,1,…,max_pa_r} • Pa_f: minimal repeating frequency • AT: approximation type

  8. Example2 • S=ABFCDLBMABPFCFD • Consider • Pa_i={1,2,3,4} • Pa_r={0,1} • Pa_f=2 • AT=longer_length • P1={"A","B","C","D","F"} • P2={"AB","BF","CD","FC","FD"} • P3={"ABF","BFC","FCD"} • P4={"ABFC"}
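The pattern sets of Example2 can be checked with a brute-force Python sketch. This is not the level-wise approach of the talk; it simply enumerates every candidate. It rests on two assumptions that the slides do not state: a match is anchored at the window's first and last symbols, and a pattern's frequency counts non-overlapping occurrences over all approximation degrees in Pa_r together. All function names are hypothetical.

```python
from itertools import combinations


def anchored_patterns(window, m):
    """All length-m subsequences of window that keep its first and last symbol."""
    if m == 1:
        return {window} if len(window) == 1 else set()
    return {window[0] + "".join(mid) + window[-1]
            for mid in combinations(window[1:-1], m - 2)}


def frequent_patterns(s, lengths, r_values, min_freq):
    """Map each pattern length to the set of patterns with frequency >= min_freq."""
    result = {}
    for m in lengths:
        spans = {}                        # pattern -> list of (end, start) spans
        for r in r_values:
            w = m + r
            for a in range(len(s) - w + 1):
                for pat in anchored_patterns(s[a:a + w], m):
                    spans.setdefault(pat, []).append((a + w - 1, a))
        keep = set()
        for pat, found in spans.items():
            count, last_end = 0, -1
            for end, start in sorted(found):   # earliest-ending span first
                if start > last_end:           # count only non-overlapping spans
                    count += 1
                    last_end = end
            if count >= min_freq:
                keep.add(pat)
        result[m] = keep
    return result


example2 = frequent_patterns("ABFCDLBMABPFCFD", [1, 2, 3, 4], [0, 1], 2)
```

Under these assumptions, `example2[1]` through `example2[4]` equal the sets P1 to P4 listed on the slide.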

  9. Approach • Level-wise • Find approximate repeating patterns • Cut • Pattern_join

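  10. Cut • Reduce the substrings to examine by splitting S into overlapping cuts • Cut_i=S[a…b], i=1,2,3,… • cw=max_pa_i+max_pa_r (largest pattern length plus largest approximation degree) • a=1+(cw*(i-1)) • b=min((2*cw-1)+(cw*(i-1)),strlen(S)) • i: cut_id, strlen(S): length of S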

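  11. Example---cut • S=ABFCDLBMABPFCFD, strlen(S)=15 • Consider • max_pa_i=4, max_pa_r=1 • cw=4+1=5 • Cut1: a=1+(5*(1-1))=1, b=min((2*5-1)+(5*(1-1)),15)=9, so Cut1="ABFCDLBMA" • Cut2: a=1+(5*(2-1))=6, b=min(9+5,15)=14, so Cut2="LBMABPFCF" • Cut3: a=1+(5*(3-1))=11, b=min(9+10,15)=15, so Cut3="PFCFD"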
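The cut formulas can be sketched in Python as follows. The formulas for cw, a and b are taken from the slides (1-based, inclusive); the stopping rule (stop once a would pass the end of S) is an assumption, chosen because it yields the three cuts shown in the example.

```python
def cuts(s, max_pa_i, max_pa_r):
    """Split s into overlapping cuts of length at most 2*cw - 1, one every cw symbols."""
    cw = max_pa_i + max_pa_r
    result = []
    i = 1                                    # cut_id, starting at 1 as on the slide
    while 1 + cw * (i - 1) <= len(s):        # assumed stopping rule
        a = 1 + cw * (i - 1)
        b = min((2 * cw - 1) + cw * (i - 1), len(s))
        result.append(s[a - 1:b])            # convert 1-based inclusive to a slice
        i += 1
    return result


example_cuts = cuts("ABFCDLBMABPFCFD", max_pa_i=4, max_pa_r=1)
```

`example_cuts` is `["ABFCDLBMA", "LBMABPFCF", "PFCFD"]`. Consecutive cuts overlap by cw-1 symbols, so any window of length at most cw that starts in the first cw positions of a cut lies entirely inside that cut.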

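  12. Pattern_join • P_i={<pat_i(1),plist_i(1)>,…,<pat_i(j),plist_i(j)>} • P_i: set of patterns of length i • pat_i(j): j-th pattern in P_i • plist_i(j): list of occurrences, each written (cut_id:start,end) • Ex. S=ABFCDLBMABPFCFD, Cut1="ABFCDLBMA", Cut2="LBMABPFCF", Cut3="PFCFD" • P2 contains <"BF",(1:2,3),(2:5,7)>, freq=2 • Note: if start>cw=5, the plist entry is a dummy and is not counted • P2 contains <"FC",(1:3,4),(2:7,8),(3:2,3)>, freq=2; here (2:7,8) is a dummy because it is the same occurrence as (3:2,3)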
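A small Python sketch of the dummy rule on this slide: an entry whose start lies beyond cw is skipped when counting, because the same occurrence is also recorded in the next cut. Representing a plist as a list of `(cut_id, start, end)` tuples and the helper names are assumptions made for illustration.

```python
def to_global(entry, cw):
    """Convert a (cut_id, start, end) entry to 1-based positions in S."""
    cut_id, start, end = entry
    offset = cw * (cut_id - 1)          # cut i starts at position 1 + cw*(i-1)
    return (offset + start, offset + end)


def plist_freq(plist, cw):
    """Count plist entries, ignoring dummies (entries whose start > cw)."""
    return sum(1 for (_cut_id, start, _end) in plist if start <= cw)


cw = 5
bf = [(1, 2, 3), (2, 5, 7)]
fc = [(1, 3, 4), (2, 7, 8), (3, 2, 3)]
freq_bf = plist_freq(bf, cw)
freq_fc = plist_freq(fc, cw)
```

Both `freq_bf` and `freq_fc` are 2, matching the slide. `to_global((2, 7, 8), cw)` and `to_global((3, 2, 3), cw)` both give positions 12 to 13 of S, which is why the first of the two is treated as a dummy.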

  13. Definition--- pattern_join • PJ(<pat_i(a),plist_i(a)>,<pat_i(b),plist_i(b)>) = <pat_{i+1}(c),plist_{i+1}(c)> if pat_i(a)[2..i] = pat_i(b)[1..(i-1)]; Ø otherwise • Example: PJ(<"BF",(1:2,3),(2:5,7)>,<"FD",(1:3,5),(3:4,5)>) = <"BFD",(1:2,5)>
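The string part of the join condition can be sketched in Python: two length-i patterns join when the last i-1 symbols of the first equal the first i-1 symbols of the second. The resulting pattern (first pattern plus the last symbol of the second) is inferred from the slide's example. How the two plists are merged is not spelled out on the slide, so that part is left out; the function name is hypothetical.

```python
def join_patterns(pat_a, pat_b):
    """Return the length-(i+1) candidate joined from two length-i patterns,
    or None when pat_a[2..i] != pat_b[1..(i-1)] (1-based, as on the slide)."""
    if len(pat_a) != len(pat_b) or pat_a[1:] != pat_b[:-1]:
        return None
    return pat_a + pat_b[-1]


joined = join_patterns("BF", "FD")       # the slide's example
rejected = join_patterns("AB", "FC")     # "B" != "F", so no join
```

`joined` is `"BFD"` and `rejected` is `None`.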

  14. Example

  15. Conclusion • Completed • Longer_length approximation type • Level-wise approach • A preliminary performance study shows that the approach is efficient • Future work • Effectiveness on real data • Polyphonic music objects • Shorter_length and equal_length approximation types
