Query-by-Singing/Humming
J.-S. Roger Jang (張智星)
MIR Lab, CSIE Dept., National Taiwan Univ.
http://mirlab.org/jang
Outline
• Introduction to MIR & QBSH
• Components of QBSH
• Pitch Tracking
• Melody Comparison Methods
• Progressive Filtering
• Demos
• Conclusions
Introduction to MIR
• Two meanings for MIR
  • Music Information Retrieval (we'll stick to this one!)
  • Multimedia Information Retrieval
Two Types of MIR Systems
• Metadata-based
  • Song title, singer, lyrics, tags, lyricist, composer…
  • Query input: text or speech
• Content-based
  • Emotion, genre, melody, chord, note onsets…
  • Query input:
    • Symbolic: notes, chords, text…
    • Acoustic: singing/humming, whistling, tapping, original audio, beatboxing…
Types of Acoustic Inputs for MIR
• Singing/humming
  • Query by humming (usually “ta” or “da”)
  • Query by singing
• Whistling
  • Query by whistling
• Tapping
  • Query by tapping (at the onsets of notes)
• Speech
  • Query by the user’s speech input (for metadata)
• Original audio clips
  • Query by example (noisy version of original clips)
• Beatboxing
Types of Contents for Comparison
• Melody
  • Query by humming (usually “ta” or “da”)
  • Query by singing
  • Query by whistling
• Note onsets
  • Query by tapping (at the onsets of notes)
• Metadata
  • Query by speech (for metadata such as title, artist, lyrics)
• Audio contents
  • Query by example (noisy versions of original clips)
• Drums
  • Query by beatboxing
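For query by tapping, only the note onsets are compared. The sketch below is one possible representation, not something stated in the slides: it assumes onset times are reduced to inter-onset intervals and normalized by their total, so that the tapping tempo cancels out. The names `ioi_ratios` and `onset_distance` are hypothetical.

```python
def ioi_ratios(onsets):
    """Turn onset times (seconds) into tempo-independent inter-onset ratios."""
    iois = [b - a for a, b in zip(onsets, onsets[1:])]
    total = sum(iois)
    return [x / total for x in iois]


def onset_distance(query_onsets, song_onsets):
    """Sum of absolute differences between the two ratio sequences.

    Assumption: both sequences cover the same number of notes;
    zip() silently ignores any extra onsets in the longer one.
    """
    q = ioi_ratios(query_onsets)
    s = ioi_ratios(song_onsets)
    return sum(abs(a - b) for a, b in zip(q, s))


# The same rhythm tapped at twice the speed gives a distance of zero.
song = [0.0, 0.5, 1.0, 2.0]
fast_tapping = [0.0, 0.25, 0.5, 1.0]
print(onset_distance(fast_tapping, song))
```

Normalizing by the total duration is only one way to remove tempo; ratios of consecutive intervals would be another.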
Introduction to QBSH
• QBSH: Query by Singing/Humming
  • Input: singing or humming from a microphone
  • Output: a ranked list retrieved from the song database
• Progression
  • First paper: around 1994
  • Extensive studies since 2001
  • State of the art: QBSH tasks at ISMIR/MIREX since 2006
Challenges in QBSH Systems
• Reliable pitch tracking for acoustic input
  • Input from mobile devices or noisy karaoke bars
• Song database preparation
  • MIDIs, singing clips, or audio music
• Efficient/effective retrieval
  • Karaoke machine: ~10,000 songs
  • Internet music search engine: ~500,000,000 songs
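Two Types of Processing for QBSH
• Preprocessing:
  • Collect single-track ground truth (pitch of the main melody)
  • Mark the comparison points
  • Identify repeated segments
• Real-time processing:
  • Convert the user's audio input into a pitch vector
  • Convert the pitch vector into notes (optional)
  • Compare against the ground truth
  • List the ranking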
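The optional step of turning a pitch vector into notes can be sketched as follows. This is a minimal illustration under assumptions not given in the slides: pitch is expressed in semitones (MIDI note numbers, A4 = 440 Hz = 69), a value of 0 marks an unvoiced frame, and consecutive frames that round to the same semitone are merged into one note. The function names and the frame duration are hypothetical.

```python
import math


def hz_to_semitone(freq_hz):
    """Map a frequency in Hz to a fractional MIDI note number (A4 = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def pitch_vector_to_notes(pitch_vector, frame_sec):
    """Merge consecutive frames with the same rounded semitone.

    Returns a list of (note, duration_sec) pairs; note is None for a rest.
    Assumption: a pitch value of 0 means the frame is unvoiced.
    """
    notes = []
    for p in pitch_vector:
        label = round(p) if p > 0 else None
        if notes and notes[-1][0] == label:
            notes[-1][1] += frame_sec  # extend the current note
        else:
            notes.append([label, frame_sec])  # start a new note
    return [(label, dur) for label, dur in notes]


# Three slightly unstable frames around C4, a rest, then D4.
pitch = [60.1, 59.9, 60.0, 0, 0, 62.2, 62.0]
print(pitch_vector_to_notes(pitch, frame_sec=0.5))
```

Real singing drifts across semitone boundaries, so a practical note segmenter would need more than rounding; the point here is only the change of representation from frames to notes.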
Flowchart of QBSH
• On-line processing: Microphone input → Filtering → Pitch tracking → Pitch vector smoothing → Frame-based representation → Similarity comparison → Query results (ranked song list)
• Off-line processing: MIDI files → Melody track extraction
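The pitch tracking and pitch vector smoothing boxes can be illustrated with a small sketch. The slides do not say which algorithms are used here, so this assumes a plain autocorrelation peak picker for one frame and a median filter for smoothing; the function names, the search range (80 to 1000 Hz) and the filter width are all hypothetical choices.

```python
import math


def acf_pitch(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the pitch (Hz) of one frame from its autocorrelation peak.

    Returns 0.0 when no positive peak is found (e.g. a silent frame).
    """
    n = len(frame)
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), n - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0


def median_smooth(pitch_vector, width=3):
    """Median-filter a pitch vector to remove isolated outliers."""
    half = width // 2
    smoothed = []
    for i in range(len(pitch_vector)):
        window = sorted(pitch_vector[max(0, i - half): i + half + 1])
        smoothed.append(window[len(window) // 2])
    return smoothed


# A synthetic 200 Hz tone sampled at 8 kHz, one 512-sample frame.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * i / sample_rate) for i in range(512)]
print(acf_pitch(frame, sample_rate))

# A single octave jump in the middle of a steady note is smoothed away.
print(median_smooth([60, 60, 72, 60, 60]))
```

Microphone input from a phone or a karaoke bar is far noisier than a synthetic tone, which is why reliable pitch tracking is listed among the challenges.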
Short Latency and Strategies
• Goal: To retrieve songs effectively within a given response time, say 5 seconds or so
• Our strategies
  • Multi-stage progressive filtering
  • Indexing for different comparison methods
  • Repeating pattern identification
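The idea behind multi-stage progressive filtering can be sketched with two stages: a cheap comparison prunes the database, and a more expensive one re-ranks the survivors. The specific methods below are assumptions made for illustration, not the ones from the talk: stage 1 is an L1 distance after resampling to a fixed length, stage 2 is dynamic time warping (DTW), and both subtract the mean pitch to tolerate key transposition. All names and the `keep` parameter are hypothetical.

```python
def zero_mean(seq):
    """Subtract the mean pitch so that a transposed query still matches."""
    mean = sum(seq) / len(seq)
    return [x - mean for x in seq]


def cheap_distance(query, melody, points=8):
    """Stage 1: L1 distance after resampling both sequences to `points` samples."""
    def resample(seq):
        return [seq[int(i * len(seq) / points)] for i in range(points)]
    q, m = zero_mean(resample(query)), zero_mean(resample(melody))
    return sum(abs(a - b) for a, b in zip(q, m))


def dtw_distance(query, melody):
    """Stage 2: classic DTW with absolute pitch difference as local cost."""
    q, m = zero_mean(query), zero_mean(melody)
    inf = float("inf")
    cost = [[inf] * (len(m) + 1) for _ in range(len(q) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(q) + 1):
        for j in range(1, len(m) + 1):
            local = abs(q[i - 1] - m[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[-1][-1]


def progressive_search(query, database, keep=2):
    """Keep the `keep` best songs by the cheap distance, then re-rank them with DTW."""
    survivors = sorted(database, key=lambda name: cheap_distance(query, database[name]))[:keep]
    return sorted(survivors, key=lambda name: dtw_distance(query, database[name]))


# Toy database of pitch vectors in semitones (made-up data).
database = {
    "song_a": [60, 60, 62, 62, 64, 64, 62, 62],
    "song_b": [60, 64, 67, 72, 67, 64, 60, 60],
    "song_c": [70, 70, 70, 70, 70, 70, 70, 70],
}
query = [65, 65, 67, 67, 69, 69, 67, 67]  # song_a sung five semitones higher
print(progressive_search(query, database))
```

The expensive comparison runs on only `keep` candidates, which is what keeps the response time bounded as the database grows.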
Thank you for your attention. Questions and comments?