Optical Music Recognition Ichiro Fujinaga McGill University 2007
Content • Optical Music Recognition • Levy Project • Levy Sheet Music Collection • Digital Workflow Management • Gamera
Optical Music Recognition (OMR) • Trainable open-source OMR system in development since 1984 • Staff recognition and removal • Run-length coding • Projections • Lyric removal / classifier • Stems and notehead removal • Music symbol classifier • Score reconstruction Demo
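The slide does not say how run-length coding is applied to staff recognition. One common approach, sketched here under that assumption, is to take the most frequent black vertical run as the staff-line thickness and the most frequent white vertical run as the staff-space height. The function names and the synthetic test image are hypothetical and are not part of the OMR system described in these slides.

```python
from collections import Counter
from itertools import groupby


def vertical_runs(image):
    """Yield (pixel_value, run_length) for every vertical run in every column.

    `image` is a list of rows; 1 is a black pixel, 0 is a white pixel.
    """
    for column in zip(*image):
        for value, group in groupby(column):
            yield value, sum(1 for _ in group)


def estimate_staff_metrics(image):
    """Return (line_thickness, space_height) as the most frequent run lengths."""
    black, white = Counter(), Counter()
    for value, length in vertical_runs(image):
        (black if value else white)[length] += 1
    return black.most_common(1)[0][0], white.most_common(1)[0][0]


def make_staff(width=10, line=2, space=4, margin=3):
    """Build a synthetic five-line staff (illustration only)."""
    rows = [[0] * width for _ in range(margin)]
    for i in range(5):
        rows += [[1] * width for _ in range(line)]
        if i < 4:
            rows += [[0] * width for _ in range(space)]
    rows += [[0] * width for _ in range(margin)]
    return rows
```

On a real page the two estimates give a scale for everything that follows: symbols can be measured in staff-space units rather than in pixels.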
OMR: Classifier • Connected-component analysis • Feature extraction, e.g.: • Width, height, aspect ratio • Number of holes • Central moments • k-nearest neighbor (k-NN) classifier • Genetic algorithm
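The feature extraction and k-NN steps listed above can be illustrated with a small sketch. It assumes each glyph is a binary matrix already cropped to its bounding box, computes only four of the listed features (width, height, aspect ratio, number of holes; central moments are left out for brevity), and uses an unweighted majority vote. All names and labels are hypothetical and do not come from the system described in the slides.

```python
import math
from collections import Counter


def count_holes(glyph):
    """Count white regions (4-connected) that do not touch the glyph border."""
    rows, cols = len(glyph), len(glyph[0])
    seen = [[False] * cols for _ in range(rows)]
    holes = 0
    for r in range(rows):
        for c in range(cols):
            if glyph[r][c] or seen[r][c]:
                continue
            # Flood-fill this white region and note whether it reaches the border.
            stack, touches_border = [(r, c)], False
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                if y in (0, rows - 1) or x in (0, cols - 1):
                    touches_border = True
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not glyph[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if not touches_border:
                holes += 1
    return holes


def features(glyph):
    """Return [width, height, aspect ratio, number of holes] for a cropped glyph."""
    height, width = len(glyph), len(glyph[0])
    return [float(width), float(height), width / height, float(count_holes(glyph))]


def knn_classify(sample, training, k=3):
    """Label `sample` by majority vote among its k nearest training vectors."""
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

For example, a 3×3 ring of black pixels has one hole, so `features` returns `[3.0, 3.0, 1.0, 1.0]`, which separates it from a solid 3×3 block whose hole count is zero.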
Overall Architecture for OMR • Recognition: Image File → Staff removal → Segmentation → Recognition (k-NN Classifier) → Output Symbol Name • Knowledge Base: Feature Vectors, Best Weight Vector • Off-line Optimization: Genetic Algorithm, k-NN Classifier
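The off-line optimization in this architecture pairs a genetic algorithm with the k-NN classifier to find a weight vector for the features. The sketch below is one way such a loop could look, not the system's actual implementation: it assumes 1-NN with leave-one-out accuracy as the fitness function, truncation selection, one-point crossover, and per-gene random mutation. Every name, parameter value, and the toy `SAMPLES` data set are illustrative assumptions.

```python
import math
import random


def weighted_distance(a, b, weights):
    """Euclidean distance with a per-feature weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def loo_accuracy(data, weights):
    """Leave-one-out accuracy of a weighted 1-NN classifier (the GA fitness)."""
    correct = 0
    for i, (vec, label) in enumerate(data):
        others = data[:i] + data[i + 1:]
        _, predicted = min(
            others, key=lambda item: weighted_distance(vec, item[0], weights))
        correct += predicted == label
    return correct / len(data)


def evolve_weights(data, generations=20, population_size=12,
                   mutation_rate=0.2, seed=0):
    """Search for a weight vector in [0, 1]^n; needs at least two features."""
    rng = random.Random(seed)
    n = len(data[0][0])
    # Start from uniform weights plus random candidates.
    population = [[1.0] * n] + [
        [rng.random() for _ in range(n)] for _ in range(population_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda w: loo_accuracy(data, w),
                        reverse=True)
        parents = ranked[:population_size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = mother[:cut] + father[cut:]
            child = [rng.random() if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: loo_accuracy(data, w))


# Toy data: the first feature separates the classes, the second is noise.
SAMPLES = [
    ([0.0, 5.0], "a"), ([0.1, 0.0], "a"), ([0.2, 9.0], "a"),
    ([1.0, 0.2], "b"), ([1.1, 5.1], "b"), ([1.2, 9.1], "b"),
]
```

Because the best half of each generation is carried over unchanged and the uniform vector is in the initial population, the returned weights never score worse than uniform weights on the fitness function. The result would be stored as the best weight vector and used by the classifier at recognition time.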
Lester S. Levy Collection • North American sheet music (1780–1960) • Digitized 29,000 pieces • including “The Star-Spangled Banner” and “Yankee Doodle” • Database of: • text index records • images of music (8-bit gray) • lyrics (first lines of verse and chorus) • color images of cover sheets (32-bit) • http://levysheetmusic.mse.jhu.edu
Digital Workflow Management • Reduce the manual intervention for large-scale digitization projects • Creation of data repository (text, image, sound) • Optical Music Recognition (OMR) • Gamera • XML-based metadata • composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher • cross-references for various forms of names, pseudonyms • authoritative versions of names and subject terms • Music and lyric search engines • Analysis toolkit
The problem • Suitable OCR for lyrics not found • Commercial OCR systems are often inadequate for non-standard documents • The market for specialized recognition of historical documents is very small • Researchers performing document recognition often “re-invent” the basic image processing wheel
The solution • Provide easy-to-use tools that allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications • Generalize OMR for structured documents
Introducing Gamera (Generalized Algorithms and Methods for Enhancement and Restoration of Archives) • Framework for creation of structured document recognition systems • Designed for domain experts • Image processing tools (filters, binarizations, etc.) • Document segmentation and analysis • Symbol segmentation and classification • Feature extraction and selection • Classifier selection and combiners • Syntactical and semantic analysis
Features of Gamera • Portability (Unix, Windows, Mac) • Extensibility (Python and C++ plugins) • Ease of use (for domain experts and programmers) • Open source • Graphical User Interface • Interactive or batchable (scripts)
Architecture of Gamera • Graphical User Interface (wxWindows) • Scripting Environment (Python) • Plugins (Python) • Automatic Plugin Wrapper (Boost) • Plugins (C++) • GAMERA Core (C++)
Example of C++ Plugin
// Number of pixels in matrix
#include "gamera.hh"
#ifdef __area_wrap__
#define NARGS 1
#define ARG1_ONEBIT
#endif
using namespace Gamera;
template <class T>
feature_t area(T &m) {
  return feature_t(m.nrows() * m.ncols());
}
Example of Python Plugin
# This filters a list of CC objects
import gamera

def filter_wide(ccs, max_width):
    """Return the CCs no wider than max_width; wider ones are filled with 0."""
    tmp = []
    for x in ccs:
        if x.ncols() > max_width:
            x.fill_matrix(0)
        else:
            tmp.append(x)
    return tmp
Conclusions • Gamera allows rapid development of domain-specific document recognition applications • Domain experts can customize and control all aspects of the recognition process • Includes an easy-to-use interactive environment for experimentation • Available on Linux, OS X, and Windows
Projections • X-projections • Y-projections
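A projection reduces a binary image to a one-dimensional profile of black-pixel counts. The sketch below assumes that a Y-projection holds one count per row and an X-projection one count per column; the slide does not define the terms, so this naming is an assumption. The helper `staff_line_rows` and its 80% threshold are hypothetical illustrations of how a Y-projection could flag candidate staff-line rows.

```python
def y_projection(image):
    """Black-pixel count for each row (image is a list of rows, 1 = black)."""
    return [sum(row) for row in image]


def x_projection(image):
    """Black-pixel count for each column."""
    return [sum(column) for column in zip(*image)]


def staff_line_rows(image, threshold=0.8):
    """Rows whose black-pixel count is at least `threshold` of the image width."""
    width = len(image[0])
    return [y for y, count in enumerate(y_projection(image))
            if count >= threshold * width]


page = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
]
print(y_projection(page))     # [0, 4, 1, 4]
print(x_projection(page))     # [2, 3, 2, 2]
print(staff_line_rows(page))  # [1, 3]
```

Peaks in the Y-projection mark mostly black rows, which is why the profile is useful for locating staff lines on pages that are not heavily skewed.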