Multi-Media Retrieval by Paul McGlade Modified by Shinta P.
What is Multi-Media Retrieval? • The searching and retrieval of various kinds of multimedia (image, video, web). • Typically consists of a query search against a database, usually called either a digital library or a digital archive. • Generally, multimedia databases also contain textual data.
Tasks • Multimedia systems must solve at least two different tasks: • First, relevant items have to be identified. • Second, they have to be presented in such a way that the user can relate them to each other and, what is often more complicated, to the query.
Problems • Comparing multimedia data is more difficult than comparing textual data. • Different types of querying raise different types of problems. • The relevance of each aspect of the multimedia data must be weighted.
Approaches / Solutions • Different approaches are explored for the comparison process: • Text-based • Region-based • Object-based • Various solutions have been created: • Query formulation • MRML • Image indexing
Text-based • Indexes images using keywords or descriptions. • Advantages: • Easier to design and implement. • Can use the surrounding text in a web page. • Disadvantages: • Often too expensive. • A picture can sometimes require many words. • Surrounding text may not describe the picture.
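As a minimal sketch of keyword indexing, the snippet below builds an inverted index from image descriptions and answers a query by intersecting the sets of matching images. The file names, descriptions, and the function names `build_index` and `search` are invented for illustration; no real system is being quoted.

```python
from collections import defaultdict


def build_index(captions):
    """Map each keyword to the set of image ids whose description contains it."""
    index = defaultdict(set)
    for image_id, text in captions.items():
        for word in text.lower().split():
            index[word].add(image_id)
    return index


def search(index, query):
    """Return the ids of images whose descriptions contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical image ids and descriptions, made up for this example.
captions = {
    "img1.jpg": "a dog near a person",
    "img2.jpg": "a bike in Yosemite at sunset",
    "img3.jpg": "a person riding a bike",
}
index = build_index(captions)
print(sorted(search(index, "person bike")))
```

The sketch also shows the weakness listed above: an image is only found if someone (or the surrounding page text) happened to use the query words.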
Region-based • Queries images using regions of the image. • Advantages: • Handles low-level queries. • Many features can be extracted. • Disadvantages: • Cannot handle high-level queries.
Region-based • [Figure: two example images, labeled "Good" and "Bad"]
Object-based • Extracts objects from images first. • Advantages: • Handles object-based queries. • Reduces feature storage adaptively. • Disadvantages: • Object segmentation is very difficult. • The user interface is complicated and not easily implemented.
Blobworld • Blobworld is a system for content-based image retrieval. • By automatically segmenting each image into regions which roughly correspond to objects or parts of objects, we allow users to query for photographs based on the objects they contain. • Blobworld Site
Query Formulation • Formulates a query for comparison against a database. • Query Formula example: • SIMILARITY: look similar • OBJECT: contains a bike • OBJECT RELATIONSHIP: contains a dog near a person • MOOD: a happy picture • TIME/PLACE: Yosemite sunset
MRML • Multimedia Retrieval Markup Language • MRML’s goal is to unify access to multimedia retrieval. • XML-based communication protocol. • Specified to standardize access to Multimedia Retrieval software components.
MRML (cont’d) • Code example:
<property id="p1" type="subset" caption="Weighting function" visibility="visible" sendtype="attribute" sendname="cui-weighting-function" minsubsetsize="1" maxsubsetsize="1">
  <property id="p2" type="setelement" caption="Best fully weighted" visibility="visible" sendtype="value" sendvalue="best-fully" defaultstate="selected" />
  <property id="p3" type="setelement" caption="Classical IDF" visibility="visible" sendtype="value" sendvalue="classical-idf" defaultstate="unselected" />
</property>
<mrml>
  <get-server-properties />
</mrml>
<mrml>
  <get-algorithms collection-id="collection-1" />
</mrml>
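The two `<mrml>` request messages in the code example can be generated programmatically. The sketch below uses Python's standard `xml.etree.ElementTree`. The element and attribute names come from the slide, but the helper `mrml_message` is a hypothetical name, and how the message is transported to a server is not covered here.

```python
import xml.etree.ElementTree as ET


def mrml_message(child_tag, **attributes):
    """Wrap a single request element in an <mrml> root and return it as text.

    Hypothetical helper: keyword names use underscores, which are converted
    to the hyphenated attribute names seen in the slide's example.
    """
    root = ET.Element("mrml")
    attrs = {name.replace("_", "-"): value for name, value in attributes.items()}
    ET.SubElement(root, child_tag, attrs)
    return ET.tostring(root, encoding="unicode")


server_props = mrml_message("get-server-properties")
algorithms = mrml_message("get-algorithms", collection_id="collection-1")
print(server_props)
print(algorithms)
```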
GIFT • GNU Image-Finding Tool is a Content Based Image Retrieval System (CBIRS). • Uses MRML. • Enables the user to query by example on images. • Relies purely on the content of the image. • GIFT Site
Image Indexing • Process which analyzes an image and selects aspects of the image to compare in order to index the image with little user input. • Segments the image into various regions, and attaches words to each region.
Image Indexing (cont’d) • Example 1: • Computer Predictions - male cloth female fashion environment people industry fire face man man-made • Manual Category Annotation - super model people female cloth • Example 2: • Computer Predictions - grass mare tiger horses cat buildings • Manual Category Annotation - cat grass tiger
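One simple way to compare computer predictions with a manual annotation is to measure their word overlap. The sketch below does this for the "grass mare tiger horses cat buildings" prediction and its manual annotation "cat grass tiger" from the slide. Using precision and recall here is an assumption made for illustration, not something the slide itself proposes, and `precision_recall` is a hypothetical name.

```python
def precision_recall(predicted, manual):
    """Overlap between predicted words and manually assigned words.

    precision: fraction of predicted words that a human also assigned
    recall:    fraction of human-assigned words that were predicted
    """
    predicted, manual = set(predicted), set(manual)
    hits = predicted & manual
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(manual) if manual else 0.0
    return precision, recall


# Word lists taken from the slide above.
predicted = "grass mare tiger horses cat buildings".split()
manual = "cat grass tiger".split()

precision, recall = precision_recall(predicted, manual)
print(precision, recall)
```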
A-Lip • The Automatic Linguistic Indexing of Pictures system selects among 600 trained concepts to annotate images automatically. • An online, real-time image annotation demonstration is expected to be developed and made available later this year. • When it is released, users will be able to submit their own images for automatic annotation. • A-Lip Site
High-Level Tools • Some technical approaches to image comparison: • Wavelet comparisons. • Fast Image Segmentation. • IRM (Integrated Region Matching). • Fuzzy Matching.
SIMPLIcity • Semantics-sensitive Integrated Matching for Picture Libraries. • Combines low-level statistical semantic classification with image retrieval. • Wavelet-based feature extraction for fast segmentation. • Integrated Region Matching (IRM). • SIMPLIcity Site
Why Is Image Retrieval Difficult? • Text Retrieval • A word is a unit and is easy to index. • Words carry semantic meaning. • Image Retrieval • The unit is the pixel, which is hard to index. • A pixel has no meaning on its own. • Pixels form patterns that represent objects, which makes segmentation difficult. • The objects in an image depend on many factors.
Why Is Image Retrieval Difficult? (cont’d) • Image Retrieval • The objects in an image depend on many factors: • Viewpoint • Illumination • Shadows • Other complications (background, color variation, etc.)
Image Matching (Global Similarity) • Color histogram • Texture characteristics (region)
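To make the color histogram idea concrete, the sketch below quantizes RGB pixels into a small joint histogram and compares two images by histogram intersection. The bin count, the choice of histogram intersection as the measure, and the tiny synthetic "images" (plain lists of pixels) are all assumptions made for illustration.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of (r, g, b) pixels with values 0-255."""
    hist = [0.0] * bins_per_channel ** 3
    count = 0
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bin 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        count += 1
    if count:
        hist = [h / count for h in hist]
    return hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Synthetic four-pixel "images", invented for this example.
red_image = [(250, 10, 10)] * 4
mostly_red = [(250, 10, 10)] * 3 + [(10, 10, 250)]
blue_image = [(10, 10, 250)] * 4

red_hist = color_histogram(red_image)
print(histogram_intersection(red_hist, color_histogram(mostly_red)))
print(histogram_intersection(red_hist, color_histogram(blue_image)))
```

Because the histogram discards where each pixel sits in the image, this is a global measure: two images with the same colors in different arrangements score as identical.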
Image Matching (Local Similarity) • Query by example • Object segmentation • Matching • Caption text • Similarity (color, texture, shape) • Spatial arrangement (orientation, position) • Special techniques (e.g. face recognition)
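Once separate similarity scores exist for color, texture, and shape, they have to be combined into one ranking. A minimal sketch, assuming each score is already in [0, 1] and that a weighted average is an acceptable combination rule; the weights, scores, and file names below are made up.

```python
def combined_similarity(scores, weights):
    """Weighted average of per-feature similarity scores in [0, 1]."""
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * value for name, value in scores.items()) / total


# Hypothetical weights and per-feature scores against one query image.
weights = {"color": 0.5, "texture": 0.3, "shape": 0.2}
candidates = {
    "a.jpg": {"color": 0.9, "texture": 0.2, "shape": 0.4},
    "b.jpg": {"color": 0.6, "texture": 0.8, "shape": 0.7},
}

ranked = sorted(
    candidates,
    key=lambda name: combined_similarity(candidates[name], weights),
    reverse=True,
)
print(ranked)
```

Changing the weights changes the ranking, which is why the choice of weighting matters so much for the relevance of the results.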
Conclusion • Since one query will return many false results, I believe more emphasis should be placed on the weighting of certain aspects of each image. • Some ideas: • Artistic tendencies could be taken into account when determining the relevance of an object in an image. • A textual comparison of an image's indexed words could help determine how commonly certain objects are found together.
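The last idea can be sketched as a co-occurrence count over the words attached to each indexed image. The annotations below are hypothetical (the words are borrowed from the indexing examples, but their assignment to images is invented), and `cooccurrence_counts` is an illustrative name.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(annotations):
    """Count how many images each unordered pair of index words shares."""
    counts = Counter()
    for words in annotations.values():
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    return counts


# Hypothetical indexed words per image.
annotations = {
    "img1": ["cat", "grass", "tiger"],
    "img2": ["grass", "horses", "mare"],
    "img3": ["cat", "grass"],
}

counts = cooccurrence_counts(annotations)
print(counts[("cat", "grass")])
print(counts.most_common(3))
```

Pairs with high counts are objects that tend to appear together, which a retrieval system could use as one input when weighting how relevant a detected object is.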