1 / 67

Multimedia Signal Processing & Content-Based Image Retrieval

Multimedia Signal Processing & Content-Based Image Retrieval. Anastasios N. Venetsanopoulos University of Toronto Contact: anv@dsp.toronto.edu http://www.dsp.toronto.edu http://www.ece.toronto.edu. OUTLINE. INTRODUCTION MULTIMEDIA APPLICATIONS IMPACT OF MULTIMEDIA

Download Presentation

Multimedia Signal Processing & Content-Based Image Retrieval

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Multimedia Signal Processing & Content-Based Image Retrieval Anastasios N. Venetsanopoulos University of Toronto Contact: anv@dsp.toronto.edu http://www.dsp.toronto.edu http://www.ece.toronto.edu

  2. OUTLINE • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES

  3. INTRODUCTION-1 • WHAT IS MULTIMEDIA? • WHAT IS MULTIMEDIA PROCESSING? • GOALS OF MULTIMEDIA PROCESSING

  4. INTRODUCTION-2 WHAT IS MULTIMEDIA? • DIFFICULT TO DEFINE • GENERALLY CONSISTS OF: • MULTIMEDIA DATA • INTERACTION SET • MULTIMEDIA DATA: • MULTI-SOURCE, MULTI-TYPE, MULTI-FORMAT • INTERACTION SET: • WITHOUT INTERACTIONS BETWEEN MULTIMEDIA COMPONENTS, MULTIMEDIA IS MERELY A COLLECTION OF DATA

  5. INTRODUCTION-3 EXAMPLE: AUGMENTED REALITY CONFERENCE REAL OBJECTS VIRTUAL OBJECTS REAL SPEECH } Mutimedia Data Components COMPLEX INTERACTIONS BETWEEN COMPONENTS IN THE SCENE MAKE VIRTUAL COMPONENTS SEEM MORE REALISTIC

  6. INTRODUCTION-4 WHAT IS MULTIMEDIA PROCESSING? • MULTIMEDIA PROCESSING • APPLY SIGNAL PROCESSING TOOLS TO MULTIMEDIA DATA TO ENABLE: • REPRESENTATION • INTERPRETATION • ENCODING • DECODING

  7. INTRODUCTION-5 GOALS OF MULTIMEDIA PROCESSING • EFFECTIVE & EFFICIENT • ACCESS • MANIPULATION • EXCHANGE • STORAGE OF MULTIMEDIA CONTENT

  8. CONTINUING… • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES

  9. MULTIMEDIA APPLICATIONS-1 SCALABLE VIDEO STREAMING GPS NAVIGATION

  10. MULTIMEDIA APPLICATIONS-2 TELEPRESENCE CELLULAR E-COMMERCE

  11. MULTIMEDIA APPLICATIONS-3 • MORE SPECIFIC EXAMPLES • MULTIMEDIA APPLICATION GOALS • IMPROVE INTERPERSONAL COMMUNICATION • PROMOTE UNDERSTANDING OF IDEAS • ALLOW INTERACTIVITY WITH MEDIA • INCREASE ACCESSIBILITY TO DATA • MPEG-4, 7, 21 • JPEG-2000 • MP3 & PERCEPTUAL CODING • MULTIMEDIA STORAGE • VIDEO-ON-DEMAND • DIGITAL CINEMA • AUTHENTICATION

  12. GOING ON… • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES

  13. IMPACT OF MULTIMEDIA-4 WORLD INTERNET USAGE (July 23, 2005)

  14. IMPACT OF MULTIMEDIA-2 • USERS (S0CIETY) DEMAND • INCREASED MOBILITY • EASE-OF-USE • PERSONALCUSTOMIZATION • DEVICE FLEXIBILITY • HIGH LEVEL OF COLLABORATION WITH PEERS • DEVICES MUTATE AND BECOME • MULTI-FUNCTIONAL,NOT SPECIALIZED • EFFORTLESSLY PORTABLE, NOT STATIONARY • UBIQUITOUSLY NETWORKED, NOT ISOLATED

  15. IMPACT OF MULTIMEDIA-3 • MULTI-FUNCTIONAL DEVICES MUST • BROWSE INTERNET • ENTERTAIN • BE EASY-TO-USE • CUSTOMIZATION • PERSONALIZATION (THEMES, PREFERENCES) • NETWORKED • CAPABLE OF CONNECTING TO MANY DIFFERENT NETWORKS (INTERNET, P.O.T.S., LAN, CELLULAR, BLUETOOTH, 802.11b, GPS) • FACILITATE MANY TYPES OF WORKFLOW • MANAGE USER’S TIME

  16. IMPACT OF MULTIMEDIA-4 CONVERGENCE TECHNOLOGIES WHICH WERE TOTALLY UNRELATED 10 YEARS AGO ARE NOW UNIFIED UNDER THE CONCEPT OF MULTIMEDIA

  17. IMPACT OF MULTIMEDIA-5 • EXAMPLE: CELLULAR PHONES PRIMARY CONSUMER USE: • WIRELESS TELEPHONY CONVERGED USES • PERSONAL ORGANIZER • INTERNET BROWSER/EMAIL • ENTERTAINMENT (MP3, RADIO) • PAGER/MESSAGING (SMS) • VIDEO/STILL CAMERA

  18. IMPACT OF MULTIMEDIA-6 OVERALL • DEMANDS • FUNCTIONALITY • CONSUMPTION OF MANY MEDIA TYPES • CONNECTIVITY • PORTABILITY, ETC. • RESULT • HIGHLY COMPLEX DEVICES • PUSH TOWARDS DENSE CIRCUITRY • MULTIMEDIA DEVICES BECOME UBIQUITOUS • DEVICES GENERATE MULTIMEDIA DATA (INCLUDING IMAGES, VIDEO, AUDIO)

  19. MOVING ALONG… • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES

  20. CBIROVERVIEW • MOTIVATION & GOALS • WHAT IS CBIR? • CONTRIBUTING DISCIPLINES • APPLICATION SCENARIOS • SOME SPECIFIC ISSUES • TYPICAL CAPABILITIES

  21. CBIRMEDIA FLOODING EXAMPLE: GENERAL PHOTOGRAPHY • RESULT: DIGITAL MEDIA FLOOD • HOW DO WE COPE, TRACK, ORGANIZE IT ALL? • POLAROID FILED FOR BANKRUPTCY • HAS DIGITAL KILLED FILM? IF SO, WHY? • MEMORY REUSABLE • PRINTER TECHNOLOGY • SNAPSHOT PREVIEWS • EFFECTS & PROCESSING • CHEAP & DENSE STORAGE • EASY SHARING VIA INTERNET

  22. CBIRMOTIVATION • DEVICE FUNCTION CONVERGENCE • DATA RAPIDLY GENERATED BY MANY DEVICES • INTERNET ACTS AS GLOBAL TRANSPORT • DATA CONSUMED BY DEVICES ON DEMAND • MULTIMEDIA DATA NEEDS TO BE • EFFICIENTLY STORED • INDEXED ACCURATELY • EASILY RETRIEVED

  23. CBIRIS… • CONTENT BASED IMAGE RETRIEVAL • PART OF MULTIMEDIA INDEXING • IMAGES (2-D SPACE-DEPENDENT SIGNALS) • VIDEO (TIME-VARYING IMAGE SET) • AUDIO (1-D TIME-DEPENDENT SIGNALS) • TEXT (e.g. BOOK INDEX, SEARCH ENGINES) • COMPUTER BASED • HIGHLY AUTOMATED • DIFFICULT TO DO PROPERLY

  24. QUERY IMAGE RETRIEVAL RESULTS BASED ON COLOR CONTENT CBIRSIMPLE EXAMPLE • FOR A GIVEN QUERY… • EXAMPLE IMAGE • ROUGH SKETCH • EXPLICIT DESCRIPTION CRITERIA …RETURN ALL ‘SIMILAR’ IMAGES RETRIEVAL SYSTEM

  25. CBIRQUERY TYPES COLOR SKETCH SHAPE EXAMPLE MORE COMPLEX TYPES EXIST YET ABOVE ARE MOST FUNDAMENTAL & MOST REGULARLY USED TEXTURE

  26. CBIRCONTRIBUTORS • COMBINES HIGH-TECH ELEMENTS • MULTIMEDIA/SIGNAL/IMAGE PROCESSING • COMPUTER VISION/PATTERN RECOGNITION • COMPUTER SCIENCES (I.E. HUMAN-COMPUTER INTERACTION) • AND MORE TRADITIONAL CONCEPTS • PSYCHOLOGY/HUMAN PERCEPTION • INFORMATION SCIENCES (I.E. LIBRARY)

  27. CBIRSCENARIOS • SOME CBIR APPLICATION AREAS • MEDICAL IMAGING ART/CULTURAL HERITAGE • a • DESIGN/VISUAL ARTS • a ENTERTAINMENT (FILM, TV) • INDUSTRY (LOGO MANAGEMENT) • a GOVERNMENT (E.G. MUGSHOTS)

  28. CBIRVERSUS TEXT • IMPORTANT QUESTION ARISES: “WHY NOT SIMPLY INDEX USING TEXT?” (YAHOO! HAS HAD SOME SUCCESS WITH THIS) • INTUITIVE, YET USING TEXT IS • SIMPLE BUT SIMPLISTIC • TIME CONSUMING – CAN’T AUTOMATE • HIGHLY SUBJECTIVE & USER-DEPENDENT • SUSCEPTIBLE TO TRANSLATION PROBLEMS

  29. CBIRBASIC STRUCTURE • 3 BASIC FEATURES • COLOR, TEXTURE, SHAPE • MANY DESCRIPTORS • MPEG-7 IS ISO STANDARD • REALLY A DESIGN CHOICE • SIMILARITY • OPEN TO RESEARCH • LITTLE PERCEPTUAL CONSIDERATION FEATURE EXTRACTION FEATURE DESCRIPTIONS I N D E X SIMILARITYCALCULATION GENERATION OF RESULTS SIMILAR RESULTS USER INTERFACE QUERY

  30. CBIR(DIS)SIMILARITY? SIMILARITY IS NOT SO SIMPLE • ON WHAT BASIS ARE THEY SIMILAR? • COLOR CONTENT? • SHAPE CONTENT? • HIGH LEVEL IDEAS (‘MASKS’, ‘GENDER’)? • PERCEPTION IS ALWAYS AN ISSUE • CONSIDER THREE IMAGES

  31. CBIRSIMILARITY • DOMAIN [0,1] • CAN BE CALCULATED MANY WAYS • GENERALIZED MINKOWSKI • CANBERRA • PERCEPTUAL MEASURE

  32. CBIRTYPICAL ABILITIES • EFFECTIVE QUERIES IN • COLOR, TEXTURE, SHAPE • SIMPLE HYBRID QUERIES • DESCRIPTOR SUPERVECTORS • WEIGHTED AVERAGE OF (DIS)SIMILARITIES • RELEVANCE FEEDBACK • USER PLACED IN LOOP GIVES BETTER RESULTS • STATISTICAL APPROACHES • APPLY/ADJUST FEATURE WEIGHTS TO RELEVANT/IRRELEVANT ELEMENTS

  33. CBIRSUMMARY • BORN FROM MULTIMEDIA FLOOD • TEXT TOO SIMPLE AND LABORIOUS • SYSTEMS WORK DECENTLY IN VITRO • QUERY BY SHAPE, COLOR, TEXTURE, EXAMPLE • SHORTCOMINGS • NEED RELEVANCE FEEDBACK & PERCEPTUAL • HYBRID QUERIES DIFFICULT TO CREATE • SEMANTIC GAP NEEDS TO BE BRIDGED • MPEG-7: IMPORTANT DEVELOPMENT

  34. GOING FORWARD… • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES

  35. MPEG • MOTION PICTURES EXPERT GROUP • MPEG-1 • MPEG-2 • MPEG-4 • MPEG-7: ISO/IEC 15938 • MULTIMEDIA CONTENT DESCRIPTION INTERFACE • MPEG-21

  36. MPEG-1 & MPEG-2 • MPEG-1 (c. 1992) • BASIC VIDEO CODING USING DPCM & DCT • TARGET: CD-BASED VIDEO & MULTIMEDIA • USE I, B & P-FRAMES IN YUV SPACE • MPEG-2 (c. 1994) • SUPERSET OF MPEG-1 • GOAL: DTV/DSS OR ATM TRANSPORT • MINIMUM OF NTSC/PAL QUALITY • MORE ERROR RESILIENT • SCALABLE – GRACEFUL DEGRADATION

  37. MPEG-4 & MPEG-21 • MPEG-4 (c. 1998) • TOOLS TO AUTHOR MULTIMEDIA CONTENT • TRAFFIC AWARE, ERROR RESILIENT • OBJECT-BASED CODING • VERY EFFICIENT FOR LOW BIT-RATES • MPEG-21 (STARTED JUNE 2000) • AN OPEN “MULTIMEDIA FRAMEWORK” IDEA • ADDRESSES DIGITAL RIGHTS MANAGEMENT • ENHANCED DELIVERY & ACCESS OF DATA FOR DEVICES ON HETEROGENEOUS NETWORKS

  38. MPEG-7NEW PARADIGM • UNLIKE MPEG-1, MPEG-2, & MPEG-4 • DOESN’T REPRESENT CONTENT ITSELF • MPEG-7 ONLY DESCRIBES CONTENT • DIFFICULT CONCEPT FOR SOME TO GRASP • APPLICABLE TO • IMAGES • VIDEO • INDEPENDENT OF • STORAGE • ARCHITECTURE • AUDIO & SPEECH • TEXT • TRANSPORT • CODING

  39. MPEG-7HOW IT DIFFERS • MPEG-1 • TAKES INPUT FRAMES AND REPRESENTS AS AN BINARY ENCODED VIDEO BITSTREAM • MPEG-7 • TAKES VIDEO FRAMES (SAY MPEG-1 FORMAT) AND DESCRIBES CONTENTS OF EACH FRAME. 1001010110101001001110010110101011110... 1010010110101001001010101001010100110... 1111010110010100000101001111111110001... FRAME 1: COLOR CONTENT: 20% WHITE, 14% BLUE, SHAPES: BRIDGE, etc. FRAME 2: COLOR CONTENT: 20% WHITE, 15% BLUE, SHAPES: BRIDGE, etc. FRAME 3: COLOR CONTENT: 21% WHITE, 14% BLUE, SHAPES: BRIDGE, etc.

  40. MPEG-7SCOPE MULTIMEDIA DATA MPEG-7 SCOPE FEATURE EXTRACTION ALGORITHM CONTENT DESCRIPTION CODING SCHEME OTHER ELEMENTS . . .

  41. MPEG-7GOALS • DESCRIBE MULTIMEDIA CONTENT • SET OF DESCRIPTORS (D) • RELATIONS BETWEEN DESCRIPTORS • SET OF DESCRIPTION SCHEMES (DS) • LANGUAGE DEFINING D’s & DS’s • DESCRIPTION DEFINITION LANGUAGE (DDL) • BASED ON XML (eXtensible Markup Language) • USED TO BUILD UP NEW D’s & DS’s • ENCODING OF D’s FOR EFFICIENCY

  42. MPEG-7SUMMARY-1 • STANDARDIZED DESCRIPTIONS • APPLIES TO ALL DIGITAL MEDIA • CBIR IS CASE FOR STILL IMAGES • DOES NOT REPRESENT DATA ITSELF • DESCRIBES WHAT DATA REPRESENTS • SETS THE BAR FOR SYSTEMS • MULTIMEDIA/IMAGE RETRIEVAL SYSTEMS NEED AT LEAST MPEG-7 CONFORMANCE

  43. MPEG-7SUMMARY-2 • DOES NOT ADDRESS • SIMILARITY • RELEVANCE FEEDBACK • FEATURE EXTRACTION • HYBRID QUERY GENERATION • ARCHIVE ORGANIZATION • THE ABOVE ISSUES HAVE BEEN PURPOSEFULLY LEFT OPEN FOR INNOVATION

  44. FORGING AHEAD… • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES

  45. RESEARCH ISSUES • SHORTCOMINGS OF CBIR SYSTEMS • ONGOING RESEARCH • RELEVANCE FEEDBACK • HYBRID QUERY GENERATION • DISTRIBUTED MULTIMEDIA INDEXING • OPEN RESEARCH AVENUES

  46. CBIRSHORTCOMINGS-1 • COLOR • USUALLY GLOBAL • HIGH DIMENSIONALITY • GAMMA NONLINEARITIES CAUSE PROBLEMS • SHAPE • COMPLICATED & DIFFICULT • OCCLUSION ISSUES DURING EXTRACTION • TEXTURE • COMPLICATED & UNINTUITIVE • USER-SYSTEM RIFT FOR QUERY CREATION

  47. CBIRSHORTCOMINGS-2 • PERCEPTUAL ISSUES • SUBTLE DIFFERENCES BETWEEN VIEWERS • COLOR-BLIND USERS • SIMILARITY MEASURES • NEED TO BE TUNED TO DESCRIPTORS • e.g. EUCLIDEAN DISTANCE NOT APPLICABLE IN NON-EUCLIDEAN DESCRIPTION SPACE • RELEVANCE FEEDBACK • PERFORMED AT GLOBAL (IMAGE) LEVEL • NEED TO ADDRESS SPECIFIC IMAGE ELEMENTS

  48. ONGOING RESEARCH-2 RELEVANCE FEEDBACK • ITERATIVE QUERY REFINEMENT • PLACE USER IN LOOP TO ITERATIVELY IMPROVE RETRIEVAL RATES • HIGH-DIMENSIONAL SPACE NEEDS PRUNING • EMPHASIZED FEATURE(S) MUST BE FOUND • TYPICAL APPROACHES • STATISTICAL METHODS • FEATURE WEIGHTING

  49. ONGOING RESEARCH-2 RELEVANCE FEEDBACK • FEATURE SELECTIVE INTERFACE • WHY CHOOSE IMAGES ON WHOLE? REQUIRES PROCESSING/STATS TO FIND GOOD FEATURES • USER CAN EXPLICITLY INDICATE ELEMENTS OF IMAGE WHICH ARE GOOD: NO GUESSWORK RELEVANT SHAPE EXPLICIT FEATURES TO R.F. ENGINE RELEVANT COLOR

  50. ONGOING RESEARCH-3 SIMILARITY AGGREGATION/HYBRID QUERIES • TYPICALLY USED APPROACHES • BOOLEAN (AND, OR & NOT OPERATORS) • EUCLIDEAN (MINKOWSKI W/ r=1) • WEIGHTED AVERAGE (WA) i.e. SUPERVECTORS • DISADVANTAGES • EUCLIDEAN: FCN OF DESCRIPTORS – CHANGE DESCRIPTOR, DRASTICALLY ALTER MEASURE • WA: INFLEXIBLE FOR HIGH LEVEL QUERIES, SUPERVECTORS IMPOSE CERTAIN STRUCTURE • BOOLEAN: HARD LIMITED TO LOGIC FCNs • ALL LACK PERCEPTUAL CONSIDERATIONS

More Related