670 likes | 932 Views
Multimedia Signal Processing & Content-Based Image Retrieval. Anastasios N. Venetsanopoulos University of Toronto Contact: anv@dsp.toronto.edu http://www.dsp.toronto.edu http://www.ece.toronto.edu. OUTLINE. INTRODUCTION MULTIMEDIA APPLICATIONS IMPACT OF MULTIMEDIA
E N D
Multimedia Signal Processing & Content-Based Image Retrieval Anastasios N. Venetsanopoulos University of Toronto Contact: anv@dsp.toronto.edu http://www.dsp.toronto.edu http://www.ece.toronto.edu
OUTLINE • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES
INTRODUCTION-1 • WHAT IS MULTIMEDIA? • WHAT IS MULTIMEDIA PROCESSING? • GOALS OF MULTIMEDIA PROCESSING
INTRODUCTION-2 WHAT IS MULTIMEDIA? • DIFFICULT TO DEFINE • GENERALLY CONSISTS OF: • MULTIMEDIA DATA • INTERACTION SET • MULTIMEDIA DATA: • MULTI-SOURCE, MULTI-TYPE, MULTI-FORMAT • INTERACTION SET: • WITHOUT INTERACTIONS BETWEEN MULTIMEDIA COMPONENTS, MULTIMEDIA IS MERELY A COLLECTION OF DATA
INTRODUCTION-3 EXAMPLE: AUGMENTED REALITY CONFERENCE REAL OBJECTS VIRTUAL OBJECTS REAL SPEECH } Mutimedia Data Components COMPLEX INTERACTIONS BETWEEN COMPONENTS IN THE SCENE MAKE VIRTUAL COMPONENTS SEEM MORE REALISTIC
INTRODUCTION-4 WHAT IS MULTIMEDIA PROCESSING? • MULTIMEDIA PROCESSING • APPLY SIGNAL PROCESSING TOOLS TO MULTIMEDIA DATA TO ENABLE: • REPRESENTATION • INTERPRETATION • ENCODING • DECODING
INTRODUCTION-5 GOALS OF MULTIMEDIA PROCESSING • EFFECTIVE & EFFICIENT • ACCESS • MANIPULATION • EXCHANGE • STORAGE OF MULTIMEDIA CONTENT
CONTINUING… • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES
MULTIMEDIA APPLICATIONS-1 SCALABLE VIDEO STREAMING GPS NAVIGATION
MULTIMEDIA APPLICATIONS-2 TELEPRESENCE CELLULAR E-COMMERCE
MULTIMEDIA APPLICATIONS-3 • MORE SPECIFIC EXAMPLES • MULTIMEDIA APPLICATION GOALS • IMPROVE INTERPERSONAL COMMUNICATION • PROMOTE UNDERSTANDING OF IDEAS • ALLOW INTERACTIVITY WITH MEDIA • INCREASE ACCESSIBILITY TO DATA • MPEG-4, 7, 21 • JPEG-2000 • MP3 & PERCEPTUAL CODING • MULTIMEDIA STORAGE • VIDEO-ON-DEMAND • DIGITAL CINEMA • AUTHENTICATION
GOING ON… • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES
IMPACT OF MULTIMEDIA-4 WORLD INTERNET USAGE (July 23, 2005)
IMPACT OF MULTIMEDIA-2 • USERS (S0CIETY) DEMAND • INCREASED MOBILITY • EASE-OF-USE • PERSONALCUSTOMIZATION • DEVICE FLEXIBILITY • HIGH LEVEL OF COLLABORATION WITH PEERS • DEVICES MUTATE AND BECOME • MULTI-FUNCTIONAL,NOT SPECIALIZED • EFFORTLESSLY PORTABLE, NOT STATIONARY • UBIQUITOUSLY NETWORKED, NOT ISOLATED
IMPACT OF MULTIMEDIA-3 • MULTI-FUNCTIONAL DEVICES MUST • BROWSE INTERNET • ENTERTAIN • BE EASY-TO-USE • CUSTOMIZATION • PERSONALIZATION (THEMES, PREFERENCES) • NETWORKED • CAPABLE OF CONNECTING TO MANY DIFFERENT NETWORKS (INTERNET, P.O.T.S., LAN, CELLULAR, BLUETOOTH, 802.11b, GPS) • FACILITATE MANY TYPES OF WORKFLOW • MANAGE USER’S TIME
IMPACT OF MULTIMEDIA-4 CONVERGENCE TECHNOLOGIES WHICH WERE TOTALLY UNRELATED 10 YEARS AGO ARE NOW UNIFIED UNDER THE CONCEPT OF MULTIMEDIA
IMPACT OF MULTIMEDIA-5 • EXAMPLE: CELLULAR PHONES PRIMARY CONSUMER USE: • WIRELESS TELEPHONY CONVERGED USES • PERSONAL ORGANIZER • INTERNET BROWSER/EMAIL • ENTERTAINMENT (MP3, RADIO) • PAGER/MESSAGING (SMS) • VIDEO/STILL CAMERA
IMPACT OF MULTIMEDIA-6 OVERALL • DEMANDS • FUNCTIONALITY • CONSUMPTION OF MANY MEDIA TYPES • CONNECTIVITY • PORTABILITY, ETC. • RESULT • HIGHLY COMPLEX DEVICES • PUSH TOWARDS DENSE CIRCUITRY • MULTIMEDIA DEVICES BECOME UBIQUITOUS • DEVICES GENERATE MULTIMEDIA DATA (INCLUDING IMAGES, VIDEO, AUDIO)
MOVING ALONG… • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES
CBIROVERVIEW • MOTIVATION & GOALS • WHAT IS CBIR? • CONTRIBUTING DISCIPLINES • APPLICATION SCENARIOS • SOME SPECIFIC ISSUES • TYPICAL CAPABILITIES
CBIRMEDIA FLOODING EXAMPLE: GENERAL PHOTOGRAPHY • RESULT: DIGITAL MEDIA FLOOD • HOW DO WE COPE, TRACK, ORGANIZE IT ALL? • POLAROID FILED FOR BANKRUPTCY • HAS DIGITAL KILLED FILM? IF SO, WHY? • MEMORY REUSABLE • PRINTER TECHNOLOGY • SNAPSHOT PREVIEWS • EFFECTS & PROCESSING • CHEAP & DENSE STORAGE • EASY SHARING VIA INTERNET
CBIRMOTIVATION • DEVICE FUNCTION CONVERGENCE • DATA RAPIDLY GENERATED BY MANY DEVICES • INTERNET ACTS AS GLOBAL TRANSPORT • DATA CONSUMED BY DEVICES ON DEMAND • MULTIMEDIA DATA NEEDS TO BE • EFFICIENTLY STORED • INDEXED ACCURATELY • EASILY RETRIEVED
CBIRIS… • CONTENT BASED IMAGE RETRIEVAL • PART OF MULTIMEDIA INDEXING • IMAGES (2-D SPACE-DEPENDENT SIGNALS) • VIDEO (TIME-VARYING IMAGE SET) • AUDIO (1-D TIME-DEPENDENT SIGNALS) • TEXT (e.g. BOOK INDEX, SEARCH ENGINES) • COMPUTER BASED • HIGHLY AUTOMATED • DIFFICULT TO DO PROPERLY
QUERY IMAGE RETRIEVAL RESULTS BASED ON COLOR CONTENT CBIRSIMPLE EXAMPLE • FOR A GIVEN QUERY… • EXAMPLE IMAGE • ROUGH SKETCH • EXPLICIT DESCRIPTION CRITERIA …RETURN ALL ‘SIMILAR’ IMAGES RETRIEVAL SYSTEM
CBIRQUERY TYPES COLOR SKETCH SHAPE EXAMPLE MORE COMPLEX TYPES EXIST YET ABOVE ARE MOST FUNDAMENTAL & MOST REGULARLY USED TEXTURE
CBIRCONTRIBUTORS • COMBINES HIGH-TECH ELEMENTS • MULTIMEDIA/SIGNAL/IMAGE PROCESSING • COMPUTER VISION/PATTERN RECOGNITION • COMPUTER SCIENCES (I.E. HUMAN-COMPUTER INTERACTION) • AND MORE TRADITIONAL CONCEPTS • PSYCHOLOGY/HUMAN PERCEPTION • INFORMATION SCIENCES (I.E. LIBRARY)
CBIRSCENARIOS • SOME CBIR APPLICATION AREAS • MEDICAL IMAGING ART/CULTURAL HERITAGE • a • DESIGN/VISUAL ARTS • a ENTERTAINMENT (FILM, TV) • INDUSTRY (LOGO MANAGEMENT) • a GOVERNMENT (E.G. MUGSHOTS)
CBIRVERSUS TEXT • IMPORTANT QUESTION ARISES: “WHY NOT SIMPLY INDEX USING TEXT?” (YAHOO! HAS HAD SOME SUCCESS WITH THIS) • INTUITIVE, YET USING TEXT IS • SIMPLE BUT SIMPLISTIC • TIME CONSUMING – CAN’T AUTOMATE • HIGHLY SUBJECTIVE & USER-DEPENDENT • SUSCEPTIBLE TO TRANSLATION PROBLEMS
CBIRBASIC STRUCTURE • 3 BASIC FEATURES • COLOR, TEXTURE, SHAPE • MANY DESCRIPTORS • MPEG-7 IS ISO STANDARD • REALLY A DESIGN CHOICE • SIMILARITY • OPEN TO RESEARCH • LITTLE PERCEPTUAL CONSIDERATION FEATURE EXTRACTION FEATURE DESCRIPTIONS I N D E X SIMILARITYCALCULATION GENERATION OF RESULTS SIMILAR RESULTS USER INTERFACE QUERY
CBIR(DIS)SIMILARITY? SIMILARITY IS NOT SO SIMPLE • ON WHAT BASIS ARE THEY SIMILAR? • COLOR CONTENT? • SHAPE CONTENT? • HIGH LEVEL IDEAS (‘MASKS’, ‘GENDER’)? • PERCEPTION IS ALWAYS AN ISSUE • CONSIDER THREE IMAGES
CBIRSIMILARITY • DOMAIN [0,1] • CAN BE CALCULATED MANY WAYS • GENERALIZED MINKOWSKI • CANBERRA • PERCEPTUAL MEASURE
CBIRTYPICAL ABILITIES • EFFECTIVE QUERIES IN • COLOR, TEXTURE, SHAPE • SIMPLE HYBRID QUERIES • DESCRIPTOR SUPERVECTORS • WEIGHTED AVERAGE OF (DIS)SIMILARITIES • RELEVANCE FEEDBACK • USER PLACED IN LOOP GIVES BETTER RESULTS • STATISTICAL APPROACHES • APPLY/ADJUST FEATURE WEIGHTS TO RELEVANT/IRRELEVANT ELEMENTS
CBIRSUMMARY • BORN FROM MULTIMEDIA FLOOD • TEXT TOO SIMPLE AND LABORIOUS • SYSTEMS WORK DECENTLY IN VITRO • QUERY BY SHAPE, COLOR, TEXTURE, EXAMPLE • SHORTCOMINGS • NEED RELEVANCE FEEDBACK & PERCEPTUAL • HYBRID QUERIES DIFFICULT TO CREATE • SEMANTIC GAP NEEDS TO BE BRIDGED • MPEG-7: IMPORTANT DEVELOPMENT
GOING FORWARD… • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES
MPEG • MOTION PICTURES EXPERT GROUP • MPEG-1 • MPEG-2 • MPEG-4 • MPEG-7: ISO/IEC 15938 • MULTIMEDIA CONTENT DESCRIPTION INTERFACE • MPEG-21
MPEG-1 & MPEG-2 • MPEG-1 (c. 1992) • BASIC VIDEO CODING USING DPCM & DCT • TARGET: CD-BASED VIDEO & MULTIMEDIA • USE I, B & P-FRAMES IN YUV SPACE • MPEG-2 (c. 1994) • SUPERSET OF MPEG-1 • GOAL: DTV/DSS OR ATM TRANSPORT • MINIMUM OF NTSC/PAL QUALITY • MORE ERROR RESILIENT • SCALABLE – GRACEFUL DEGRADATION
MPEG-4 & MPEG-21 • MPEG-4 (c. 1998) • TOOLS TO AUTHOR MULTIMEDIA CONTENT • TRAFFIC AWARE, ERROR RESILIENT • OBJECT-BASED CODING • VERY EFFICIENT FOR LOW BIT-RATES • MPEG-21 (STARTED JUNE 2000) • AN OPEN “MULTIMEDIA FRAMEWORK” IDEA • ADDRESSES DIGITAL RIGHTS MANAGEMENT • ENHANCED DELIVERY & ACCESS OF DATA FOR DEVICES ON HETEROGENEOUS NETWORKS
MPEG-7NEW PARADIGM • UNLIKE MPEG-1, MPEG-2, & MPEG-4 • DOESN’T REPRESENT CONTENT ITSELF • MPEG-7 ONLY DESCRIBES CONTENT • DIFFICULT CONCEPT FOR SOME TO GRASP • APPLICABLE TO • IMAGES • VIDEO • INDEPENDENT OF • STORAGE • ARCHITECTURE • AUDIO & SPEECH • TEXT • TRANSPORT • CODING
MPEG-7HOW IT DIFFERS • MPEG-1 • TAKES INPUT FRAMES AND REPRESENTS AS AN BINARY ENCODED VIDEO BITSTREAM • MPEG-7 • TAKES VIDEO FRAMES (SAY MPEG-1 FORMAT) AND DESCRIBES CONTENTS OF EACH FRAME. 1001010110101001001110010110101011110... 1010010110101001001010101001010100110... 1111010110010100000101001111111110001... FRAME 1: COLOR CONTENT: 20% WHITE, 14% BLUE, SHAPES: BRIDGE, etc. FRAME 2: COLOR CONTENT: 20% WHITE, 15% BLUE, SHAPES: BRIDGE, etc. FRAME 3: COLOR CONTENT: 21% WHITE, 14% BLUE, SHAPES: BRIDGE, etc.
MPEG-7SCOPE MULTIMEDIA DATA MPEG-7 SCOPE FEATURE EXTRACTION ALGORITHM CONTENT DESCRIPTION CODING SCHEME OTHER ELEMENTS . . .
MPEG-7GOALS • DESCRIBE MULTIMEDIA CONTENT • SET OF DESCRIPTORS (D) • RELATIONS BETWEEN DESCRIPTORS • SET OF DESCRIPTION SCHEMES (DS) • LANGUAGE DEFINING D’s & DS’s • DESCRIPTION DEFINITION LANGUAGE (DDL) • BASED ON XML (eXtensible Markup Language) • USED TO BUILD UP NEW D’s & DS’s • ENCODING OF D’s FOR EFFICIENCY
MPEG-7SUMMARY-1 • STANDARDIZED DESCRIPTIONS • APPLIES TO ALL DIGITAL MEDIA • CBIR IS CASE FOR STILL IMAGES • DOES NOT REPRESENT DATA ITSELF • DESCRIBES WHAT DATA REPRESENTS • SETS THE BAR FOR SYSTEMS • MULTIMEDIA/IMAGE RETRIEVAL SYSTEMS NEED AT LEAST MPEG-7 CONFORMANCE
MPEG-7SUMMARY-2 • DOES NOT ADDRESS • SIMILARITY • RELEVANCE FEEDBACK • FEATURE EXTRACTION • HYBRID QUERY GENERATION • ARCHIVE ORGANIZATION • THE ABOVE ISSUES HAVE BEEN PURPOSEFULLY LEFT OPEN FOR INNOVATION
FORGING AHEAD… • INTRODUCTION • MULTIMEDIA APPLICATIONS • IMPACT OF MULTIMEDIA • CONTENT-BASED IMAGE RETRIEVAL (CBIR) • MPEG-7 • RESEARCH ISSUES
RESEARCH ISSUES • SHORTCOMINGS OF CBIR SYSTEMS • ONGOING RESEARCH • RELEVANCE FEEDBACK • HYBRID QUERY GENERATION • DISTRIBUTED MULTIMEDIA INDEXING • OPEN RESEARCH AVENUES
CBIRSHORTCOMINGS-1 • COLOR • USUALLY GLOBAL • HIGH DIMENSIONALITY • GAMMA NONLINEARITIES CAUSE PROBLEMS • SHAPE • COMPLICATED & DIFFICULT • OCCLUSION ISSUES DURING EXTRACTION • TEXTURE • COMPLICATED & UNINTUITIVE • USER-SYSTEM RIFT FOR QUERY CREATION
CBIRSHORTCOMINGS-2 • PERCEPTUAL ISSUES • SUBTLE DIFFERENCES BETWEEN VIEWERS • COLOR-BLIND USERS • SIMILARITY MEASURES • NEED TO BE TUNED TO DESCRIPTORS • e.g. EUCLIDEAN DISTANCE NOT APPLICABLE IN NON-EUCLIDEAN DESCRIPTION SPACE • RELEVANCE FEEDBACK • PERFORMED AT GLOBAL (IMAGE) LEVEL • NEED TO ADDRESS SPECIFIC IMAGE ELEMENTS
ONGOING RESEARCH-2 RELEVANCE FEEDBACK • ITERATIVE QUERY REFINEMENT • PLACE USER IN LOOP TO ITERATIVELY IMPROVE RETRIEVAL RATES • HIGH-DIMENSIONAL SPACE NEEDS PRUNING • EMPHASIZED FEATURE(S) MUST BE FOUND • TYPICAL APPROACHES • STATISTICAL METHODS • FEATURE WEIGHTING
ONGOING RESEARCH-2 RELEVANCE FEEDBACK • FEATURE SELECTIVE INTERFACE • WHY CHOOSE IMAGES ON WHOLE? REQUIRES PROCESSING/STATS TO FIND GOOD FEATURES • USER CAN EXPLICITLY INDICATE ELEMENTS OF IMAGE WHICH ARE GOOD: NO GUESSWORK RELEVANT SHAPE EXPLICIT FEATURES TO R.F. ENGINE RELEVANT COLOR
ONGOING RESEARCH-3 SIMILARITY AGGREGATION/HYBRID QUERIES • TYPICALLY USED APPROACHES • BOOLEAN (AND, OR & NOT OPERATORS) • EUCLIDEAN (MINKOWSKI W/ r=1) • WEIGHTED AVERAGE (WA) i.e. SUPERVECTORS • DISADVANTAGES • EUCLIDEAN: FCN OF DESCRIPTORS – CHANGE DESCRIPTOR, DRASTICALLY ALTER MEASURE • WA: INFLEXIBLE FOR HIGH LEVEL QUERIES, SUPERVECTORS IMPOSE CERTAIN STRUCTURE • BOOLEAN: HARD LIMITED TO LOGIC FCNs • ALL LACK PERCEPTUAL CONSIDERATIONS