This research combines an efficient object localization method with image classification to improve performance on both tasks. It presents a sliding-window localization framework that combines two image representations, histograms of oriented gradients and bag-of-features histograms, and searches the candidate windows with a two-stage cascade. The proposed combination model takes the classification and localization scores as input and accounts for how detectable an object is by each modality (classification and detection), so that it handles objects detectable by both modalities as well as objects detectable by only one. The method is evaluated on the PASCAL VOC 2007 and 2008 datasets and compared with state-of-the-art methods.
Combining efficient object localization and image classification H. Harzallah, F. Jurie and C. Schmid LEAR, INRIA Grenoble, LJK
Tasks • Image classification: assigning labels to the image (Car: present; Cow: present; Bike: not present; Horse: not present; …) • Object localization: defining the location and the category of each object
Contributions • Object class localization method • Combining image classification and object localization Localization--Classification++ Localization++Classification--
Overview • Related work and datasets • Efficient object localization • Experimental results • Combining image classification and localization • Experimental results • Conclusion
Related work • Object localization • Sliding window [Dalal06] [Rowley95] • Implicit shape model [Leibe04] • SVM classifiers [Chum07] [Ferrari08] • Cascade of classifiers [Viola01] [Vedaldi09] • Context information • Combination of context sources [Divvala09] • Graphical model of events in images [Li07] • Local segmentation + global classification [Shotton08] [Heitz08]
PASCAL VOC dataset • PASCAL VOC dataset 2007 and 2008 • Two tasks : classification and localization • Fixed train/test set-up for the 20 object classes • Standard evaluation measure • Area of overlap as detection matching criterion • Average precision for performance evaluation
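The two evaluation measures named on this slide can be sketched as follows. This is a minimal illustration, not the official evaluation code: the function names, the (xmin, ymin, xmax, ymax) box layout and the 11-point interpolated form of average precision are assumptions made for the example.

```python
def overlap_ratio(box_a, box_b):
    """Area of intersection divided by area of union of two boxes.

    Boxes are (xmin, ymin, xmax, ymax); this layout is an assumption.
    """
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(scores, is_true_positive, num_positives):
    """11-point interpolated average precision over a ranked list."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    precisions, recalls = [], []
    true_positives = 0
    for rank, i in enumerate(order, start=1):
        if is_true_positive[i]:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_positives)
    total = 0.0
    for k in range(11):
        threshold = k / 10
        # Best precision reachable at a recall of at least `threshold`.
        reachable = [p for p, r in zip(precisions, recalls) if r >= threshold]
        total += max(reachable) if reachable else 0.0
    return total / 11
```

A detection would then count as correct when `overlap_ratio` with a ground-truth box exceeds a chosen threshold, and the per-class ranked lists feed `average_precision`.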
Efficient object localization • Sliding-window-based approach • Image representation • Combination of features • Extensive parameter evaluation • Robust image representation • Efficient search strategy
Image representation • Combination of 2 image representations • Histogram of Oriented Gradients (HOG) • Gradient-based features • Integral histograms • Bag of Features • SIFT features extracted densely + k-means clustering • Pyramidal representation of the sliding windows • One histogram per tile
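The integral-histogram and one-histogram-per-tile ideas can be sketched as below. This is only an illustration under assumptions: `bin_map` is a hypothetical 2D grid in which each cell already holds a quantized bin index (an orientation bin or a visual-word index), and the regular `grid` x `grid` tiling is an assumed pyramid layout, not necessarily the one used by the authors.

```python
def integral_histogram(bin_map, num_bins):
    """table[y][x] holds the histogram of the region [0, y) x [0, x)."""
    height, width = len(bin_map), len(bin_map[0])
    table = [[[0] * num_bins for _ in range(width + 1)] for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            cell = table[y + 1][x + 1]
            for b in range(num_bins):
                cell[b] = table[y][x + 1][b] + table[y + 1][x][b] - table[y][x][b]
            cell[bin_map[y][x]] += 1
    return table


def window_histogram(table, x0, y0, x1, y1):
    """Histogram of the window [x0, x1) x [y0, y1) from four table lookups."""
    return [a - b - c + d for a, b, c, d in
            zip(table[y1][x1], table[y0][x1], table[y1][x0], table[y0][x0])]


def tiled_descriptor(table, x0, y0, x1, y1, grid=2):
    """Concatenate one histogram per tile of a grid x grid split of the window."""
    descriptor = []
    for gy in range(grid):
        for gx in range(grid):
            tx0 = x0 + (x1 - x0) * gx // grid
            tx1 = x0 + (x1 - x0) * (gx + 1) // grid
            ty0 = y0 + (y1 - y0) * gy // grid
            ty1 = y0 + (y1 - y0) * (gy + 1) // grid
            descriptor.extend(window_histogram(table, tx0, ty0, tx1, ty1))
    return descriptor
```

Once the table is built, the cost of a window histogram no longer depends on the window size, which is what makes scanning a very large number of sliding windows affordable.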
Efficient search strategy • Reduce search complexity • Sliding windows: huge number of candidate windows • Cascades: pros/cons • Two-stage cascade: • Filtering classifier with a linear SVM • Low computational cost • Evaluation: capacity of rejecting negative windows • Scoring classifier with a non-linear SVM • χ² kernel with a channel combination [Zhang07] • Significant increase in performance
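The two-stage cascade can be sketched as below, assuming already-trained models. Everything here is illustrative: the tuple layouts of `linear_model` and `kernel_model`, the top-`keep` selection rule, and the kernel form exp(-Σ_c D_c / A_c) with one χ² distance per channel are assumptions for the example rather than the authors' exact implementation.

```python
import math


def chi2_distance(h1, h2):
    """Chi-square distance between two histograms (empty bin pairs are skipped)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def chi2_kernel(channels_x, channels_y, scales):
    """Multi-channel kernel: exp of minus the scaled sum of per-channel distances."""
    total = sum(chi2_distance(x, y) / s
                for x, y, s in zip(channels_x, channels_y, scales))
    return math.exp(-total)


def two_stage_scores(windows, linear_model, keep, kernel_model):
    """Filter windows with a linear score, then rescore the survivors.

    windows: list of (flat_features, channels) pairs (hypothetical layout).
    linear_model: (weights, bias).
    kernel_model: (support_vectors, signed_coefficients, bias, channel_scales).
    """
    weights, linear_bias = linear_model
    support, coefficients, kernel_bias, scales = kernel_model
    # Stage 1: cheap linear score for every candidate window.
    linear = [sum(w * x for w, x in zip(weights, flat)) + linear_bias
              for flat, _ in windows]
    survivors = sorted(range(len(windows)), key=lambda i: -linear[i])[:keep]
    # Stage 2: non-linear kernel score, computed only for the survivors.
    rescored = []
    for i in survivors:
        channels = windows[i][1]
        score = kernel_bias + sum(c * chi2_kernel(channels, sv, scales)
                                  for c, sv in zip(coefficients, support))
        rescored.append((i, score))
    return sorted(rescored, key=lambda pair: -pair[1])
```

The design point is that the expensive kernel evaluation runs on `keep` windows instead of on every candidate, so `keep` trades speed against the risk of discarding a true positive in the first stage.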
Efficiency of the two-stage localization • Performance with respect to the number of windows selected by the linear SVM (mAP on PASCAL 2007) • Sliding windows: 100k candidate windows • A small number of windows is enough after filtering
Localization performance • Mean Average Precision on all 20 classes • PASCAL 2007 dataset
Localization examples: correct localizations Bicycle Car Horse Sofa
Localization examples: false positives Bicycle Car Horse Sofa
Localization examples: missed objects Bicycle Car Horse Sofa
Combination: key points • Image classification and localization use different information • For many true positives, only one of the two has a high score • Truncated objects: hard for the detector • Small objects: fine for the detector but not for the classifier, which uses global information
Combination model • Input: classification ( Si ) and localization ( Sw ) scores • Output: probability that object is present • Suppose that classification and localization outputs are independent:
Combination model • For each modality (classification/detection): notion of detectability P(Di) for classifier and P(Dw) for detector • Encodes the ability to detect presence of the objects • Assuming that the classifier/detector outputs conditional probabilities: P(O|Di,Si) and P(O|Dw,Sw)
Combination model • P(O|Si) = P(Di) × P(O|Si, Di) + P(¬Di) × P(O|Si, ¬Di) • P(O|Sw) = P(Dw) × P(O|Sw, Dw) + P(¬Dw) × P(O|Sw, ¬Dw) • Final probability: • Handles both cases: • Object detectable by two modalities • Object detectable by only one modality
Combination model • P(O|¬Di, Si) and P(O|¬Dw, Sw): constant values • Sw = classification by localization: highest localization score • Priors P(Di) and P(Dw) are class dependent
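The per-modality probability above is a marginalization over detectability, which can be sketched numerically as follows. The function names and the example numbers are hypothetical, and the sketch assumes the score has already been mapped to a conditional probability P(O|S, D). It stops at the per-modality probabilities and does not attempt the final fusion.

```python
def image_level_localization_score(window_scores):
    """Sw: classification by localization, i.e. the highest window score."""
    return max(window_scores)


def presence_probability(p_detectable, p_given_detectable, p_given_undetectable):
    """P(O|S) = P(D) * P(O|S, D) + P(not D) * P(O|S, not D).

    p_detectable: class-dependent prior P(D) for this modality.
    p_given_detectable: P(O|S, D), assumed to come from the classifier/detector.
    p_given_undetectable: P(O|S, not D), a constant that ignores the score.
    """
    return (p_detectable * p_given_detectable
            + (1.0 - p_detectable) * p_given_undetectable)


# Hypothetical numbers for a single image and class.
s_w = image_level_localization_score([0.2, 0.7, 0.4])
p_o_given_sw = presence_probability(0.8, s_w, 0.1)
```

With a low prior P(D), the result is pulled toward the constant term, so a modality that rarely detects a class has little influence on the presence estimate for that class.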
Combination experimental setup • Image classifier: INRIA_flat classifier • SVM classifier with a χ² kernel using multiple feature channels [Zhang07] • Excellent results in PASCAL 2008 challenge • Detector: as described previously • Experimental validation on PASCAL VOC 2007 • Comparison to the state of the art on PASCAL VOC 2008
Experimental results : gain obtained Classification Localization
Experimental results: car localization • Correct but low-score localization • High classification score • Score increased after combination
Experimental results: car classification • High classification score • No localization • Score decreased after combination
Comparison to the state of the art • Based on blind evaluation on PASCAL VOC 2008 • Classification • Best on 12 classes out of 20 • Localization • Best on 11 classes out of 20
Conclusion • Efficient localization method • Successful combination of classification and localization • State of the art performance on both tasks