Addressing the Medical Image Annotation Task using visual words representation. Uri Avni, Tel Aviv University, Israel; Hayit Greenspan, Tel Aviv University, Israel; Jacob Goldberger, Bar Ilan University, Israel
Outline • Challenge description • Proposed system • Image representation • Classification • Results • Parameter optimization • Performance analysis • Conclusion
ImageCLEF 2009 medical annotation challenge: 12,677 classified x-ray images, 1,733 unknown images. Classification according to four labeling sets: • 57 classes • 116 classes • 116 IRMA codes • 196 IRMA codes
IRMA database • Noisy images • Irregular brightness, contrast • Non-uniform class distribution The IRMA group - Aachen University of Technology (RWTH), Germany
IRMA Database - samples • Great intra-class variability • Category #: 1121-230-961-700 Sagittal, Mediolateral, Left hip
IRMA Database - samples • Great inter-class similarity • Category #1121-110-500-000: overview image, posteroanterior (PA) • Category #1123-112-500-000: high beam energy, posteroanterior (PA), expiration • Category #1123-121-500-000: high beam energy, anteroposterior (AP), inspiration • Category #1121-127-500-000: overview image, anteroposterior (AP), supine
Image representation • Move from a 2D image to a vector of numbers • The representation should preserve enough information about the image content • It should be insensitive to translation, artifacts and noise • Compare and classify the compact representation [Figure: image model, a histogram over word number]
Patch extraction • Extract raw pixels from patches of fixed size • Dense sampling, ~200,000 patches per image • Normalize intensity, variance • Ignore empty patches • Sample several images – one collection with millions of patches
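The patch extraction steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `extract_patches`, the `step` parameter, and the `min_std` threshold used to decide that a patch is "empty" are assumptions; the 9x9 patch size is taken from the feature space slide.

```python
import numpy as np

def extract_patches(image, size=9, step=1, min_std=1e-3):
    """Densely sample size x size patches from a 2D grayscale image.

    Returns (patches, centers): normalized patches as rows of raw pixels,
    and the (x, y) center of each kept patch.
    min_std is a hypothetical threshold for skipping empty patches.
    """
    h, w = image.shape
    patches, centers = [], []
    for y in range(0, h - size + 1, step):
        for x in range(0, w - size + 1, step):
            p = image[y:y + size, x:x + size].astype(np.float64).ravel()
            std = p.std()
            if std < min_std:
                continue  # ignore empty (near-constant) patches
            patches.append((p - p.mean()) / std)  # zero mean, unit variance
            centers.append((x + size // 2, y + size // 2))
    return np.array(patches), np.array(centers)

# Usage on a synthetic image: a textured square on an empty background.
rng = np.random.default_rng(0)
img = np.zeros((32, 32))
img[8:24, 8:24] = rng.random((16, 16))
patches, centers = extract_patches(img, size=9, step=2)
empty_patches, _ = extract_patches(np.zeros((20, 20)))
```

With `step=1` every pixel position is sampled, which is the dense setting described on the slide; a larger step only keeps the example fast.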
Feature space description • Reduce the dimension of the collection with PCA: 9x9-pixel patches become 6 coefficients • Add position (x,y) to the features; the position weight is important • 8-dimensional feature vector
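One way to build the 8-dimensional feature vector is sketched below: 6 PCA coefficients per patch plus a weighted (x, y) position. The names `fit_pca`, `make_features` and `pos_weight`, the default weight of 1.0, and the scaling of positions to [0, 1] are assumptions for illustration; the slides say only that the position weight matters, not what value was used.

```python
import numpy as np

def fit_pca(patches, n_components=6):
    """Return the mean patch and the top principal directions (as rows)."""
    mean = patches.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:n_components]

def make_features(patches, centers, image_shape, mean, components, pos_weight=1.0):
    """Project patches on the PCA basis and append the weighted position."""
    coeffs = (patches - mean) @ components.T            # (n, 6) PCA coefficients
    h, w = image_shape
    pos = centers / np.array([w, h], dtype=np.float64)  # x, y scaled to [0, 1]
    return np.hstack([coeffs, pos_weight * pos])        # (n, 8)

# Usage with random stand-in data (500 patches of 9x9 = 81 pixels).
rng = np.random.default_rng(0)
demo_patches = rng.standard_normal((500, 81))
demo_centers = rng.integers(0, 64, size=(500, 2))
mean, components = fit_pca(demo_patches, n_components=6)
feats = make_features(demo_patches, demo_centers, (64, 64), mean, components,
                      pos_weight=0.5)
```

Raising `pos_weight` makes the dictionary words more location-specific; lowering it toward zero recovers a position-free representation.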
Build dictionary • Select k feature vectors as far apart as possible • Run k-means clustering [Figure panels: cluster centers with x,y; cluster centers]
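The two dictionary-building steps can be sketched as a farthest-point initialization followed by standard (Lloyd) k-means. This is a minimal sketch under assumptions: the greedy farthest-point rule is one common reading of "as far apart as possible", and the function names, iteration count and seed are illustrative.

```python
import numpy as np

def farthest_point_init(X, k, rng):
    """Greedily pick k rows of X, each as far as possible from those already picked."""
    centers = [X[rng.integers(len(X))]]
    d2 = ((X - centers[0]) ** 2).sum(axis=1)   # squared distance to nearest pick
    for _ in range(k - 1):
        centers.append(X[np.argmax(d2)])
        d2 = np.minimum(d2, ((X - centers[-1]) ** 2).sum(axis=1))
    return np.array(centers)

def build_dictionary(X, k, n_iter=20, seed=0):
    """Run k-means from the farthest-point initialization; return the k centers."""
    rng = np.random.default_rng(seed)
    centers = farthest_point_init(X, k, rng)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)             # assign each vector to nearest center
        for j in range(k):
            members = X[labels == j]
            if len(members):                   # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

# Usage: three well-separated 2D blobs should yield one center per blob.
rng = np.random.default_rng(1)
true_means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([m + 0.3 * rng.standard_normal((100, 2)) for m in true_means])
dictionary = build_dictionary(X, k=3)
```

The all-pairs distance matrix is fine for a toy example; for a collection with millions of patches the assignment step would need to be processed in chunks.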
Image representation • Scan the image and translate its patches into a histogram of words [Figure: dictionary and image mapped to a histogram; axes: word number, probability]
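Translating patches into a histogram of words amounts to assigning each feature vector to its nearest dictionary entry and normalizing the counts. A minimal sketch, with the function name `word_histogram` and the toy data chosen for illustration:

```python
import numpy as np

def word_histogram(features, dictionary):
    """Map each feature vector to its nearest word and return word probabilities."""
    d2 = ((features[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                              # word index per patch
    counts = np.bincount(words, minlength=len(dictionary))
    return counts / counts.sum()                           # normalize to sum to 1

# Usage: a two-word dictionary and four feature vectors.
toy_dictionary = np.array([[0.0, 0.0], [10.0, 10.0]])
toy_features = np.array([[0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [0.5, 0.5]])
hist = word_histogram(toy_features, toy_dictionary)  # three patches map to word 0, one to word 1
```

The resulting vector has one entry per dictionary word, so its length is fixed regardless of how many patches the image produced.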
Image representation • Use multiple scales
Classification • Examine a k-NN classifier with different distance metrics • One-vs-one multiclass SVM classifier, with n(n-1)/2 binary classifiers • Examine several SVM kernels: • Radial basis function • Chi-square • Histogram intersection
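The three kernels and the one-vs-one pairing can be written down compactly. The sketch below assumes common textbook forms: in particular, the exponential chi-square kernel shown here is only one of several variants in use, the slides do not state which form or which `gamma` values were used, and all function names are illustrative.

```python
import numpy as np
from itertools import combinations

def rbf_kernel(p, q, gamma=1.0):
    """Radial basis function kernel on two histograms."""
    return np.exp(-gamma * np.sum((p - q) ** 2))

def chi_square_kernel(p, q, gamma=1.0, eps=1e-12):
    """Exponential chi-square kernel (one common variant; an assumption here)."""
    chi2 = np.sum((p - q) ** 2 / (p + q + eps))
    return np.exp(-gamma * chi2)

def histogram_intersection_kernel(p, q):
    """Sum of bin-wise minima; equals 1 for identical normalized histograms."""
    return np.sum(np.minimum(p, q))

def one_vs_one_pairs(n_classes):
    """All class pairs, one binary classifier each: n(n-1)/2 in total."""
    return list(combinations(range(n_classes), 2))

# Usage: kernel values for two small histograms, and the classifier count
# for the 57-class labeling set.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])
k_rbf = rbf_kernel(p, q)
k_chi2 = chi_square_kernel(p, q)
k_hi = histogram_intersection_kernel(p, q)
n_binary = len(one_vs_one_pairs(57))
```

For the 57-class set this gives 57·56/2 = 1596 binary classifiers, and 6670 for a 116-class set, which is why training cost grows quickly with the number of labels.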
Outline • Our objective • Proposed system • Image representation • Retrieval & classification • Results • Parameters optimization • Performance analysis • Conclusion and future work
Selecting classifier type • Effect of the histogram distance metric in a k-nearest neighbors classifier vs. an SVM classifier [Figure legend: symmetric Kullback-Leibler divergence, Jeffrey divergence, SVM]
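For reference, the two histogram distances named in the comparison can be sketched with a simple k-NN classifier on top. The definitions below are common ones and are assumptions about what was evaluated: symmetric KL is taken as KL(p‖q) + KL(q‖p) (some texts halve it), and the Jeffrey divergence is taken in the form that compares each histogram to their mean. The `eps` smoothing and all function names are illustrative.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """Kullback-Leibler divergence, with eps added to avoid log(0)."""
    p, q = p + eps, q + eps
    return np.sum(p * np.log(p / q))

def symmetric_kl(p, q):
    """KL(p||q) + KL(q||p)."""
    return kl(p, q) + kl(q, p)

def jeffrey_divergence(p, q, eps=1e-12):
    """Divergence of p and q from their mean histogram m = (p + q) / 2."""
    p, q = p + eps, q + eps
    m = (p + q) / 2.0
    return np.sum(p * np.log(p / m) + q * np.log(q / m))

def knn_predict(query, train_hists, train_labels, distance, k=1):
    """Majority label among the k training histograms nearest to query."""
    d = np.array([distance(query, h) for h in train_hists])
    nearest = np.argsort(d)[:k]
    labels, counts = np.unique(train_labels[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Usage: two training histograms with different labels and one query.
train_hists = np.array([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]])
train_labels = np.array([0, 1])
query = np.array([0.7, 0.2, 0.1])
pred_skl = knn_predict(query, train_hists, train_labels, symmetric_kl)
pred_jd = knn_predict(query, train_hists, train_labels, jeffrey_divergence)
```

Both distances are symmetric and zero for identical histograms; the Jeffrey form stays bounded when one histogram has empty bins, which is convenient for sparse word histograms.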
Selecting feature space • Effect of parameters on classification accuracy, using 20 cross-validation experiments [Figure legend: with x,y; no x,y]
Selecting features Selecting type of features - invariance / discriminative power tradeoff * Scale and rotation invariance are not desired
Running time • 12,677 images • Running on an Intel dual quad-core Xeon, 2.33 GHz
Selecting dictionary Using multiple dictionaries for 3 scales increases classification accuracy by 0.5%
Classification results – effect of kernel Effect of kernel function on SVM classifier, for optimal kernel parameters
Classification results – confusion matrix • Confusion matrix of 2000 random test images (2007 labels) • 91.95% correct
Submission to ImageCLEF 2009 medical annotation task • One run submitted • Use the same classifier for the 4 label sets (2005, 2006, 2007, 2008) • Ignore IRMA code hierarchy • Don’t use wildcards
Conclusion & future work • Using visual words with simple features and dense sampling is efficient and accurate in general x-ray annotation • We are applying the system to pathology classification of chest x-rays, together with Sheba Medical Center [Figure: example chest x-rays labeled Healthy, Enlarged heart, Lung filtrate, Left + right effusion]