A Study of Learning a Merge Model for Multilingual Information Retrieval Presenter: Cheng-Hui Chen Authors: Ming-Feng Tsai, Yu-Ting Wang, Hsin-Hsi Chen SIGIR 2008
Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation • In multilingual information retrieval (MLIR), the merged result list usually includes many irrelevant documents. • Traditional merging methods for MLIR assume that relevant documents are homogeneously distributed over the monolingual result lists.
Objectives • Merge the result lists of different collections, which vary in translation and retrieval quality, into a single result list. • Propose a merge method that does not assume relevant documents are homogeneously distributed over the monolingual result lists. • Improve the quality of the merge model.
Methodology • Traditional MLIR merging methods • Raw-score • Round-robin • Normalized-by-top1 • Normalized-by-topk • The proposed learning-based method • FRank
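The traditional merging strategies listed above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes each monolingual result list is a best-first list of (doc_id, score) pairs with positive scores, and it treats normalized-by-topk as division by the mean of the top-k scores, with k=1 giving normalized-by-top1. All function names are hypothetical.

```python
from itertools import zip_longest


def raw_score_merge(result_lists):
    """Pool all documents and sort by their original retrieval scores."""
    pooled = [pair for results in result_lists for pair in results]
    return sorted(pooled, key=lambda pair: pair[1], reverse=True)


def round_robin_merge(result_lists):
    """Take one document from each list in turn, ignoring the scores."""
    merged = []
    for tier in zip_longest(*result_lists):
        merged.extend(pair for pair in tier if pair is not None)
    return merged


def normalized_by_topk_merge(result_lists, k=1):
    """Divide each score by the mean of its list's top-k scores, then pool.

    Assumes positive scores; k=1 corresponds to normalized-by-top1.
    """
    pooled = []
    for results in result_lists:
        if not results:
            continue
        top = [score for _, score in results[:k]]
        norm = sum(top) / len(top)
        pooled.extend((doc, score / norm) for doc, score in results)
    return sorted(pooled, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy result lists on very different score scales.
english = [("en1", 12.0), ("en2", 6.0)]
chinese = [("zh1", 3.0), ("zh2", 2.7)]

print([doc for doc, _ in raw_score_merge([english, chinese])])
print([doc for doc, _ in round_robin_merge([english, chinese])])
print([doc for doc, _ in normalized_by_topk_merge([english, chinese], k=1)])
```

In this toy data the raw-score merge places every English document first simply because that list has larger scores, which is the kind of scale mismatch the normalized variants try to reduce.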
MLIR merge process • Feature set • Query level • Document level • Translation level • Construction of the merge model • FRank ranking algorithm • BM25
Feature set • Query level • The terms within a query are manually classified into several pre-defined categories. • Location/country names (Loc) • Organization names (Org) • Event names (EN) • Technical terms (TT) • Document level • The document length (Dlength) and title length (Tlength) are extracted.
Feature set [Figure: example query terms annotated with category labels (Loc, EN), including 斗六 (Douliu), the English-to-Chinese terms Order and Park, and 食べる ("to eat")] • Translation level • The size of the bilingual dictionary used for each language (DictSize). • The average number of translation equivalents per query term (AvgTAD). • Example: if a query has two terms, each with three translation equivalents, the AvgTAD of the query is (3 + 3)/2 = 3.
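The AvgTAD example above can be reproduced with a small sketch. The dictionary layout (a mapping from a source term to its list of translation equivalents), the placeholder translations, and the reading of DictSize as the number of dictionary entries are all assumptions made for illustration.

```python
def avg_tad(query_terms, dictionary):
    """Average number of translation equivalents per query term."""
    if not query_terms:
        return 0.0
    counts = [len(dictionary.get(term, [])) for term in query_terms]
    return sum(counts) / len(counts)


# Hypothetical toy dictionary: two query terms, three placeholder
# translation equivalents each, matching the slide's worked example.
toy_dictionary = {
    "order": ["t1", "t2", "t3"],
    "park": ["t4", "t5", "t6"],
}

print(avg_tad(["order", "park"], toy_dictionary))  # (3 + 3) / 2 = 3.0
print(len(toy_dictionary))  # DictSize, assumed here to be the entry count
```

A term missing from the dictionary contributes zero equivalents in this sketch; whether the original feature handles untranslatable terms that way is not stated on the slide.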
The Construction of the Merge Model • Following FRank's generalized additive model, a merge model can be represented as $\sum_{t=1}^{T} \alpha_t m_t(x)$, where • $m_t(x)$ is a weak learner • $\alpha_t$ is the learned weight • $T$ is the number of selected weak learners • The merge model is combined with a retrieval model (BM25) by linear combination.
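As a rough illustration of how such an additive model might be scored and mixed with a retrieval score, consider the sketch below. The threshold-style weak learner, the feature values, the weights, and the specific combination form lam * merge + (1 - lam) * BM25 are assumptions made for the example; the slides only state that a linear combination is used. The BM25 value is passed in as a precomputed number.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """Hypothetical weak learner: fires when one feature exceeds a threshold."""
    feature: str
    threshold: float

    def __call__(self, x):
        return 1.0 if x[self.feature] > self.threshold else 0.0


def merge_score(x, learners, weights):
    """Generalized additive model: weighted sum of weak-learner outputs."""
    return sum(alpha * m(x) for alpha, m in zip(weights, learners))


def combined_score(x, bm25, learners, weights, lam):
    """Assumed combination: lam * merge model + (1 - lam) * BM25 score."""
    return lam * merge_score(x, learners, weights) + (1.0 - lam) * bm25


# Hypothetical learners, weights, and document features.
learners = [Stump("AvgTAD", 2.0), Stump("Dlength", 500.0)]
weights = [0.5, 0.25]
doc = {"AvgTAD": 3.0, "Dlength": 120.0}

print(merge_score(doc, learners, weights))               # 0.5
print(combined_score(doc, 1.0, learners, weights, 0.5))  # 0.75
```

In practice the two scores would likely need to be on comparable scales before mixing; that step is omitted here.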
Experiments • Data set • Details of the experimental collections • Percentage of retrieved documents
Experiments Mean Average Precision (MAP)
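For reference, MAP can be computed as in the following sketch. It uses the standard definition (average precision per query, averaged over queries); the input layout and toy judgments are assumptions for illustration and do not come from the experiments reported here.

```python
def average_precision(ranked_docs, relevant):
    """Sum of precision at each relevant hit, divided by the number of relevant docs."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_docs, relevant_set) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy runs for two queries.
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d5", "d6"], {"d6"}),                    # AP = (1/2) / 1
]

print(mean_average_precision(runs))
```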
Experiments Experimental results of our method with different values of the combination coefficient λ.
Experiments Feature Analysis
Conclusions The proposed merge model can significantly improve merging quality. The merge model indicates that the key factors are the number of translatable terms and compound words.
Conclusions • Future work • Use other learning-based ranking algorithms, such as RankSVM and RankNet. • Extract more representative features, such as linguistic features, to construct a merge model. • Discover more relations within query terms, such as query term association and substitution.
Comments • Advantage • Improves merging quality. • Drawback • Application • Multilingual information retrieval.