Active Learning for Ranking through Expected Loss Optimization

Bibliographic Details
Published in	IEEE Transactions on Knowledge and Data Engineering, Vol. 27, no. 5, pp. 1180-1191
Main Authors	Bo Long, Jiang Bian, Olivier Chapelle, Ya Zhang, Yoshiyuki Inagaki, Yi Chang
Format	Journal Article
Language	English
Published	New York: IEEE, 01.05.2015; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Active learning; Algorithms; Bayes methods; Electronic mail; Expected loss optimization; Gain; Learning; Mathematical models; Optimization; Query processing; Ranking; Teaching methods; Training; Training data; Uncertainty; Web search

More Information
Summary:	Learning to rank arises in many data mining applications, ranging from web search engines and online advertising to recommendation systems. In learning to rank, the performance of a ranking model is strongly affected by the number of labeled examples in the training set; on the other hand, obtaining labeled examples for training data is very expensive and time-consuming. This creates a strong need for active learning approaches that select the most informative examples for learning to rank; however, the literature contains very limited work on active learning for ranking. In this paper, we propose a general active learning framework, expected loss optimization (ELO), for ranking. The ELO framework is applicable to a wide range of ranking functions. Under this framework, we derive a novel algorithm, expected discounted cumulative gain (DCG) loss optimization (ELO-DCG), to select the most informative examples. We then investigate both query-level and document-level active learning for ranking and propose a two-stage ELO-DCG algorithm that incorporates both query and document selection into active learning. Furthermore, we show that the algorithm can flexibly handle the skewed grade distribution problem through a modification of the loss function. Extensive experiments on real-world web search data sets demonstrate the potential and effectiveness of the proposed framework and algorithms.
ISSN:	1041-4347; 1558-2191
DOI:	10.1109/TKDE.2014.2365785
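
Illustration:	The summary describes choosing examples to label by their expected DCG loss but does not state the formulas. The Python sketch below is one plausible reading, not the authors' implementation. It assumes the common DCG form with gain 2^g - 1 and a log2 rank discount, and it assumes that uncertainty about the grades is represented by sampled grade vectors (for example, predictions from a hypothetical ensemble of rankers). The names dcg, expected_dcg_loss, and select_queries are made up for this illustration.

```python
import math


def dcg(grades_in_rank_order):
    """DCG with gain 2^g - 1 and discount 1 / log2(1 + rank) (assumed form)."""
    return sum((2 ** g - 1) / math.log2(rank + 2)
               for rank, g in enumerate(grades_in_rank_order))


def expected_dcg_loss(sampled_grades):
    """Expected DCG loss for one query.

    sampled_grades: list of samples; each sample is a list holding one
    grade per document of the query (hypothetical ensemble predictions).
    Returns the gap between the average best achievable DCG per sample
    and the expected DCG of the single ranking that is best on average.
    """
    n_samples = len(sampled_grades)
    n_docs = len(sampled_grades[0])

    # Average, over samples, of the DCG obtained by ranking each sample ideally.
    oracle = sum(dcg(sorted(s, reverse=True)) for s in sampled_grades) / n_samples

    # DCG is linear in the gains, so the ranking with the highest expected
    # DCG sorts documents by their expected gain.
    expected_gain = [sum(2 ** s[d] - 1 for s in sampled_grades) / n_samples
                     for d in range(n_docs)]
    ranked = sorted(expected_gain, reverse=True)
    bayes = sum(g / math.log2(rank + 2) for rank, g in enumerate(ranked))

    return oracle - bayes


def select_queries(pool, k):
    """Return the k query ids with the largest expected DCG loss.

    pool: dict mapping query id -> sampled_grades for that query.
    """
    return sorted(pool, key=lambda q: expected_dcg_loss(pool[q]), reverse=True)[:k]


if __name__ == "__main__":
    pool = {
        "q_agree": [[2, 0], [2, 0]],     # samples agree: no expected loss
        "q_disagree": [[1, 0], [0, 1]],  # samples disagree: positive loss
    }
    print(select_queries(pool, 1))
```

Under these assumptions, a query whose sampled grades all agree has zero expected loss, while a query on which the samples disagree about the ordering has positive loss and is selected for labeling first. The document-level and two-stage variants mentioned in the summary are not sketched here.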