A two-stage information retrieval system based on interactive multimodal genetic algorithm for query weight optimization

Query weight optimization, which aims to find an optimal combination of the weights of query terms for sorting relevant documents, is an important topic in the information retrieval system. Due to the huge search space, the query optimization problem is intractable, and evolutionary algorithms have...

Full description

Saved in:
Bibliographic Details
Published inComplex & intelligent systems Vol. 7; no. 5; pp. 2765 - 2781
Main Authors Cong, Hao, Chen, Wei-Neng, Yu, Wei-Jie
Format Journal Article
LanguageEnglish
Published Cham Springer International Publishing 01.10.2021
Springer Nature B.V
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Query weight optimization, which aims to find an optimal combination of the weights of query terms for sorting relevant documents, is an important topic in the information retrieval system. Due to the huge search space, the query optimization problem is intractable, and evolutionary algorithms have become one popular approach. But as the size of the database grows, traditional retrieval approaches may return a lot of results, which leads to low efficiency and poor practicality. To solve this problem, this paper proposes a two-stage information retrieval system based on an interactive multimodal genetic algorithm (IMGA) for a query weight optimization system. The proposed IMGA has two stages: quantity control and quality optimization. In the quantity control stage, a multimodal genetic algorithm with the aid of the niching method selects multiple promising combinations of query terms simultaneously by which the numbers of retrieved documents are controlled in an appropriate range. In the quality optimization stage, an interactive genetic algorithm is designed to find the optimal query weights so that the most user-friendly document retrieval sequence can be yielded. Users’ feedback information will accelerate the optimization process, and a genetic algorithm (GA) performs interactively with the action of relevance feedback mechanism. Replacing user evaluation, a mathematical model is built to evaluate the fitness values of individuals. In the proposed two-stage method, not only the number of returned results can be controlled, but also the quality and accuracy of retrieval can be improved. The proposed method is run on the database which with more than 2000 documents. The experimental results show that our proposed method outperforms several state-of-the-art query weight optimization approaches in terms of the precision rate and the recall rate.
ISSN:2199-4536
2198-6053
DOI:10.1007/s40747-021-00450-6