PandaSearch: A fine-grained academic search engine for research documents


Bibliographic Details
Published in: 2015 IEEE 31st International Conference on Data Engineering, pp. 1408-1411
Main Authors: Feiran Huang, Jia Li, Jiaheng Lu, Tok Wang Ling, Zhaoan Dong
Format: Conference Proceeding
Language: English
Published: IEEE 01.04.2015
Summary: In academia, research documents enable the sharing and dissemination of scientific discoveries. In these "big data" times, academic search engines are widely used to find relevant research documents. In computer science, a researcher often inputs a query with a specific goal, such as finding an algorithm or a theorem. However, to date, most search engines return only a list of related papers. Users have to browse the results, download the interesting papers and look for the desired information, which is laborious and inefficient. In this paper, we present a novel academic search system, called PandaSearch, that returns results through a fine-grained interface in which they are organized by category, such as definitions, theorems, lemmas, algorithms and figures. The key technical challenges in our system include the automatic identification and extraction of different parts of a research document, the discovery of the main topic phrases for a definition or a theorem, and the recommendation of related definitions or figures to satisfy the search intention of users. Based on this, we have built a user-friendly search interface that lets users conveniently explore the documents and find the relevant information.
ISSN: 1063-6382, 2375-026X
DOI: 10.1109/ICDE.2015.7113388
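
The summary describes results organized by category (definitions, theorems, lemmas, algorithms, figures) rather than as a flat list of papers. The record does not say how PandaSearch identifies these parts, so the following is only a minimal sketch of the general idea, assuming labelled statements start a line with a keyword and a number. The pattern `HEADING`, the function `group_by_category`, and the sample sentences are all hypothetical and are not taken from the paper.

```python
import re
from collections import defaultdict

# Assumed heading shape: a category keyword, a number, then "." or ":".
# This is an illustrative heuristic, not the extraction method used by PandaSearch.
HEADING = re.compile(r"^(Definition|Theorem|Lemma|Algorithm|Figure)\s+(\d+)[.:]\s*(.*)$")


def group_by_category(lines):
    """Group labelled statements from a document's lines by their category."""
    groups = defaultdict(list)
    for line in lines:
        match = HEADING.match(line.strip())
        if match:
            category, number, text = match.groups()
            groups[category.lower()].append((int(number), text))
    return dict(groups)


# Made-up sample text, used only to exercise the function.
sample = [
    "Definition 1. A twig pattern is a small tree of query nodes.",
    "We now state the main result.",
    "Theorem 1. The algorithm runs in linear time.",
    "Lemma 2: Every match is reported once.",
]
result = group_by_category(sample)
```

A search interface could then render each key of `result` as its own results tab. Real research documents are typically PDFs with far less regular layout, so a line-based pattern like this would only be a starting point.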