A complement to lexical query’s search-term selection for emerging technologies: the case of “big data”

Bibliographic Details
Published in Scientometrics Vol. 117; no. 1; pp. 141-162
Main Authors Ruiz-Navas, Santiago; Miyazaki, Kumiko
Format Journal Article
Language English
Published Cham Springer International Publishing 01.10.2018
Springer Nature B.V
Summary: Obtaining document sets to study emerging technologies is challenging. Researchers studying emerging technologies use lexical queries (e.g., core, expanded and evolutionary) to face this challenge. Creating lexical queries requires the selection of search-terms, and manual, automatic and semi-automatic techniques can be used to select them. The currently reported processes for selecting search-terms can be complemented by addressing two issues. The first is the lack of a systematic process for selecting search-terms from previous literature; the second is the evaluation of candidate search-terms' document retrieval interdependence. We propose two steps to complement the process of selecting search-terms to create lexical queries to study emerging technologies. The first step is a process to systematically select search-terms from previous literature. The second is an evaluation of search-terms' document retrieval interdependence, for which we propose the Significance of Interception Ratio (SIR). We tested the proposed steps using as a reference the big-data lexical query proposed by Huang et al. (Scientometrics 105:2005–2022, 2015). The test results show that the proposed steps can complement the current automatic methods for selecting search-terms. The first step increased the recall of the reference lexical query by around 24%. This increase resulted from adding 37 search-terms to the reference lexical query and eliminating three search-terms from it. In the second step (application of the SIR), five search-terms from the reference lexical query were optimized, showing a slight complementary ability when selecting search-terms.
ISSN: 0138-9130
1588-2861
DOI: 10.1007/s11192-018-2857-9