Predicting scientific impact based on h-index

Bibliographic Details
Published in Scientometrics Vol. 114; no. 3; pp. 993-1010
Main Authors Ayaz, Samreen; Masood, Nayyer; Islam, Muhammad Arshad
Format Journal Article
Language English
Published Dordrecht Springer Netherlands 01.03.2018
Springer Nature B.V
Summary: Predicting the future impact of a scientist or researcher is a critical task. The objective of this work is to evaluate different h-index prediction models for the field of Computer Science. Different combinations of parameters were identified to build the models and applied to a large data set taken from Arnetminer, comprising the records of almost 1.8 million authors and 2.1 million Computer Science publications. Regression, a machine learning prediction technique, is used to find the set of parameters best suited to h-index prediction for scientists of all career ages, without enforcing any constraint on their current h-index values, with R² as the accuracy metric. These parameters are further evaluated for different career ages and different h-index thresholds. Predictions 1 year ahead are good, with an R² of 0.93, but for 5 years ahead R² declines to 0.82 on average. Hence it is inferred that prediction of the h-index is difficult over longer periods. Predictions for researchers with 1 year of experience are imprecise, with an R² of 0.60 for 1 year ahead and 0.33 for 5 years ahead. Across career ages, the average R² for researchers with 20–36 years of experience was 0.99. Across h-index values, researchers with a low h-index were the hardest to predict. The parameter set comprising current h-index, average citations per paper, number of coauthors, years since publishing first article, number of publications, number of impact factor publications, and number of publications in distinct journals performed better than all other combinations.
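The two quantities the summary relies on, the h-index and R², can be illustrated with a minimal sketch. It is not the authors' model: it fits a least-squares line on a single feature (current h-index) instead of the seven-parameter set, and the function names and the toy numbers are assumptions made up for illustration only.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data (NOT from the paper): current h-index of five
# authors and their h-index some years later.
current_h = [1, 2, 3, 4, 5]
future_h = [2, 3, 5, 6, 8]

slope, intercept = fit_line(current_h, future_h)
predicted_h = [slope * x + intercept for x in current_h]
r2 = r_squared(future_h, predicted_h)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.3f}")
```

An R² close to 1 means the fitted values track the observed future h-index closely, which is how the reported figures (0.93 for 1 year, 0.82 for 5 years) should be read. A multi-feature version would replace `fit_line` with a multiple regression over the full parameter set.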
ISSN: 0138-9130
1588-2861
DOI: 10.1007/s11192-017-2618-1