Optimization of the algorithm for identifying digital traces of schoolchildren in the Altai Territory

Bibliographic Details
Published in: Journal of physics. Conference series Vol. 1615; no. 1; pp. 12013 - 12019
Main Authors: Zhuravleva, V V; Manicheva, A S; Feshchenko, A V; Berestov, A V
Format: Journal Article
Language: English
Published: Bristol, IOP Publishing, 01.08.2020
Summary: The use of digital traces of social network users has become widespread in research as methods for analyzing big data have developed. When data from social network users are processed, the information provided is often incomplete (age, educational institution, or year of admission/graduation is not specified). Users with such gaps fall outside the university's field of view, which significantly reduces the number of potential applicants it can reach. The aim of the project is to develop an algorithm for restoring information in the digital trace of a social network user and to apply it to identify the group affiliation of schoolchildren whose digital trace contains incomplete information on grade and place of study. The study used data on schoolchildren from the VKontakte social network in the ninth and eleventh grades, divided into four groups. An analysis of unique community subscriptions revealed significant differences between the age groups of schoolchildren. An algorithm based on gradient boosting was then built to restore missing information in the schoolchildren's digital traces. Optimizing the parameters of this algorithm through numerical experiments yielded a precision on the order of 0.6. The algorithm was used to identify groups of 9th- and 11th-grade schoolchildren whose digital traces contained incomplete information. The practical significance of the project is that it expands the target audience of future university applicants and, as a result, gives the university the opportunity to help them choose an educational program more deliberately through career guidance measures in social networks.
ISSN: 1742-6588; 1742-6596
DOI:10.1088/1742-6596/1615/1/012013
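
Note on the precision figure: the summary reports a precision on the order of 0.6 for a classifier over four groups, but it does not say how precision was aggregated across the groups. The sketch below shows one common reading, macro-averaged precision, which is an assumption and not something the record confirms. The group labels ("A" to "D") and the toy predictions are made up purely for illustration and are not data from the study.

```python
from collections import Counter

# Hypothetical labels for the four groups; the record does not name them.
GROUPS = ["A", "B", "C", "D"]


def per_class_precision(y_true, y_pred, labels):
    """Precision for each label: correct predictions / all predictions of that label."""
    predicted = Counter()  # how often each label was predicted
    correct = Counter()    # how often a predicted label matched the true label
    for true_label, pred_label in zip(y_true, y_pred):
        predicted[pred_label] += 1
        if true_label == pred_label:
            correct[pred_label] += 1
    # A label that was never predicted gets precision 0.0 by convention here.
    return {
        label: (correct[label] / predicted[label]) if predicted[label] else 0.0
        for label in labels
    }


def macro_precision(y_true, y_pred, labels):
    """Unweighted mean of the per-class precisions (macro averaging, assumed)."""
    scores = per_class_precision(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Toy, made-up labels: ten users, true group vs. group restored by a classifier.
y_true = ["A", "A", "A", "B", "B", "C", "C", "C", "D", "D"]
y_pred = ["A", "A", "B", "B", "C", "C", "C", "D", "D", "A"]

per_class = per_class_precision(y_true, y_pred, GROUPS)
macro = macro_precision(y_true, y_pred, GROUPS)
print(per_class)
print(round(macro, 3))
```

With macro averaging every group counts equally regardless of its size; if the groups are strongly imbalanced, a weighted average could give a noticeably different figure, so the averaging scheme matters when comparing against the reported value.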