Research of Dimension Reduction Algorithms of Feature Space in Data Analysis Task

Bibliographic Details
Published in: IEEE NW Russia Young Researchers in Electrical and Electronic Engineering Conference, pp. 458-462
Main Authors: Popov, Nikita V.; Klionskiy, Dmitry M.
Format: Conference Proceeding
Language: English
Published: IEEE, 01.01.2020
ISSN: 2376-6565
DOI: 10.1109/EIConRus49466.2020.9039259

Summary: Today, data describing a particular subject area often contain a large number of features that characterize the properties of processes or objects. At the same time, these data sets can reach enormous sizes, so working with them becomes resource-intensive and slow. Reducing the dimension of the feature space reduces memory use and removes noisy and duplicate information. Various machine-learning algorithms are used for this reduction, each with its own degree of efficiency and validity for its area of application. Client databases are no exception: they often contain large amounts of information, some of which is redundant. Therefore, reducing the feature space of client data is a topic of growing relevance. This paper presents a study of the following methods: removal of features with low variability, univariate feature selection, recursive feature elimination, selection based on a decision tree, and principal component analysis. Based on the results of the study, requirements are formulated for an algorithm for reducing the dimension of the feature space specifically for the task of working with client data.
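
As an illustration of the simplest method named in the summary, removal of features with low variability, the following is a minimal sketch in standard-library Python. It is not the authors' implementation, which this record does not show. It assumes "low variability" means low variance; the function name `drop_low_variance`, the threshold, and the sample client records are all hypothetical.

```python
from statistics import pvariance


def drop_low_variance(rows, names, threshold=0.0):
    """Keep only the columns whose population variance exceeds `threshold`.

    rows  -- list of equal-length numeric records (one list per client)
    names -- feature names, one per column
    """
    columns = list(zip(*rows))  # transpose: records -> feature columns
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return reduced, [names[i] for i in keep]


# Hypothetical client records: age, a constant country code, monthly spend.
names = ["age", "country_code", "monthly_spend"]
rows = [
    [25, 1, 120.0],
    [41, 1, 80.5],
    [33, 1, 200.0],
    [58, 1, 95.0],
]

reduced, kept = drop_low_variance(rows, names)
print(kept)  # ['age', 'monthly_spend']
```

The constant `country_code` column carries no information for distinguishing clients, so it is dropped. The other methods listed in the summary (univariate selection, recursive feature elimination, tree-based selection, principal component analysis) additionally need a target variable or a matrix decomposition and are usually taken from a machine-learning library rather than written by hand.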