Research of Dimension Reduction Algorithms of Feature Space in Data Analysis Task

Bibliographic Details
Published in IEEE NW Russia Young Researchers in Electrical and Electronic Engineering Conference, pp. 458-462
Main Authors Popov, Nikita V., Klionskiy, Dmitry M.
Format Conference Proceeding
Language English
Published IEEE 01.01.2020
ISSN 2376-6565
DOI 10.1109/EIConRus49466.2020.9039259

Abstract Today, data describing a particular subject area often contain a large number of different features that determine the properties of processes or objects. At the same time, these data sets can reach enormous sizes, so working with them becomes a resource-intensive and time-consuming process. Reducing the dimension of the feature space lowers memory usage and removes noisy and duplicate information. Various machine-learning algorithms are used to reduce the space, each with its own degree of efficiency and validity for its area of application. Client databases are no exception: they often contain large amounts of information, some of which is redundant. Reducing the feature space of client data is therefore a topic of growing relevance. This paper presents a study of the following methods: removal of features with low variability, univariate feature selection, recursive feature elimination, selection based on a decision tree, and principal component analysis. Based on the results of the study, requirements are formulated for an algorithm that reduces the dimension of the feature space specifically for the task of working with client data.
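Two of the methods named in the abstract, removal of features with low variability and univariate feature selection, can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny client table, the column meanings, the threshold, and all function names below are hypothetical, and absolute Pearson correlation is assumed as the univariate score purely for illustration. In practice, scikit-learn provides VarianceThreshold, SelectKBest, RFE, and PCA for the methods the abstract lists.

```python
from statistics import pvariance


def low_variance_filter(rows, threshold=0.0):
    """Return indices of columns whose population variance exceeds threshold."""
    columns = list(zip(*rows))  # transpose: rows -> columns
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    norm_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def univariate_top_k(rows, target, k):
    """Score each column against the target on its own and keep the k best."""
    columns = list(zip(*rows))
    scores = [abs(pearson(col, target)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def select_columns(rows, keep):
    """Project every row onto the kept column indices."""
    return [[row[i] for i in keep] for row in rows]


# Hypothetical client records: age, a constant flag, monthly spend, a noisy counter.
rows = [
    [25, 1, 100.0, 3],
    [32, 1, 150.0, 1],
    [47, 1, 240.0, 4],
    [51, 1, 260.0, 1],
    [62, 1, 310.0, 5],
]
target = [0, 0, 1, 1, 1]  # hypothetical label, e.g. "responded to an offer"

kept = low_variance_filter(rows)          # the constant flag carries no information
reduced = select_columns(rows, kept)
best = univariate_top_k(reduced, target, k=2)
print("columns kept after variance filter:", kept)
print("top-2 columns of the reduced table:", best)
```

The variance filter needs no target and is cheap, so it is a natural first pass; the univariate step scores each feature in isolation and therefore cannot detect redundancy between features, which is one reason methods such as recursive feature elimination and principal component analysis are studied alongside it.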
Author Affiliations
Popov, Nikita V.: Saint Petersburg Electrotechnical University "LETI", Department of Software Engineering and Computer Applications, St. Petersburg, Russia
Klionskiy, Dmitry M.: Saint Petersburg Electrotechnical University "LETI", Department of Software Engineering and Computer Applications, St. Petersburg, Russia
EISBN 1728157617, 9781728157610, 1728157609, 9781728157603
EISSN 2376-6565
Subject Terms Classification algorithms; client's data analysis; Decision trees; dimension of feature space; Feature extraction; feature selection; Libraries; Machine learning; Machine learning algorithms; Principal component analysis; python
URI https://ieeexplore.ieee.org/document/9039259