Method for detecting outlier data from large-scale high dimensional data based on genetic algorithm
Main Authors | , , |
---|---|
Format | Patent |
Language | English |
Published | 11.03.2015 |
Summary: | The invention discloses a genetic-algorithm-based method for detecting outliers in large-scale, high-dimensional data, and belongs to the technical field of outlier data mining. The method comprises three steps: (1) sample discretization and encoding, in which the high-dimensional data are encoded so that each individual corresponds to one character string, and a sparsity coefficient is selected as the fitness function and used as the criterion for ranking individuals; (2) loop iteration, in which a population of individuals is maintained and continuously updated through crossover, mutation, and selection according to the principle of survival of the fittest; and (3) decoding, in which the final population is decoded by mapping each individual back to its corresponding sample data, thereby revealing the outliers hidden in the samples. The method can find hidden outliers in large-scale, high-dimensional data effectively and quickly. |
---|---|
Bibliography: | Application Number: CN201410689745 |
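
The three steps in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the record does not give the encoding, the fitness formula, or the operators, so everything below is an assumption. In particular, the sketch assumes equi-depth discretization into `phi` bins per dimension, strings with exactly `k` constrained positions (the rest are wildcards), the sparsity coefficient S = (n(D) - N f^k) / sqrt(N f^k (1 - f^k)) with f = 1/phi, uniform crossover, and elitist selection. All function and parameter names (`discretize`, `sparsity`, `repair`, `detect_outliers`, `top_m`, and so on) are hypothetical.

```python
import math
import random

WILDCARD = 0  # gene value meaning "this dimension is unconstrained"

def discretize(data, phi):
    """Step 1: equi-depth discretization; each value becomes a bin in 1..phi."""
    n, d = len(data), len(data[0])
    grid = [[0] * d for _ in range(n)]
    for j in range(d):
        order = sorted(range(n), key=lambda i: data[i][j])
        for rank, i in enumerate(order):
            grid[i][j] = rank * phi // n + 1
    return grid

def covered(ind, grid):
    """Indices of the samples that fall inside the cube encoded by ind."""
    return [i for i, row in enumerate(grid)
            if all(g == WILDCARD or g == row[j] for j, g in enumerate(ind))]

def sparsity(ind, grid, phi):
    """Assumed fitness: more negative means sparser than expected."""
    n = len(grid)
    count = len(covered(ind, grid))
    if count == 0:  # assumption: empty cubes decode to nothing, so rank them last
        return math.inf
    p = (1.0 / phi) ** sum(g != WILDCARD for g in ind)
    return (count - n * p) / math.sqrt(n * p * (1.0 - p))

def repair(ind, k, phi, rng):
    """Force exactly k constrained positions so individuals stay comparable."""
    ind = list(ind)
    fixed = [j for j, g in enumerate(ind) if g != WILDCARD]
    free = [j for j, g in enumerate(ind) if g == WILDCARD]
    rng.shuffle(fixed)
    rng.shuffle(free)
    while len(fixed) > k:
        ind[fixed.pop()] = WILDCARD
    while len(fixed) < k:
        j = free.pop()
        ind[j] = rng.randint(1, phi)
        fixed.append(j)
    return tuple(ind)

def mutate(ind, k, phi, rng):
    """Clear one constrained gene; repair then redraws a position and a bin."""
    ind = list(ind)
    j = rng.choice([j for j, g in enumerate(ind) if g != WILDCARD])
    ind[j] = WILDCARD
    return repair(ind, k, phi, rng)

def detect_outliers(data, phi=4, k=2, pop_size=60, generations=40,
                    top_m=2, mutation_rate=0.3, seed=0):
    rng = random.Random(seed)
    grid = discretize(data, phi)
    d = len(grid[0])
    rank_key = lambda ind: (sparsity(ind, grid, phi), ind)
    pop = {repair([WILDCARD] * d, k, phi, rng) for _ in range(pop_size)}
    for _ in range(generations):  # Step 2: crossover, mutation, selection
        parents = sorted(pop)
        children = set()
        for _ in range(pop_size):
            a, b = rng.choice(parents), rng.choice(parents)
            # Uniform crossover: each gene comes from one of the two parents.
            child = repair([rng.choice(pair) for pair in zip(a, b)], k, phi, rng)
            if rng.random() < mutation_rate:
                child = mutate(child, k, phi, rng)
            children.add(child)
        # Elitist selection: keep the pop_size sparsest distinct cubes.
        pop = set(sorted(pop | children, key=rank_key)[:pop_size])
    # Step 3: decode the sparsest cubes back to the samples they contain.
    best = [ind for ind in sorted(pop, key=rank_key)[:top_m] if rank_key(ind)[0] < 0]
    outliers = sorted({i for ind in best for i in covered(ind, grid)})
    return best, outliers
```

As a usage example, one could build 200 three-dimensional samples whose first two coordinates are equal, swap the second coordinate of the first and last samples so that they break the pattern, and call `detect_outliers(data)`; the returned index list then points at the samples sitting in the sparsest two-dimensional cubes. Keeping the population as a set of distinct strings is a design choice of this sketch that prevents one good cube from crowding out the others.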