Rapid data aggregation method for big data cleaning

Bibliographic Details
Main Authors GENG ZHAOYANG, WANG KANGPING, SHI XIAOHU, WANG YIZHANG, ZHOU YOU, WU CHUNGUO
Format Patent
Language Chinese, English
Published 03.09.2019
Summary: The invention discloses a rapid data aggregation method for big data cleaning. The method comprises the following steps: (1) data reading: the original data is stored in Excel; the data in the Excel file is read as a file stream and stored in a record list according to the format of the data, and the record list is returned; (2) segmenting the big data text; (3) comparing text similarity; and (4) displaying and modifying the aggregation result: the form to be displayed is printed out and provided to the user for modification and deletion, and the form is downloaded after the modification is complete.
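The segmentation, similarity comparison, and aggregation steps in the summary can be illustrated with a minimal Python sketch. The record does not say how the text is segmented or which similarity measure is used, so character bigrams, Jaccard similarity, greedy grouping, and the 0.5 threshold below are all assumptions chosen for illustration, and the function names are hypothetical. The Excel reading step is left out; the sketch starts from a record list that has already been loaded.

```python
from typing import List, Set, Tuple


def segment(text: str, n: int = 2) -> Set[str]:
    """Split text into overlapping character n-grams (assumed segmentation)."""
    cleaned = "".join(text.split()).lower()  # drop whitespace, ignore case
    if len(cleaned) <= n:
        return {cleaned} if cleaned else set()
    return {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def aggregate(records: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Greedily group records whose similarity to a group's first member
    reaches the threshold (assumed aggregation rule)."""
    groups: List[Tuple[Set[str], List[str]]] = []
    for record in records:
        tokens = segment(record)
        for representative, members in groups:
            if jaccard(tokens, representative) >= threshold:
                members.append(record)
                break
        else:
            # No existing group is similar enough: start a new one.
            groups.append((tokens, [record]))
    return [members for _, members in groups]


# Hypothetical record list standing in for the rows read from Excel.
records = [
    "Changchun Road 12",
    "Beijing Street 5",
    "changchun road 12",
    "Changchun Rd 12",
]
groups = aggregate(records)
for group in groups:
    print(group)
```

Comparing each record only against the first member of a group keeps the cost at one comparison per record per group, at the price of making the result depend on input order; the grouped output would then be shown to the user for correction, as the final step of the summary describes.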
Bibliography: Application Number: CN201910501539