An online data cleaning method for power marketing data based on singular value thresholding theory

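TM743; Under the energy Internet architecture, power marketing big data is a key foundation supporting many advanced smart grid applications, so data cleaning is especially important for it. However, missing data inevitably arises during actual grid operation and seriously affects data analysis and use. To address this, the paper builds on the Spark online big data processing platform and proposes an online data cleaning framework and method that combines similar-user clustering with singular value thresholding theory. Using singular value decomposition, it shows that power marketing data are approximately low-rank. On this basis, and taking differences in users' electricity consumption into account, it proposes an online data cleaning framework and method that combines an improved K-nearest neighbor algorithm with singular value thresholding theory. To address the slow computation of the singular value thresholding model, it further proposes a sliding time window online repair strategy that speeds up repair and improves repair accuracy. Finally, using Hebei Province...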

Bibliographic Details
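Published in 电测与仪表 (Electrical Measurement & Instrumentation) Vol. 61; no. 9; pp. 120-126
Main Authors MA Hongming (马红明), MA Hao (马浩), YANG Di (杨迪), WU Hongbo (吴宏波), LIU Jiacheng (刘家丞), LI Ji (李骥)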
Format Journal Article
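Language Chinese
Published Marketing Service Center, State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050021; State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050021; Ministry of Education Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an 710049 15.09.2024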
ISSN 1001-1390
DOI 10.19753/j.issn1001-1390.2024.09.016


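Abstract TM743; Under the energy Internet architecture, power marketing big data is a key foundation supporting many advanced smart grid applications, so data cleaning is especially important for it. However, missing data inevitably arises during actual grid operation and seriously affects data analysis and use. To address this, the paper builds on the Spark online big data processing platform and proposes an online data cleaning framework and method that combines similar-user clustering with singular value thresholding theory. Using singular value decomposition, it shows that power marketing data are approximately low-rank. On this basis, and taking differences in users' electricity consumption into account, it proposes an online data cleaning framework and method that combines an improved K-nearest neighbor algorithm with singular value thresholding theory. To address the slow computation of the singular value thresholding model, it further proposes a sliding time window online repair strategy that speeds up repair and improves repair accuracy. Finally, the proposed algorithm is validated on a power marketing dataset from Hebei Province; the experimental results show that the online repair algorithm repairs large-scale missing power marketing data more quickly and efficiently.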
Abstract_FL Under the framework of the energy Internet, power marketing big data is the foundation supporting many advanced applications of the smart grid, and data cleaning is extremely important for power marketing big data. However, missing data inevitably appears in the actual operation of the power grid, which greatly affects the analysis and use of data. To address this problem, this paper proposes an online data cleaning framework and method based on the Spark platform, which combines similar-user clustering and singular value thresholding theory. First, with the help of singular value decomposition, it is shown that power marketing data is approximately low-rank. On this basis, considering the differences in power consumption among users, this paper proposes an online data cleaning framework and method that integrates improved K-nearest neighbor clustering with the theory of singular value thresholding. Meanwhile, to solve the slow computation of the singular value thresholding model, a sliding time window online recovery strategy is proposed to accelerate repair and improve recovery accuracy. Finally, the effectiveness of the proposed algorithm is verified on power marketing data from Hebei Province. The experimental results show that the online recovery algorithm can repair large-scale missing power marketing data more quickly and effectively.
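The abstract names two building blocks, singular value thresholding (SVT) for low-rank repair and a sliding time window that limits each repair to recent data. The sketch below illustrates both in NumPy under stated assumptions: it uses the textbook SVT iteration, not necessarily the paper's variant, and it omits the similar-user clustering step and the Spark integration. The function names, the step size `delta`, the threshold `tau`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def shrink(Y, tau):
    """Soft-threshold the singular values of Y by tau (values below tau become 0)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def svt_complete(M, observed, tau, delta=1.0, n_iter=300, tol=1e-4):
    """Estimate missing entries of M (users x time slots) with the SVT iteration.

    observed is a boolean mask that is True where a reading exists.
    """
    M_obs = np.where(observed, M, 0.0)      # missing entries (even NaN) become 0
    Y = np.zeros_like(M_obs)
    X = np.zeros_like(M_obs)
    norm_obs = np.linalg.norm(M_obs) or 1.0
    for _ in range(n_iter):
        X = shrink(Y, tau)                  # low-rank estimate
        residual = np.where(observed, M_obs - X, 0.0)
        if np.linalg.norm(residual) / norm_obs < tol:
            break
        Y = Y + delta * residual            # correct only on observed entries
    # Keep real readings; use estimates only where data was missing.
    return np.where(observed, M, X)


def repair_latest_window(M, observed, window, tau):
    """Sliding-window repair: run SVT on the most recent `window` columns only."""
    repaired = M.copy()
    repaired[:, -window:] = svt_complete(M[:, -window:], observed[:, -window:], tau)
    return repaired


# Toy data (hypothetical): 20 users, 48 time slots, rank-2 load matrix.
rng = np.random.default_rng(0)
truth = rng.random((20, 2)) @ rng.random((2, 48))
observed = rng.random(truth.shape) > 0.2    # roughly 20% of readings missing
noisy = np.where(observed, truth, np.nan)
filled = repair_latest_window(noisy, observed, window=24, tau=5.0)
```

Restricting the SVD to the latest window keeps each decomposition small, which is one plausible reading of how a sliding window speeds up repair; columns outside the window are left untouched by this sketch.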
Author 吴宏波
李骥
马红明
马浩
杨迪
刘家丞
AuthorAffiliation Marketing Service Center, State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050021; State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050021; Ministry of Education Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an 710049
Author_FL MA Hongming
MA Hao
LI Ji
LIU Jiacheng
YANG Di
WU Hongbo
ClassificationCodes TM743
ContentType Journal Article
Copyright Copyright © Wanfang Data Co. Ltd. All Rights Reserved.
DOI 10.19753/j.issn1001-1390.2024.09.016
DatabaseName Wanfang Data Journals - Hong Kong
WANFANG Data Centre
Wanfang Data Journals
China Online Journals (COJ)
Discipline Engineering
DocumentTitle_FL An online data cleaning method for power marketing data based on singular value thresholding theory
EndPage 126
ExternalDocumentID dcyyb202409016
GrantInformation_xml – fundername: National Natural Science Foundation of China (国家自然科学基金)
  funderid: (61773308)
ISSN 1001-1390
IsPeerReviewed true
IsScholarly true
Issue 9
Keywords power marketing data
singular value thresholding algorithm
data cleaning
missing data recovery
Language Chinese
PageCount 7
PublicationCentury 2000
PublicationDate 2024-09-15
PublicationDateYYYYMMDD 2024-09-15
PublicationDate_xml – month: 09
  year: 2024
  text: 2024-09-15
  day: 15
PublicationDecade 2020
PublicationTitle 电测与仪表
PublicationTitle_FL Electrical Measurement & Instrumentation
PublicationYear 2024
Publisher Marketing Service Center, State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050021; State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050021; Ministry of Education Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an 710049
StartPage 120
Title An online data cleaning method for power marketing data based on singular value thresholding theory
URI https://d.wanfangdata.com.cn/periodical/dcyyb202409016
Volume 61