An online data cleaning method for power marketing data based on singular value thresholding theory
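Under the Energy Internet architecture, power marketing big data is a key foundation for many advanced smart grid applications, which makes data cleaning especially important. In practice, however, missing data inevitably arises during grid operation and seriously impairs the analysis and use of the data. To address this problem, this paper builds on the Spark online big data processing platform and proposes an online data cleaning framework and method that combines similar-user clustering with singular value thresholding theory. Using singular value decomposition, it demonstrates that power marketing data is approximately low-rank. On this basis, and taking differences in electricity consumption among users into account, it proposes an online data cleaning framework and method that integrates an improved K-nearest neighbor algorithm with singular value thresholding theory. To address the slow computation of the singular value thresholding model, a sliding time window online recovery strategy is proposed, which speeds up recovery and improves recovery accuracy. Finally, using power marketing data from Hebei Province...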
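Published in | 电测与仪表 (Electrical Measurement & Instrumentation) Vol. 61; no. 9; pp. 120 - 126 |
---|---|
Main Authors | MA Hongming, MA Hao, YANG Di, WU Hongbo, LIU Jiacheng, LI Ji |
Format | Journal Article |
Language | Chinese |
Published | Marketing Service Center, State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050021; State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050021; Ministry of Education Key Lab for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an 710049; 15.09.2024 |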
Subjects | |
ISSN | 1001-1390 |
DOI | 10.19753/j.issn1001-1390.2024.09.016 |
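Abstract | Under the Energy Internet architecture, power marketing big data is a key foundation for many advanced smart grid applications, which makes data cleaning especially important. In practice, however, missing data inevitably arises during grid operation and seriously impairs the analysis and use of the data. To address this problem, this paper builds on the Spark online big data processing platform and proposes an online data cleaning framework and method that combines similar-user clustering with singular value thresholding theory. Using singular value decomposition, it demonstrates that power marketing data is approximately low-rank. On this basis, and taking differences in electricity consumption among users into account, it proposes an online data cleaning framework and method that integrates an improved K-nearest neighbor algorithm with singular value thresholding theory. To address the slow computation of the singular value thresholding model, a sliding time window online recovery strategy is proposed, which speeds up recovery and improves recovery accuracy. Finally, the effectiveness of the proposed algorithm is verified on power marketing data from Hebei Province; the experimental results show that the online recovery algorithm recovers large-scale missing power marketing data more quickly and efficiently. |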
---|---|
Abstract_FL | Under the framework of the Energy Internet, power marketing big data is the foundation that supports many advanced applications of the smart grid, and data cleaning is extremely important for power marketing big data. However, the missing data problem inevitably appears in the actual operation of the power grid, which greatly affects the analysis and use of data. To address this problem, this paper proposes an online data cleaning framework and method based on the Spark platform, which combines similar user clustering and singular value thresholding theory. First, with the help of singular value decomposition, it is shown that power marketing data has the characteristic of being approximately low-rank. On this basis, considering the differences in power consumption among power users, this paper proposes an online data cleaning framework and method that integrates improved K-nearest neighbor clustering and the theory of singular value thresholding. Meanwhile, to solve the problem of slow calculation of the singular value thresholding model, a sliding time window online recovery strategy is proposed to accelerate the repair speed and improve the recovery accuracy. Finally, the effectiveness of the proposed algorithm is verified with power marketing data from Hebei Province. The experimental results show that the online recovery algorithm can repair large-scale missing power marketing data more quickly and effectively. |
Author | 吴宏波 李骥 马红明 马浩 杨迪 刘家丞 |
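AuthorAffiliation | Marketing Service Center, State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050021; State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050021; Ministry of Education Key Lab for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an 710049 |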
Author_FL | MA Hongming MA Hao LI Ji LIU Jiacheng YANG Di WU Hongbo |
Author_FL_xml | – sequence: 1 fullname: MA Hongming – sequence: 2 fullname: MA Hao – sequence: 3 fullname: YANG Di – sequence: 4 fullname: WU Hongbo – sequence: 5 fullname: LIU Jiacheng – sequence: 6 fullname: LI Ji |
Author_xml | – sequence: 1 fullname: 马红明 – sequence: 2 fullname: 马浩 – sequence: 3 fullname: 杨迪 – sequence: 4 fullname: 吴宏波 – sequence: 5 fullname: 刘家丞 – sequence: 6 fullname: 李骥 |
ClassificationCodes | TM743 |
ContentType | Journal Article |
Copyright | Copyright © Wanfang Data Co. Ltd. All Rights Reserved. |
Discipline | Engineering |
DocumentTitle_FL | An online data cleaning method for power marketing data based on singular value thresholding theory |
EndPage | 126 |
ExternalDocumentID | dcyyb202409016 |
GrantInformation_xml | – fundername: National Natural Science Foundation of China (国家自然科学基金) funderid: (61773308) |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 9 |
Keywords | power marketing data; singular value thresholding algorithm; data cleaning; missing data recovery |
PageCount | 7 |
PublicationCentury | 2000 |
PublicationDate | 2024-09-15 |
PublicationTitle | 电测与仪表 |
PublicationTitle_FL | Electrical Measurement & Instrumentation |
PublicationYear | 2024 |
StartPage | 120 |
Title | 基于奇异值阈值理论的电力营销数据在线清洗方法 |
URI | https://d.wanfangdata.com.cn/periodical/dcyyb202409016 |
Volume | 61 |
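The abstract relies on singular value thresholding to fill in missing entries of an approximately low-rank data matrix. As a rough illustration only, the sketch below implements the generic singular value thresholding iteration for matrix completion using NumPy. It is not the authors' method: the improved K-nearest neighbor grouping, the sliding time window, and the Spark deployment are not reproduced. The function names, the default threshold `tau`, and the step size `delta` are assumptions chosen for this example.

```python
import numpy as np


def soft_threshold_singular_values(Y, tau):
    """Shrink every singular value of Y by tau, clipping at zero."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return (U * s) @ Vt


def svt_complete(M, mask, tau=None, delta=1.5, tol=1e-4, max_iter=1000):
    """Fill missing entries of M by generic singular value thresholding.

    M    : 2-D array; entries where mask is False are ignored (may be NaN).
    mask : boolean array, True where M is observed.
    tau  : shrinkage threshold; the default 5*sqrt(rows*cols) is a common
           heuristic and an assumption of this sketch, not taken from the paper.
    delta: step size, assumed here to lie in (0, 2).
    """
    M = np.asarray(M, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    n1, n2 = M.shape
    if tau is None:
        tau = 5.0 * np.sqrt(n1 * n2)
    observed = np.where(mask, M, 0.0)
    norm_obs = np.linalg.norm(observed)

    Y = np.zeros_like(observed)  # auxiliary variable, updated on observed entries
    X = np.zeros_like(observed)  # current low-rank estimate
    for _ in range(max_iter):
        X = soft_threshold_singular_values(Y, tau)
        # Residual is measured only where data was actually observed.
        residual = np.where(mask, M - X, 0.0)
        if np.linalg.norm(residual) <= tol * norm_obs:
            break
        Y += delta * residual
    return X


if __name__ == "__main__":
    # Synthetic rank-2 "users x time slots" matrix with about 40% of entries missing.
    rng = np.random.default_rng(0)
    truth = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 30))
    mask = rng.random(truth.shape) < 0.6
    filled = svt_complete(np.where(mask, truth, np.nan), mask)
    rel_err = np.linalg.norm((filled - truth)[~mask]) / np.linalg.norm(truth[~mask])
    print(f"relative error on missing entries: {rel_err:.3e}")
```

In a setting like the one the abstract describes, a routine of this kind would presumably be applied to one group of similar users and one time window at a time, which keeps each matrix small and closer to low-rank; that pairing is an interpretation of the abstract, not a detail it states.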