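Chinese Spelling Correction Method Based on Transformer Local Information and Syntax Enhancement Architecture (基于Transformer局部信息及语法增强架构的中文拼写纠错方法)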

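Two new improvements are proposed for Chinese spelling correction. First, a Gaussian-distributed bias matrix is added to the Transformer attention mechanism, which raises the model's attention to local text and strengthens the extraction of information from the erroneous characters and words and from the text surrounding them. Second, an ON_LSTM model is used to extract syntactic information from the special grammatical structure that erroneous text exhibits. Experimental results show that both methods effectively improve accuracy and recall, and that the model combining the two methods achieves the highest F1 score.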
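The first idea in the abstract, biasing attention toward nearby positions, can be illustrated with a small sketch. The abstract does not state the exact form of the bias, so the code below assumes one common formulation: a fixed Gaussian term G[i][j] = -(i - j)^2 / (2 * sigma^2), centered on the query position and added to the scaled dot-product logits before the softmax. The function names and the `sigma` parameter are hypothetical and are not taken from the paper.

```python
import math


def gaussian_bias(n, sigma):
    """Return an n x n matrix with G[i][j] = -(i - j)^2 / (2 * sigma^2).

    Assumption: the bias is centered on the query position i and uses a
    single fixed sigma; the paper may predict the center or width instead.
    """
    return [[-((i - j) ** 2) / (2.0 * sigma ** 2) for j in range(n)]
            for i in range(n)]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def local_attention(q, k, v, sigma):
    """Scaled dot-product attention with a Gaussian bias added to the logits.

    q, k: lists of n vectors of dimension d; v: list of n value vectors.
    Returns (outputs, attention_weights).
    """
    n, d = len(q), len(q[0])
    bias = gaussian_bias(n, sigma)
    outputs, weights = [], []
    for i in range(n):
        # Content score plus positional bias: distant positions are penalized.
        logits = [
            sum(q[i][t] * k[j][t] for t in range(d)) / math.sqrt(d) + bias[i][j]
            for j in range(n)
        ]
        w = softmax(logits)
        weights.append(w)
        outputs.append([sum(w[j] * v[j][c] for j in range(n))
                        for c in range(len(v[0]))])
    return outputs, weights
```

With a small `sigma`, each position attends mostly to itself and its immediate neighbors; as `sigma` grows, the bias vanishes and the weights approach ordinary attention.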

Bibliographic Details
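Published in Acta Scientiarum Naturalium Universitatis Pekinensis (北京大学学报(自然科学版)) Vol. 57; no. 1; pp. 61-67
Main Authors DUAN Jianyong (段建勇), YUAN Yang (袁阳), WANG Hao (王昊)
Format Journal Article
Language Chinese
Published School of Information, North China University of Technology, Beijing 100043 (北方工业大学信息学院), 2021-01-20
ISSN 0479-8023
DOI 10.13209/j.0479-8023.2020.081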

Copyright Copyright © Wanfang Data Co. Ltd. All Rights Reserved.
Funding National Natural Science Foundation of China (61972003, 61672040)
Keywords syntax enhancement; Transformer model; spelling correction; local information
URI https://d.wanfangdata.com.cn/periodical/bjdxxb202101009