Syntax-Based Translation With Bilingually Lexicalized Synchronous Tree Substitution Grammars

Syntax-based models can significantly improve the translation performance due to their grammatical modeling on one or both language side(s). However, the translation rules such as the non-lexical rule "VP→(x0 x1, VP:x1 PP:x0)" in string-to-tree models do not consider any lexicalized...

Bibliographic Details
Published in IEEE transactions on audio, speech, and language processing, Vol. 21, no. 8, pp. 1586-1597
Main Authors Zhang, Jiajun; Zhai, Feifei; Zong, Chengqing
Format Journal Article
Language English
Published Piscataway, NJ IEEE 01.08.2013
Institute of Electrical and Electronics Engineers
ISSN 1558-7916
EISSN 1558-7924
DOI 10.1109/TASL.2013.2255283

Abstract Syntax-based models can significantly improve the translation performance due to their grammatical modeling on one or both language side(s). However, the translation rules such as the non-lexical rule "VP→(x0 x1, VP:x1 PP:x0)" in string-to-tree models do not consider any lexicalized information on the source or target side. The rule is so generalized that any subtree rooted at VP can substitute for the nonterminal VP:x1. Because rules containing nonterminals are frequently used when generating the target-side tree structures, there is a risk that rules of this type will potentially be severely misused in decoding due to a lack of lexicalization guidance. In this article, inspired by lexicalized PCFG, which is widely used in monolingual parsing, we propose to upgrade the STSG (synchronous tree substitution grammars)-based syntax translation model with bilingually lexicalized STSG. Using the string-to-tree translation model as a case study, we present generative and discriminative models to integrate lexicalized STSG into the translation model. Both small- and large-scale experiments on Chinese-to-English translation demonstrate that the proposed lexicalized STSG can provide superior rule selection in decoding and substantially improve the translation quality.
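
The over-generalization described in the abstract can be illustrated with a small sketch. It is not the authors' generative or discriminative model: the `Subtree` class, the head-word annotation, the toy counts, and the add-k smoothed score are all assumptions made only for illustration. The sketch shows that the non-lexical rule VP→(x0 x1, VP:x1 PP:x0) accepts any fillers whose root labels match, while a score conditioned on head words can still tell a plausible filler pair from an implausible one.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtree:
    root: str  # syntactic label of the subtree root, e.g. "VP"
    head: str  # head word of the subtree (hypothetical annotation)


# Target side of the rule VP -> (x0 x1, VP:x1 PP:x0), in target order:
# the x1 slot must be filled by a VP, the x0 slot by a PP.
RULE_SLOTS = (("x1", "VP"), ("x0", "PP"))

# Invented counts of (VP head, PP head) pairs observed with this rule.
HEAD_COUNTS = Counter({("held", "with"): 8, ("held", "in"): 2})


def unlexicalized_match(fillers):
    """Non-lexical check: only the root label of each filler is tested."""
    return all(fillers[var].root == label for var, label in RULE_SLOTS)


def lexicalized_score(fillers, smoothing=0.1, num_pairs=1000):
    """Add-k smoothed relative frequency of the fillers' head-word pair.

    `num_pairs` is an assumed number of possible head pairs; it only
    keeps the score of unseen pairs above zero.
    """
    if not unlexicalized_match(fillers):
        return 0.0
    pair = tuple(fillers[var].head for var, _ in RULE_SLOTS)
    total = sum(HEAD_COUNTS.values())
    return (HEAD_COUNTS[pair] + smoothing) / (total + smoothing * num_pairs)


seen = {"x0": Subtree("PP", "with"), "x1": Subtree("VP", "held")}
unseen = {"x0": Subtree("PP", "of"), "x1": Subtree("VP", "sleep")}

# Both filler sets pass the non-lexical check ...
print(unlexicalized_match(seen), unlexicalized_match(unseen))
# ... but the head-word score prefers the pair seen in the toy counts.
print(lexicalized_score(seen) > lexicalized_score(unseen))
```

In a real decoder such a score would be one feature among many; here it only makes concrete why lexical information can guide rule selection.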
Copyright 2014 INIST-CNRS
Keywords Performance evaluation; Discriminant analysis; Bilingually lexicalized synchronous tree substitution grammars; syntax-based statistical machine translation; Syntactic analysis; Tree structure; Decoding; Formal grammar; Modeling; Case study; Guidance; English; Language processing; Information source; Automatic translation; Chinese; generative model; Syntax; discriminative model
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
CC BY 4.0
Subjects Adaptation models; Applied sciences; Bilingually lexicalized synchronous tree substitution grammars; Coding, codes; Decoding; discriminative model; Exact sciences and technology; generative model; Grammar; Information, signal and communications theory; Miscellaneous; Reliability; Signal and communications theory; Signal processing; Syntactics; syntax-based statistical machine translation; Telecommunications and information theory; Training; Vegetation
URI https://ieeexplore.ieee.org/document/6490019