Syntax-Based Translation With Bilingually Lexicalized Synchronous Tree Substitution Grammars
Published in | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21, no. 8, pp. 1586-1597 |
---|---|
Main Authors | Jiajun Zhang, Feifei Zhai, Chengqing Zong |
Format | Journal Article |
Language | English |
Published | Piscataway, NJ: IEEE (Institute of Electrical and Electronics Engineers), 1 August 2013 |
ISSN | 1558-7916 1558-7924 |
DOI | 10.1109/TASL.2013.2255283 |
Abstract | Syntax-based models can significantly improve translation performance because they model grammar on one or both language sides. However, translation rules such as the non-lexical rule "VP→(x0 x1, VP:x1 PP:x0)" in string-to-tree models do not consider any lexicalized information on the source or target side. The rule is so general that any subtree rooted at VP can substitute for the nonterminal VP:x1. Because rules containing nonterminals are used frequently when generating target-side tree structures, rules of this type risk being severely misused in decoding for lack of lexical guidance. In this article, inspired by the lexicalized PCFG widely used in monolingual parsing, we propose to upgrade the STSG (synchronous tree substitution grammar)-based syntax translation model with a bilingually lexicalized STSG. Using the string-to-tree translation model as a case study, we present generative and discriminative models to integrate lexicalized STSG into the translation model. Both small- and large-scale experiments on Chinese-to-English translation demonstrate that the proposed lexicalized STSG provides superior rule selection in decoding and substantially improves translation quality. |
---|---|
Author | Jiajun Zhang; Feifei Zhai; Chengqing Zong |
BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=27572195 (record in Pascal Francis) |
CODEN | ITASD8 |
CitedBy_id | crossref_primary_10_1145_2699927 crossref_primary_10_3390_math10060914 |
Cites_doi | 10.3115/1699571.1699607 10.3115/1626431.1626459 10.1145/2025384.2025386 10.3115/1075178.1075217 10.3115/1219840.1219899 10.3115/1220175.1220230 10.3115/1073012.1073079 10.3115/1620754.1620786 10.3115/1220175.1220241 10.3115/1690219.1690225 10.3115/1610075.1610083 10.1162/089120103322753356 10.3115/1220175.1220252 10.3115/1219840.1219874 10.3115/1220175.1220296 10.3115/1631828.1631829 10.1162/coli.2007.33.2.201 10.3115/1220835.1220868 10.3115/1626355.1626361 10.3115/1273073.1273195 10.3115/1613715.1613745 |
ContentType | Journal Article |
Copyright | 2014 INIST-CNRS |
Copyright_xml | – notice: 2014 INIST-CNRS |
DBID | 97E RIA RIE AAYXX CITATION IQODW |
DatabaseName | IEEE Xplore (IEEE) IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Pascal-Francis |
DatabaseTitle | CrossRef |
Discipline | Engineering Applied Sciences |
EISSN | 1558-7924 |
EndPage | 1597 |
ExternalDocumentID | 27572195 10_1109_TASL_2013_2255283 6490019 |
Genre | orig-research |
ISSN | 1558-7916 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 8 |
Keywords | Performance evaluation; Discriminant analysis; Bilingually lexicalized synchronous tree substitution grammars; syntax-based statistical machine translation; Syntactic analysis; Tree structure; Decoding; Formal grammar; Modeling; Case study; Guidance; English; Language processing; Information source; Automatic translation; Chinese; generative model; Syntax; discriminative model |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html CC BY 4.0 |
PageCount | 12 |
PublicationCentury | 2000 |
PublicationDate | 2013-08-01 |
PublicationDateYYYYMMDD | 2013-08-01 |
PublicationDate_xml | – month: 08 year: 2013 text: 2013-08-01 day: 01 |
PublicationDecade | 2010 |
PublicationPlace | Piscataway, NJ |
PublicationPlace_xml | – name: Piscataway, NJ |
PublicationTitle | IEEE transactions on audio, speech, and language processing |
PublicationTitleAbbrev | TASL |
PublicationYear | 2013 |
Publisher | IEEE Institute of Electrical and Electronics Engineers |
Publisher_xml | – name: IEEE – name: Institute of Electrical and Electronics Engineers |
StartPage | 1586 |
SubjectTerms | Adaptation models; Applied sciences; Bilingually lexicalized synchronous tree substitution grammars; Coding, codes; Decoding; discriminative model; Exact sciences and technology; generative model; Grammar; Information, signal and communications theory; Miscellaneous; Reliability; Signal and communications theory; Signal processing; Syntactics; syntax-based statistical machine translation; Telecommunications and information theory; Training; Vegetation |
Title | Syntax-Based Translation With Bilingually Lexicalized Synchronous Tree Substitution Grammars |
URI | https://ieeexplore.ieee.org/document/6490019 |
Volume | 21 |
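
The abstract's point about the non-lexical rule "VP→(x0 x1, VP:x1 PP:x0)" can be illustrated with a small sketch. This is not the authors' model: the `Subtree` class, the head-word annotation, the example phrases, and the numbers in `HEAD_SCORE` are all hypothetical, chosen only to show that a rule checking root labels alone accepts any VP/PP pair, while a score keyed on head words can tell a plausible pairing from an implausible one.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Subtree:
    """A target-side subtree, reduced to what the sketch needs (hypothetical)."""
    label: str               # root nonterminal, e.g. "VP" or "PP"
    head: str                # head word of the subtree (assumed annotation)
    words: Tuple[str, ...]   # target-side yield


def apply_nonlexical(x0: Subtree, x1: Subtree) -> Optional[Subtree]:
    """Apply VP -> (x0 x1, VP:x1 PP:x0): only the root labels are checked."""
    if x0.label == "PP" and x1.label == "VP":
        # Target side places the VP (x1) before the PP (x0).
        return Subtree("VP", x1.head, x1.words + x0.words)
    return None


# Made-up head-word compatibility scores, for illustration only.
HEAD_SCORE = {("held", "with"): 0.9, ("held", "of"): 0.05}


def lexicalized_score(x0: Subtree, x1: Subtree, default: float = 0.01) -> float:
    """Score the same rule application using the (VP head, PP head) pair."""
    return HEAD_SCORE.get((x1.head, x0.head), default)


vp = Subtree("VP", "held", ("held", "a", "meeting"))
pp_good = Subtree("PP", "with", ("with", "Sharon"))
pp_bad = Subtree("PP", "of", ("of", "the", "committee"))

# The non-lexical rule accepts both PPs; only the lexicalized score separates them.
for pp in (pp_good, pp_bad):
    result = apply_nonlexical(pp, vp)
    print(" ".join(result.words), lexicalized_score(pp, vp))
```

In this toy setting both applications of the rule succeed, which is the over-generalization the abstract describes; the head-word score is one simple stand-in for the lexical guidance the article proposes to add.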