Grabbing the Long Tail: A data normalization method for diverse and informative dialogue generation
Published in | Neurocomputing (Amsterdam), Vol. 460, pp. 374–384
Main Authors | Zhiqiang Zhan, Jianyu Zhao, Yang Zhang, Jiangtao Gong, Qianying Wang, Qi Shen, Liuxin Zhang
Format | Journal Article
Language | English
Published | Elsevier B.V., 14.10.2021
ISSN | 0925-2312
DOI | 10.1016/j.neucom.2021.07.039 |
Abstract | Recent neural models have shown significant progress in dialogue generation. Most of these models are language models, generating a response word by word conditioned on the previous context. Due to this inherent mechanism, and to the widely used cross-entropy loss (which continually pushes the distribution of generations toward that of the training data), trained generation models inevitably favor the most frequent words in the training set, leading to low diversity and poor informativeness. By investigating several mainstream dialogue generation models, we find the probable cause to be the intrinsic long-tail phenomenon of language. To address these issues, we explore and analyze a large Wikipedia corpus and propose an efficient frequency-based data normalization method, Log Normalization. We further explore two variants, Mutual Normalization and Log-Mutual Normalization, to eliminate the effect of mutual information. To validate the proposed methods, we conduct extensive experiments on three datasets with different subjects: social media, film subtitles, and online customer service. Compared with vanilla Transformers, generation models augmented with our methods achieve significant improvements in both the diversity and the informativeness of generated responses. Specifically, unigram and bigram diversity improve by 8.5%–14.1% and 19.7%–25.8% on the three datasets, respectively, and informativeness (defined as counts of nouns and verbs) increases by 13.1%–31.0% and 30.4%–59.0%, respectively. Moreover, being model-agnostic, our methods can be adapted to new generation models efficiently and effectively.
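The abstract names Log Normalization, a frequency-based data normalization, as the core method but gives no formula. As a purely illustrative sketch (the weighting form, the smoothing constant, and the name `log_norm_weights` are assumptions, not the paper's definition), a frequency-based reweighting of this kind down-weights head tokens by the logarithm of their corpus count, so long-tail tokens carry relatively more influence:

```python
import math
from collections import Counter

def log_norm_weights(corpus_tokens, smooth=2.0):
    """Hypothetical frequency-based weighting: frequent (head) tokens
    get small weights, rare (long-tail) tokens get larger ones.
    An illustrative stand-in, not the paper's exact formula."""
    counts = Counter(corpus_tokens)
    return {tok: 1.0 / math.log(c + smooth) for tok, c in counts.items()}

# Toy corpus: "the" dominates; "normalization" sits in the long tail.
corpus = ["the"] * 100 + ["model"] * 10 + ["normalization"]
w = log_norm_weights(corpus)
# Rarer tokens receive strictly larger weights than frequent ones.
assert w["normalization"] > w["model"] > w["the"]
```

Weights of this shape could, for example, scale the per-token cross-entropy loss during training, counteracting the pull toward high-frequency words that the abstract describes.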
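The reported metrics — unigram/bigram diversity and informativeness as counts of nouns and verbs — can be sketched as follows. Distinct-n (unique n-grams divided by total n-grams) is the standard reading of "unigram and bigram diversity"; the noun/verb counter assumes tokens arrive already POS-tagged with Penn-Treebank-style tags, since the record does not say which tagger the paper uses:

```python
def distinct_n(responses, n):
    """Distinct-n: unique n-grams / total n-grams over all responses.
    Higher values indicate more diverse generations."""
    ngrams = [
        tuple(toks[i:i + n])
        for toks in responses
        for i in range(len(toks) - n + 1)
    ]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

def informativeness(tagged_responses):
    """Count nouns and verbs in POS-tagged responses
    (Penn-Treebank-style tags assumed: NN*, VB*)."""
    nouns = sum(1 for toks in tagged_responses
                for _, tag in toks if tag.startswith("NN"))
    verbs = sum(1 for toks in tagged_responses
                for _, tag in toks if tag.startswith("VB"))
    return nouns, verbs

# A repeated generic reply drags diversity down; a contentful one lifts it.
replies = [["i", "do", "not", "know"], ["i", "do", "not", "know"],
           ["the", "film", "opens", "friday"]]
d1 = distinct_n(replies, 1)  # 8 unique unigrams / 12 total
d2 = distinct_n(replies, 2)  # 6 unique bigrams / 9 total
```

With these definitions, the improvements the abstract reports correspond to higher distinct-1/distinct-2 ratios and larger noun/verb counts in the generated responses.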
Authors | Zhiqiang Zhan (Smart Education Lab, Lenovo Research, Beijing, China); Jianyu Zhao (AI Lab, Lenovo Research, Beijing, China); Yang Zhang (Smart Education Lab, Lenovo Research, Beijing, China); Jiangtao Gong (Smart Education Lab, Lenovo Research, Beijing, China); Qianying Wang (Smart Education Lab, Lenovo Research, Beijing, China); Qi Shen (Beijing Union University, Beijing, China); Liuxin Zhang (Smart Education Lab, Lenovo Research, Beijing, China; zhanglx2@lenovo.com)
Copyright | 2021 Elsevier B.V. |
Discipline | Computer Science |
EISSN | 1872-8286 |
IsPeerReviewed | true |
IsScholarly | true |
Keywords | Long Tail, Informativeness, Dialogue generation, Data normalization, Diversity
PageCount | 11 |
Snippet | Recent neural models have shown significant progress in dialogue generation. Among those models, most of them are based on language models, yielding the... |
SourceID | crossref elsevier |
SourceType | Enrichment Source; Index Database; Publisher |
StartPage | 374 |
SubjectTerms | Data normalization; Dialogue generation; Diversity; Informativeness; Long Tail |
Title | Grabbing the Long Tail: A data normalization method for diverse and informative dialogue generation |
URI | https://dx.doi.org/10.1016/j.neucom.2021.07.039 |
Volume | 460 |
linkProvider | Elsevier |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Grabbing+the+Long+Tail%3A+A+data+normalization+method+for+diverse+and+informative+dialogue+generation&rft.jtitle=Neurocomputing+%28Amsterdam%29&rft.au=Zhan%2C+Zhiqiang&rft.au=Zhao%2C+Jianyu&rft.au=Zhang%2C+Yang&rft.au=Gong%2C+Jiangtao&rft.date=2021-10-14&rft.pub=Elsevier+B.V&rft.issn=0925-2312&rft.eissn=1872-8286&rft.volume=460&rft.spage=374&rft.epage=384&rft_id=info:doi/10.1016%2Fj.neucom.2021.07.039&rft.externalDocID=S0925231221010833 |