Grabbing the Long Tail: A data normalization method for diverse and informative dialogue generation

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 460, pp. 374–384
Main Authors: Zhan, Zhiqiang; Zhao, Jianyu; Zhang, Yang; Gong, Jiangtao; Wang, Qianying; Shen, Qi; Zhang, Liuxin
Format: Journal Article
Language: English
Published: Elsevier B.V., 14.10.2021
ISSN: 0925-2312
EISSN: 1872-8286
DOI: 10.1016/j.neucom.2021.07.039

Abstract Recent neural models have shown significant progress in dialogue generation. Most of these models are language models, generating a response word by word conditioned on the previous context. Owing to this inherent mechanism, together with the widely used cross-entropy loss (which continuously pushes the distribution of generated text toward that of the training data), trained generation models inevitably favor the most frequent words in the training set, leading to low diversity and poor informativeness. By investigating several mainstream dialogue generation models, we find that the probable cause is the intrinsic long-tail phenomenon in linguistics. To address these issues, we explore and analyze a large corpus from Wikipedia and then propose an efficient frequency-based data normalization method, Log Normalization. Furthermore, we explore two additional methods, Mutual Normalization and Log-Mutual Normalization, to eliminate the effect of mutual information. To validate the proposed methods, we conduct extensive experiments on three datasets with different subjects: social media, film subtitles, and online customer service. Compared with vanilla Transformers, generation models augmented with our methods achieve significant improvements in both the diversity and the informativeness of generated responses. Specifically, unigram and bigram diversity improve by 8.5%–14.1% and 19.7%–25.8% on the three datasets, respectively, and informativeness (defined as the number of nouns and verbs) increases by 13.1%–31.0% and 30.4%–59.0%, respectively. Moreover, because our methods are model-agnostic, they can be adapted to new generation models efficiently and effectively.
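The abstract reports results in terms of distinct-n diversity and a frequency-based "Log Normalization". As a rough, hypothetical sketch only (the function names, the exact weighting formula, and the toy inputs are assumptions, not the authors' published implementation), the standard distinct-n diversity metric and a log-frequency token weight of the kind the abstract suggests could look like:

```python
import math
from collections import Counter

def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams over a set of tokenized
    responses; a standard proxy for the unigram (n=1) and bigram (n=2)
    diversity figures quoted in the abstract."""
    ngrams = [tuple(toks[i:i + n])
              for toks in responses
              for i in range(len(toks) - n + 1)]
    return len(set(ngrams)) / max(len(ngrams), 1)

def log_norm_weights(corpus_tokens):
    """Hypothetical frequency-based weights: down-weight frequent (head)
    words by the log of their corpus count, so rare (long-tail) words
    contribute relatively more, e.g. to a weighted training loss."""
    freq = Counter(corpus_tokens)
    return {w: 1.0 / (1.0 + math.log(c)) for w, c in freq.items()}

responses = [["i", "do", "not", "know"],
             ["i", "do", "not", "know"],
             ["the", "movie", "was", "great"]]
print(distinct_n(responses, 1))  # 8 unique of 12 total unigrams
print(distinct_n(responses, 2))  # 6 unique of 9 total bigrams
```

Repetitive generic replies ("i do not know") drive distinct-n down, which is exactly the failure mode the paper's normalization methods target.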
Authors:
– Zhiqiang Zhan (Smart Education Lab, Lenovo Research, Beijing, China)
– Jianyu Zhao (AI Lab, Lenovo Research, Beijing, China)
– Yang Zhang (Smart Education Lab, Lenovo Research, Beijing, China)
– Jiangtao Gong (Smart Education Lab, Lenovo Research, Beijing, China)
– Qianying Wang (Smart Education Lab, Lenovo Research, Beijing, China)
– Qi Shen (Beijing Union University, Beijing, China)
– Liuxin Zhang (Smart Education Lab, Lenovo Research, Beijing, China; email: zhanglx2@lenovo.com)
Copyright: 2021 Elsevier B.V.
Peer Reviewed: Yes
Scholarly: Yes
Keywords: Long Tail; Informativeness; Dialogue generation; Data normalization; Diversity
  article-title: Recommending the long tail items through personalized diversification
  publication-title: Knowledge-Based Systems
  doi: 10.1016/j.knosys.2018.11.004
– ident: 10.1016/j.neucom.2021.07.039_b0100
  doi: 10.18653/v1/D17-1235
– start-page: 5811
  year: 2020
  ident: 10.1016/j.neucom.2021.07.039_b0035
  article-title: Diverse and informative dialogue generation with context-specific commonsense knowledge awareness
– ident: 10.1016/j.neucom.2021.07.039_b0115
  doi: 10.18653/v1/2020.emnlp-main.147
– volume: 10
  start-page: 925
  issue: 5
  year: 2019
  ident: 10.1016/j.neucom.2021.07.039_b0120
  article-title: Hate speech detection: A solved problem? The challenging case of long tail on twitter
  publication-title: Semantic Web
  doi: 10.3233/SW-180338
– ident: 10.1016/j.neucom.2021.07.039_b0075
  doi: 10.1145/3343031.3350923
– ident: 10.1016/j.neucom.2021.07.039_b0140
– ident: 10.1016/j.neucom.2021.07.039_b0040
  doi: 10.18653/v1/2020.acl-main.6
– ident: 10.1016/j.neucom.2021.07.039_b0105
  doi: 10.18653/v1/D18-1428
– ident: 10.1016/j.neucom.2021.07.039_b0010
  doi: 10.18653/v1/P16-1154
– volume: 9
  start-page: 1735
  issue: 8
  year: 1997
  ident: 10.1016/j.neucom.2021.07.039_b0135
  article-title: Long short-term memory
  publication-title: Neural Computation
  doi: 10.1162/neco.1997.9.8.1735
– ident: 10.1016/j.neucom.2021.07.039_b0060
  doi: 10.1109/IJCNN.2019.8851960
– ident: 10.1016/j.neucom.2021.07.039_b0170
– ident: 10.1016/j.neucom.2021.07.039_b0025
  doi: 10.24963/ijcai.2018/606
– ident: 10.1016/j.neucom.2021.07.039_b0090
  doi: 10.18653/v1/2020.acl-main.54
– ident: 10.1016/j.neucom.2021.07.039_b0130
  doi: 10.1016/j.eswa.2019.112887
– ident: 10.1016/j.neucom.2021.07.039_b0020
  doi: 10.24963/ijcai.2018/643
– start-page: 6000
  year: 2017
  ident: 10.1016/j.neucom.2021.07.039_b0155
  article-title: Attention is all you need
– volume: 5
  start-page: 3
  issue: 1
  year: 2001
  ident: 10.1016/j.neucom.2021.07.039_b0185
  article-title: A mathematical theory of communication
  publication-title: Mobile Computing and Communications Review
  doi: 10.1145/584091.584093
– ident: 10.1016/j.neucom.2021.07.039_b0055
– ident: 10.1016/j.neucom.2021.07.039_b0145
– ident: 10.1016/j.neucom.2021.07.039_b0080
  doi: 10.18653/v1/2020.acl-main.131
SSID ssj0017129
Snippet Recent neural models have shown significant progress in dialogue generation. Among those models, most of them are based on language models, yielding the...
SourceID crossref
elsevier
SourceType Enrichment Source
Index Database
Publisher
StartPage 374
SubjectTerms Data normalization
Dialogue generation
Diversity
Informativeness
Long Tail
Title Grabbing the Long Tail: A data normalization method for diverse and informative dialogue generation
URI https://dx.doi.org/10.1016/j.neucom.2021.07.039
Volume 460
linkProvider Elsevier