Non‐Autoregressive Translation Algorithm Based on LLM Knowledge Distillation in English Corpus
Published in: Engineering Reports (Hoboken, N.J.), Vol. 7, No. 1
Format: Journal Article
Language: English
Published: Hoboken, USA: John Wiley & Sons, Inc. (Wiley), 01.01.2025
Summary:

ABSTRACT: Although large-scale language models have achieved significant advancements in machine translation quality, their high computational costs and resource consumption have hindered widespread adoption in practical applications. This research therefore introduces an English corpus-based machine translation algorithm that leverages knowledge distillation from large language models, with the goal of enhancing translation quality while reducing the model's computational demands. Initially, we conducted a thorough analysis of the English corpus to identify prevalent language patterns and structures. We then developed a knowledge distillation approach that transfers the translation expertise of a large teacher model to a smaller student model, achieving higher translation accuracy and efficiency. We also designed a dynamic temperature hyperparameter distillation strategy that effectively enhances translation precision. In the experimental phase, we trained and evaluated our algorithm on several standard English corpora. The findings indicate that, compared with current machine translation systems, our method significantly reduces computational resource requirements while preserving translation quality.
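The abstract describes teacher-student knowledge distillation with a dynamically adjusted temperature hyperparameter, but gives no implementation details. The following is a minimal PyTorch sketch of that general idea, assuming a linear temperature annealing schedule and the standard soft-label KL distillation loss; the function names, the schedule, and the `t_start`/`t_end` values are illustrative assumptions, not the paper's actual method.

```python
import torch
import torch.nn.functional as F


def dynamic_temperature(step: int, total_steps: int,
                        t_start: float = 4.0, t_end: float = 1.0) -> float:
    # Linearly anneal the distillation temperature from t_start down to t_end.
    # The linear schedule and the 4.0 -> 1.0 range are assumptions; the paper
    # only states that the temperature hyperparameter is adjusted dynamically.
    frac = min(step / max(total_steps, 1), 1.0)
    return t_start + (t_end - t_start) * frac


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float) -> torch.Tensor:
    # Standard soft-label distillation: KL divergence between the teacher's
    # and the student's temperature-softened output distributions.
    # Both logit tensors have shape (batch, seq_len, vocab_size).
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # Scaling by t**2 keeps gradient magnitudes comparable across temperatures
    # (Hinton et al., 2015).
    kl = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    return kl * (t * t)
```

In a full training loop, this loss would typically be combined with the cross-entropy loss on the (possibly teacher-distilled) reference translations, with the temperature recomputed each step via `dynamic_temperature`.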
Funding: This work was supported by "Study on Project-based Learning Model with Micro-lecture in College English Teaching" from the Education Department of Shanxi Province (J20221032).
ISSN: 2577-8196
DOI: 10.1002/eng2.13077