MIXTURE-OF-EXPERTS MODEL IMPLEMENTATION METHOD AND SYSTEM, ELECTRONIC DEVICE, AND STORAGE MEDIUM

Bibliographic Details
Main Authors: SHEN, Liang; YU, Dianhai; WU, Zhihua; GONG, Weibao; WU, Huachao; WANG, Haifeng
Format: Patent
Language: English, French, German
Published: 06.12.2023
Summary: The present disclosure provides a mixture-of-experts (MoE) model implementation method and system, an electronic device, and a storage medium, and relates to fields of artificial intelligence (AI) such as deep learning and distributed storage. The method includes: constructing a communication group, the communication group including a tensor-parallelism communication group, the tensor-parallelism communication group including at least two computing devices, where tensor-parallelism segmentation is adopted for the sparse parameters of the computing devices in a same tensor-parallelism communication group; and training an MoE model based on the communication group. The solutions of the present disclosure ensure that model training can proceed normally.
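
To make the abstract's key idea concrete, the following is a minimal illustrative sketch, not the patent's actual system. It assumes PyTorch's torch.distributed as the communication backend, and the names build_tensor_parallel_groups, ShardedExpert, tp_degree, hidden, and ffn are hypothetical. The sketch partitions processes into tensor-parallelism communication groups of at least two devices each and splits one expert's (sparse) feed-forward parameters across the devices of its group.

```python
# Hypothetical sketch of the abstract's idea using torch.distributed;
# it is not the implementation described in the patent.
import torch
import torch.distributed as dist


def build_tensor_parallel_groups(world_size: int, tp_degree: int):
    """Partition ranks into consecutive tensor-parallelism groups of size tp_degree."""
    assert tp_degree >= 2, "a tensor-parallelism group needs at least two devices"
    groups = []
    my_group = None
    for start in range(0, world_size, tp_degree):
        ranks = list(range(start, start + tp_degree))
        # new_group must be called by every process, even those not in `ranks`.
        group = dist.new_group(ranks=ranks)
        groups.append(group)
        if dist.get_rank() in ranks:
            my_group = group
    return groups, my_group


class ShardedExpert(torch.nn.Module):
    """One MoE expert whose feed-forward weights are split across a tensor-parallel group."""

    def __init__(self, hidden: int, ffn: int, tp_group):
        super().__init__()
        tp_size = dist.get_world_size(group=tp_group)
        self.tp_group = tp_group
        # Each device holds only 1/tp_size of the expert's intermediate dimension,
        # i.e. the sparse parameters are segmented rather than replicated.
        self.w_in = torch.nn.Linear(hidden, ffn // tp_size, bias=False)
        self.w_out = torch.nn.Linear(ffn // tp_size, hidden, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Partial result from this device's shard of the expert.
        partial = self.w_out(torch.relu(self.w_in(x)))
        # Sum the partial outputs from all devices in the same tensor-parallel group.
        dist.all_reduce(partial, group=self.tp_group)
        return partial
```

In an actual launch, dist.init_process_group would be called on every process first; each rank then builds its experts with the group returned for its own ranks, so that within a tensor-parallelism communication group the expert parameters are segmented across the at-least-two devices rather than duplicated on each of them.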
Bibliography: Application Number EP20220865889