MIXTURE-OF-EXPERTS MODEL IMPLEMENTATION METHOD AND SYSTEM, ELECTRONIC DEVICE, AND STORAGE MEDIUM

Bibliographic Details
Main Authors: SHEN, Liang; YU, Dianhai; WU, Zhihua; GONG, Weibao; WU, Huachao; WANG, Haifeng
Format: Patent
Language: English, French, German
Published: 06.12.2023
Summary: The present disclosure provides a mixture-of-experts (MoE) model implementation method and system, an electronic device, and a storage medium, and relates to fields of artificial intelligence (AI) such as deep learning and distributed storage. The method includes: constructing a communication group, the communication group including a tensor-parallelism communication group, the tensor-parallelism communication group including at least two computing devices, where tensor-parallelism segmentation is adopted for the sparse parameters of the computing devices in a same tensor-parallelism communication group; and training an MoE model based on the communication group. The solutions of the present disclosure ensure that model training can proceed normally.
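
To make the abstract's key idea concrete, the following is a minimal illustrative sketch, not the patent's actual system. It assumes PyTorch's torch.distributed as the communication backend, and the names build_tensor_parallel_groups, ShardedExpert, tp_degree, hidden, and ffn are hypothetical. The sketch partitions processes into tensor-parallelism communication groups of at least two devices each and splits one expert's (sparse) feed-forward parameters across the devices of its group.

```python
# Hypothetical sketch of the abstract's idea using torch.distributed;
# it is not the implementation described in the patent.
import torch
import torch.distributed as dist


def build_tensor_parallel_groups(world_size: int, tp_degree: int):
    """Partition ranks into consecutive tensor-parallelism groups of size tp_degree."""
    assert tp_degree >= 2, "a tensor-parallelism group needs at least two devices"
    groups = []
    my_group = None
    for start in range(0, world_size, tp_degree):
        ranks = list(range(start, start + tp_degree))
        # new_group must be called by every process, even those not in `ranks`.
        group = dist.new_group(ranks=ranks)
        groups.append(group)
        if dist.get_rank() in ranks:
            my_group = group
    return groups, my_group


class ShardedExpert(torch.nn.Module):
    """One MoE expert whose feed-forward weights are split across a tensor-parallel group."""

    def __init__(self, hidden: int, ffn: int, tp_group):
        super().__init__()
        tp_size = dist.get_world_size(group=tp_group)
        self.tp_group = tp_group
        # Each device holds only 1/tp_size of the expert's intermediate dimension,
        # i.e. the sparse parameters are segmented rather than replicated.
        self.w_in = torch.nn.Linear(hidden, ffn // tp_size, bias=False)
        self.w_out = torch.nn.Linear(ffn // tp_size, hidden, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Partial result from this device's shard of the expert.
        partial = self.w_out(torch.relu(self.w_in(x)))
        # Sum the partial outputs from all devices in the same tensor-parallel group.
        dist.all_reduce(partial, group=self.tp_group)
        return partial
```

In an actual launch, dist.init_process_group would be called on every process first; each rank then builds its experts with the group returned for its own ranks, so that within a tensor-parallelism communication group the expert parameters are segmented across the at-least-two devices rather than duplicated on each of them.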
Bibliography: Application Number EP20220865889