Model training method, server, chip and system

Bibliographic Details
Main Authors: CHEN YUBIN, ZHENG KUN, CAI ZHIFANG
Format: Patent
Language: Chinese, English
Published: 23.08.2022
Summary: The invention provides a model training method, apparatus, and system. The method comprises: training a model using a first training resource; and, when the first training resource fails, continuing to train the model on a second training resource, starting from the model parameters the model had at the time of the failure. With this scheme, a training task can be recovered quickly after a hardware training resource fails, without losing training time.
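
The two steps in the summary can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the record does not say how the parameters at the time of failure are obtained, so the sketch simply assumes they are held outside the failing resource. The names `TrainingResource`, `ResourceFailure`, and `train`, and the toy loss, are all hypothetical.

```python
class ResourceFailure(Exception):
    """Raised when a (simulated) training resource stops working."""


class TrainingResource:
    """Hypothetical stand-in for an accelerator or worker node."""

    def __init__(self, name, fail_at_step=None):
        self.name = name
        self.fail_at_step = fail_at_step  # step at which this resource fails

    def step(self, w, step, lr=0.1):
        if self.fail_at_step is not None and step == self.fail_at_step:
            raise ResourceFailure(f"{self.name} failed at step {step}")
        # One gradient-descent step on the toy loss (w - 3)^2.
        grad = 2.0 * (w - 3.0)
        return w - lr * grad


def train(resources, total_steps, w=0.0):
    """Train for total_steps, moving to the next resource on failure."""
    # Assumption: parameters and step counter live outside any one resource,
    # so they survive that resource's failure.
    state = {"w": w, "step": 0}
    pending = list(resources)
    current = pending.pop(0)
    used = [current.name]
    while state["step"] < total_steps:
        try:
            state["w"] = current.step(state["w"], state["step"])
            state["step"] += 1
        except ResourceFailure:
            if not pending:
                raise  # no spare resource left
            # Continue from the parameters held at the time of failure.
            current = pending.pop(0)
            used.append(current.name)
    return state["w"], used


if __name__ == "__main__":
    first = TrainingResource("first", fail_at_step=5)
    second = TrainingResource("second")
    w, used = train([first, second], total_steps=20)
    print(used, w)
```

In this sketch the step that failed is retried on the second resource, so the interrupted run reaches the same parameters as an uninterrupted run of the same length; no completed step is repeated.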
Application Number: CN202111028409