PRETRAINING OF SPLIT LAYER PORTIONS FOR MULTILINGUAL MODEL


Bibliographic Details
Main Authors: Kunc, Ladislav; Pan, Lin; Qi, Haode; Potdar, Saloni
Format: Patent
Language: English
Published: 13.06.2024

Summary: A method, a computer system, and a computer program product for training a machine learning model are provided. A machine learning model may be split into a lower portion and an upper portion, each of which includes at least one layer. The lower portion may be pre-trained via a generator task, with the input alternating between monolingual text data and multilingual text data. The upper portion may be pre-trained via a discriminator task. The pre-trained lower portion may then be joined to the pre-trained upper portion to form a trained multilingual machine learning model.
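
The flow in the summary (split, pre-train each portion separately, join) can be sketched in plain Python. This is only an illustrative outline, not the patented implementation: the layer names, the split index, the function names, and the choice to start the alternation with monolingual data are all assumptions, and the `pretrain` function is a placeholder that records the task instead of updating weights.

```python
from itertools import cycle

def split_layers(layers, split_index):
    """Split a layer stack into (lower, upper); each keeps at least one layer."""
    if not 0 < split_index < len(layers):
        raise ValueError("each portion needs at least one layer")
    return layers[:split_index], layers[split_index:]

def alternating_batches(monolingual, multilingual, steps):
    """Yield (source, batch) pairs, switching source on every step.

    Starting with monolingual data is an assumption; the summary only
    says that the two kinds of input alternate.
    """
    mono, multi = cycle(monolingual), cycle(multilingual)
    for step in range(steps):
        if step % 2 == 0:
            yield "monolingual", next(mono)
        else:
            yield "multilingual", next(multi)

def pretrain(portion, task, batches=()):
    """Placeholder training loop: tags each layer with its pre-training task."""
    seen_sources = [source for source, _ in batches]
    trained = [{"layer": name, "task": task} for name in portion]
    return trained, seen_sources

def join(lower, upper):
    """Stack the pre-trained lower portion under the pre-trained upper portion."""
    return lower + upper

# Hypothetical four-layer model, split after the second layer.
layers = ["embed", "enc_1", "enc_2", "enc_3"]
lower, upper = split_layers(layers, 2)

schedule = list(alternating_batches(["en text"], ["en/de/fr text"], steps=4))
lower_trained, lower_sources = pretrain(lower, "generator", schedule)
# The summary does not say what data the discriminator task uses,
# so no batches are modeled for the upper portion here.
upper_trained, _ = pretrain(upper, "discriminator")

model = join(lower_trained, upper_trained)
print([entry["task"] for entry in model])
```

In a real system each portion would be a stack of neural network layers and `pretrain` would run an optimizer over a task-specific loss; the sketch only shows the order of operations and the alternating input schedule.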
Bibliography: Application Number: US202218063788