PRETRAINING OF SPLIT LAYER PORTIONS FOR MULTILINGUAL MODEL
Main Authors | , , , |
---|---|
Format | Patent |
Language | English |
Published | 13.06.2024 |
Subjects | |
Summary: | A method, computer system, and a computer program product for training a machine learning model are provided. A machine learning model may be split into a lower portion and an upper portion. The lower portion includes at least one layer. The upper portion includes at least one layer. The lower portion may be pre-trained via a generator task and via alternating between inputting of monolingual text data and multilingual text data. The upper portion may be pre-trained via a discriminator task. The pre-trained lower portion may be joined to the pre-trained upper portion to form a trained multilingual machine learning model. |
---|---|
Bibliography: | Application Number: US202218063788 |
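
The flow described in the summary (split, pre-train each portion separately, join) can be sketched schematically as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the layer names, the split index, the strict one-to-one alternation schedule, and the `pretrain` placeholder (which only records what each layer would see instead of updating weights) are all hypothetical. The summary also does not say what data the upper portion is trained on, so that input is marked as unspecified.

```python
from itertools import cycle


def split_layers(layers, split_index):
    """Split an ordered layer list into a lower and an upper portion.

    Each portion must contain at least one layer, as the summary requires.
    """
    if not 0 < split_index < len(layers):
        raise ValueError("each portion must include at least one layer")
    return layers[:split_index], layers[split_index:]


def alternating_batches(monolingual, multilingual, steps):
    """Yield (source, batch) pairs, alternating between the two data sources.

    Assumption: strict 1:1 alternation, starting with monolingual data.
    """
    mono, multi = cycle(monolingual), cycle(multilingual)
    for step in range(steps):
        if step % 2 == 0:
            yield "monolingual", next(mono)
        else:
            yield "multilingual", next(multi)


def pretrain(portion, task, batches):
    """Hypothetical stand-in for a training loop.

    It records the task and the sequence of data sources per layer;
    a real implementation would compute a loss and update parameters.
    """
    sources = [source for source, _batch in batches]
    return [{"layer": name, "task": task, "seen": sources} for name in portion]


def join(lower, upper):
    """Stack the pre-trained lower portion under the pre-trained upper portion."""
    return lower + upper


# Hypothetical four-layer model, split in the middle.
layers = ["embed", "block_1", "block_2", "block_3"]
lower, upper = split_layers(layers, split_index=2)

lower_trained = pretrain(
    lower,
    "generator",
    alternating_batches(["en text"], ["en-de text"], steps=4),
)
# The data source for the discriminator task is not given in the summary.
upper_trained = pretrain(upper, "discriminator", [("unspecified", "batch")])

model = join(lower_trained, upper_trained)
for entry in model:
    print(entry["layer"], entry["task"], entry["seen"])
```

Keeping the split, the alternation schedule, and the join as separate functions mirrors the three steps the summary names, so each one could be replaced by a real implementation independently.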