Towards Memory-Efficient Training for Extremely Large Output Spaces – Learning with 670k Labels on a Single Commodity GPU
In classification problems with large output spaces (up to millions of labels), the last layer can require an enormous amount of memory. Using sparse connectivity would drastically reduce the memory requirements, but as we show below, applied naïvely it can result in much diminished predictive perfo...
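To make the scale concrete, here is a rough back-of-the-envelope sketch of last-layer memory for a 670k-label problem. The hidden width and per-label fan-in below are illustrative assumptions, not figures from the paper:

```python
# Back-of-the-envelope memory for a dense vs. sparse last layer.
# Hidden width and per-label fan-in are assumed values for illustration,
# not taken from the paper.

def layer_bytes(num_labels: int, weights_per_label: int, bytes_per_weight: int = 4) -> int:
    """Bytes needed for the label-weight matrix (fp32 by default)."""
    return num_labels * weights_per_label * bytes_per_weight

NUM_LABELS = 670_000
HIDDEN = 768  # assumed embedding width

dense = layer_bytes(NUM_LABELS, HIDDEN)  # fully connected: one weight per (label, feature)
sparse = layer_bytes(NUM_LABELS, 32)     # assumed sparse layer with 32 connections per label

print(f"dense : {dense / 2**30:.2f} GiB")   # ~1.92 GiB for weights alone
print(f"sparse: {sparse / 2**20:.2f} MiB")  # ~82 MiB
```

Optimizer state (e.g. two extra fp32 buffers per weight for Adam) multiplies the dense figure several times over, which is why the full layer does not fit comfortably on a commodity GPU.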
| Published in | Machine Learning and Knowledge Discovery in Databases: Research Track, pp. 689–704 |
|---|---|
| Main Authors | , |
| Format | Book Chapter |
| Language | English |
| Published | Cham: Springer Nature Switzerland, 2023 |
| Series | Lecture Notes in Computer Science |