Towards Memory-Efficient Training for Extremely Large Output Spaces – Learning with 670k Labels on a Single Commodity GPU

In classification problems with large output spaces (up to millions of labels), the last layer can require an enormous amount of memory. Using sparse connectivity would drastically reduce the memory requirements, but as we show below, applied naïvely it can result in much diminished predictive performance. […]
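As a back-of-envelope illustration of why the last layer dominates memory at this scale, the sketch below compares parameter counts for a dense versus a sparsely connected output layer. The hidden width and per-label fan-in are assumed values for illustration, not figures taken from the chapter:

```python
# Illustrative memory comparison for an extreme-classification output layer.
# hidden_dim and fan_in are assumptions; num_labels comes from the title.
hidden_dim = 512          # assumed penultimate-layer width
num_labels = 670_000      # output-space size from the title
bytes_per_float = 4       # fp32 weights

# Dense last layer: every label connects to every hidden unit.
dense_params = hidden_dim * num_labels
dense_gib = dense_params * bytes_per_float / 2**30

# Sparse last layer: each label keeps only a fixed number of weights.
fan_in = 32               # assumed nonzero weights per label
sparse_params = fan_in * num_labels
sparse_gib = sparse_params * bytes_per_float / 2**30

print(f"dense  last layer: {dense_params:,} params ≈ {dense_gib:.2f} GiB")
print(f"sparse last layer: {sparse_params:,} params ≈ {sparse_gib:.2f} GiB")
```

Under these assumptions the sparse layer stores 16× fewer weights, which is the kind of saving that makes training with 670k labels feasible on a single commodity GPU; the chapter's contribution is showing how to do this without the naïve approach's loss in predictive performance.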

Bibliographic Details
Published in: Machine Learning and Knowledge Discovery in Databases: Research Track, pp. 689–704
Main Authors: Schultheis, Erik; Babbar, Rohit
Format: Book Chapter
Language: English
Published: Cham: Springer Nature Switzerland, 2023
Series: Lecture Notes in Computer Science