FVLLMONTI: The 3D Neural Network Compute Cube (N^{2}C^{2}) Concept for Efficient Transformer Architectures Towards Speech-to-Speech Translation

Bibliographic Details
Published in: 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1 - 6
Main Authors: O'Connor, Ian, Mannaa, Sara, Bosio, Alberto, Deveautour, Bastien, Deleruyelle, Damien, Obukhova, Tetiana, Marchand, Cedric, Trommer, Jens, Cakirlar, Cigdem, Wesling, Bruno Neckel, Mikolajick, Thomas, Baumgartner, Oskar, Thesberg, Mischa, Pirker, David, Lenz, Christoph, Stanojevic, Zlatan, Karner, Markus, Larrieu, Guilhem, Pelloquin, Sylvain, Moustakas, Konstantinous, Muller, Jonas, Ansaloni, Giovanni, Amirshahi, Alireza, Atienza, David, Rouas, Jean-Luc, Letaifa, Leila Ben, Bordeall, Georgeta, Brazier, Charles, Mukherjee, Chhandak, Deng, Marina, Wang, Yifan, Francois, Marc, Rezgui, Houssem, Lucas, Reveil, Maneux, Cristell
Format: Conference Proceeding
Language: English
Published: EDAA, 25.03.2024
Summary: This multi-partner-project contribution introduces the midway results of the Horizon 2020 FVLLMONTI project. In this project we develop a new and ultra-efficient class of ANN accelerators, the neural network compute cube (N^{2}C^{2}), which is specifically designed to execute complex machine learning tasks in a 3D technology, in order to provide the high computing power and ultra-high efficiency needed for future edge-AI applications. We showcase its effectiveness by targeting the challenging class of Transformer ANNs, tailored for Automatic Speech Recognition and Machine Translation, the two fundamental components of speech-to-speech translation. To gain the full benefit of the accelerator design, we develop disruptive vertical transistor technologies and execute design-technology co-optimization (DTCO) loops from the single-device level up to the cell and compute-cube levels. Further, a hardware-software co-optimization is executed, e.g. by compressing the speech recognition and translation models for energy-efficient execution without substantial loss in precision.
ISSN:1558-1101
DOI:10.23919/DATE58400.2024.10546700