LIPSFUS: A neuromorphic dataset for audio-visual sensory fusion of lip reading
Format | Journal Article
Language | English
Published | 28.03.2023
Summary: This paper presents a sensory fusion neuromorphic dataset collected with precise temporal synchronization using a set of Address-Event-Representation sensors and tools. The target application is the lip reading of several keywords for different machine learning applications, such as digits, robotic commands, and auxiliary rich phonetic short words. The dataset is enlarged with a spiking version of an audio-visual lip reading dataset collected with frame-based cameras. LIPSFUS is publicly available and has been validated with a deep learning architecture for audio and visual classification. It is intended for sensory fusion architectures based on both artificial and spiking neural network algorithms.
DOI: 10.48550/arxiv.2304.01080