A Neural Architecture Search for Automated Multimodal Learning
Published in | Expert systems with applications Vol. 207; p. 118051 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 30.11.2022 |
Subjects | |
Summary: | The boom in artificial intelligence over the past decade owes much to research and development in deep learning and, moreover, to making deep learning accessible. But the goal of Artificial General Intelligence (AGI) cannot be achieved with application-specific, parameter-sensitive neural networks that must be defined and tuned for every use case. General intelligence also involves understanding different types of data rather than relying on a dedicated model for each functionality. Automating machine learning while also generalizing over multiple modalities therefore has great potential to move AGI research forward.
We propose a generalizable algorithm, Multimodal Neural Architecture Search (MNAS), which works on multiple modalities and performs architecture search to create neural networks for multiclass classification over multiple types of data. The work automates the development of a fusion architecture by building on the existing literature on multimodal learning and neural architecture search. The controller network, which predicts the architecture, is designed to work on a reward model in which the reward depends on the accuracies of the individual networks corresponding to each modality. In experiments on a multiclass classification problem with image and text modalities, the approach achieves accuracy comparable to both unimodal classification on the same data and manually designed multimodal architectures. It also uses a shared-parameter search graph, which keeps its computational complexity lower than that of several other neural architecture search algorithms. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2022.118051 |
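
The summary says the controller's reward depends on the accuracies of the individual per-modality networks, but it does not give the formula or the training rule. The sketch below is only an illustration under stated assumptions: the reward is taken to be the unweighted mean of the per-modality accuracies, and the controller is a toy softmax policy updated with a REINFORCE-style rule and a moving-average baseline. The names `combined_reward` and `ToyController`, and the parameters `lr` and `decay`, are hypothetical and do not come from the article.

```python
import math
import random


def combined_reward(modality_accuracies):
    """Hypothetical reward: unweighted mean of per-modality accuracies.

    The article only states that the reward depends on these accuracies;
    the mean is an assumption made for illustration.
    """
    if not modality_accuracies:
        raise ValueError("need at least one modality accuracy")
    return sum(modality_accuracies.values()) / len(modality_accuracies)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyController:
    """Toy policy: one independent softmax per architecture decision slot."""

    def __init__(self, num_slots, num_choices, lr=0.1, seed=0):
        self.logits = [[0.0] * num_choices for _ in range(num_slots)]
        self.lr = lr
        self.baseline = 0.0  # moving average of past rewards
        self.rng = random.Random(seed)

    def sample(self):
        """Sample one choice index per slot from the current policy."""
        return [
            self.rng.choices(range(len(row)), weights=softmax(row))[0]
            for row in self.logits
        ]

    def update(self, choices, reward, decay=0.9):
        """REINFORCE-style step: raise the log-probability of the sampled
        choices in proportion to (reward - baseline)."""
        advantage = reward - self.baseline
        for row, chosen in zip(self.logits, choices):
            probs = softmax(row)
            for k in range(len(row)):
                # Gradient of log softmax w.r.t. logit k for the chosen index.
                grad = (1.0 if k == chosen else 0.0) - probs[k]
                row[k] += self.lr * advantage * grad
        self.baseline = decay * self.baseline + (1 - decay) * reward


# Usage: accuracies here are placeholder numbers, not results from the article.
controller = ToyController(num_slots=3, num_choices=4)
architecture = controller.sample()
reward = combined_reward({"image": 0.8, "text": 0.6})
controller.update(architecture, reward)
```

In a real search loop, the sampled choices would be decoded into a fusion architecture, trained or evaluated to obtain the per-modality accuracies, and the resulting reward fed back to the controller as above.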