Parallel orthogonal deep neural network

Ensemble learning methods combine multiple models to improve performance by exploiting their diversity. The success of these approaches relies heavily on the dissimilarity of the base models forming the ensemble. This diversity can be achieved in many ways, with well-known examples including bagging...

Full description

Saved in:
Bibliographic Details
Published inNeural networks Vol. 140; pp. 167 - 183
Main Authors Mashhadi, Peyman Sheikholharam, Nowaczyk, Sławomir, Pashami, Sepideh
Format Journal Article
LanguageEnglish
Published United States Elsevier Ltd 01.08.2021
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Ensemble learning methods combine multiple models to improve performance by exploiting their diversity. The success of these approaches relies heavily on the dissimilarity of the base models forming the ensemble. This diversity can be achieved in many ways, with well-known examples including bagging and boosting. It is the diversity of the models within an ensemble that allows the ensemble to correct the errors made by its members, and consequently leads to higher classification or regression performance. A mistake made by a base model can only be rectified if other members behave differently on that particular instance, and provide the aggregator with enough information to make an informed decision. On the contrary, lack of diversity not only lowers model performance, but also wastes computational resources. Nevertheless, in the current state of the art ensemble approaches, there is no guarantee on the level of diversity achieved, and no mechanism ensuring that each member will learn a different decision boundary from the others. In this paper, we propose a parallel orthogonal deep learning architecture in which diversity is enforced by design, through imposing an orthogonality constraint. Multiple deep neural networks are created, parallel to each other. At each parallel layer, the outputs of different base models are subject to Gram–Schmidt orthogonalization. We demonstrate that this approach leads to a high level of diversity from two perspectives. First, the models make different errors on different parts of feature space, and second, they exhibit different levels of uncertainty in their decisions. Experimental results confirm the benefits of the proposed method, compared to standard deep learning models and well-known ensemble methods, in terms of diversity and, as a result, classification performance. •We propose a novel deep learning ensemble method with the enforced diversity mechanism.•We introduce a new approach for analyzing ensemble diversity, taking into account model confidence.•We perform experimental evaluation and show improved classification performance.•We demonstrate an efficient way of building deep learning ensembles with end-to-end architecture.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:0893-6080
1879-2782
1879-2782
DOI:10.1016/j.neunet.2021.03.002