Knowledge Transferred Fine-tuning: Convolutional Neural Network is Born Again with Anti-Aliasing even in Data-limited Situations


Bibliographic Details
Published in: IEEE Access, Vol. 10, p. 1
Main Authors: Suzuki, Satoshi; Takeda, Shoichiro; Makishima, Naoki; Ando, Atsushi; Masumura, Ryo; Shouno, Hayaru
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2022

Summary: Anti-aliased convolutional neural networks (CNNs) are models that introduce blur filters into the intermediate representations of CNNs to achieve high accuracy in image recognition tasks. A promising way to prepare a new anti-aliased CNN is to introduce blur filters into the intermediate representations of a pre-trained (non-anti-aliased) CNN, since many researchers have released such models online. Although this scheme makes it easy to build a new anti-aliased CNN, the blur filters drastically degrade the pre-trained representations. Therefore, to take full advantage of the blur filters, fine-tuning on massive amounts of training data is often required. This can be problematic because training data is often limited. In such a "data-limited" situation, fine-tuning does not yield high performance because it induces overfitting to the limited training data. To tackle this problem, we propose "knowledge transferred fine-tuning." Knowledge transfer is a technique that utilizes the representations of a pre-trained model to help ensure generalization in data-limited situations. Inspired by this concept, we transfer knowledge from the intermediate representations of a pre-trained CNN to the anti-aliased CNN during fine-tuning. The key idea of our method is to transfer only the knowledge essential for image recognition from the pre-trained CNN using two types of loss: a pixel-level loss and a global-level loss. The former transfers detailed knowledge from the pre-trained CNN, but this knowledge may contain "aliased," non-essential knowledge. The latter, on the other hand, is designed to increase when the pixel-level loss transfers non-essential knowledge while ignoring the essential knowledge, i.e., it penalizes the pixel-level loss. Experimental results demonstrate that the proposed method, using just 25 training images per class on ImageNet 2012, achieves higher accuracy than a conventional pre-trained CNN.
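To illustrate the anti-aliasing idea the summary refers to, the following is a minimal, self-contained sketch of blurred-then-strided downsampling versus plain strided subsampling. It assumes a 3x3 binomial blur kernel with replicate padding (the record does not state the exact filter the paper uses); the function names `blur_downsample` and `naive_downsample` are illustrative, not from the paper.

```python
def blur_downsample(x, stride=2):
    """Anti-aliased downsampling: low-pass blur with a 3x3 binomial
    filter, then subsample with the given stride.

    x is a 2-D feature map given as a list of lists of floats.
    """
    # Separable binomial kernel [1, 2, 1] x [1, 2, 1], normalized to sum to 1.
    k = [1.0, 2.0, 1.0]
    kernel = [[ki * kj / 16.0 for kj in k] for ki in k]
    h, w = len(x), len(x[0])

    def px(i, j):
        # Replicate ("edge") padding: clamp indices to the valid range.
        return x[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]

    blurred = [[sum(kernel[di][dj] * px(i + di - 1, j + dj - 1)
                    for di in range(3) for dj in range(3))
                for j in range(w)] for i in range(h)]
    return [row[::stride] for row in blurred[::stride]]


def naive_downsample(x, stride=2):
    """Plain strided subsampling (prone to aliasing)."""
    return [row[::stride] for row in x[::stride]]
```

On a checkerboard pattern (the highest spatial frequency), naive striding returns all zeros or all ones depending on a one-pixel shift, while the blurred version maps both shifts to a near-constant 0.5 in the interior. This shift-robustness is the motivation for inserting blur filters before strided operations in anti-aliased CNNs.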
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2022.3186101