How Does the Data set Affect CNN-based Image Classification Performance?

Convolutional neural networks (ConvNets or CNNs) have been proven very effective in areas such as image recognition and classification. Especially in the field of image classification, the CNN-based method has achieved excellent performance. Performance is an important indicator for evaluating wheth...

Full description

Saved in:
Bibliographic Details
Published inICSAI 2018 : 2018 5th International Conference on Systems and Informatics : 10-12 November 2018, Jinyuanbao Hotel, Nanjing, China pp. 361 - 366
Main Authors Luo, Chao, Li, Xiaojie, Wang, Lutao, He, Jia, Li, Denggao, Zhou, Jiliu
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.11.2018
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Convolutional neural networks (ConvNets or CNNs) have been proven very effective in areas such as image recognition and classification. Especially in the field of image classification, the CNN-based method has achieved excellent performance. Performance is an important indicator for evaluating whether a CNN-based classification method is excellent, so it is important to study which factors affect performance. As we all know, image classification performance is affected by the network structure itself and the size of the data set. In particular, data set size have a significant impact on performance. While for most people, a large number of data set are difficult to obtain. Thus, we consider a question of this approach: How does the size of the data set affect performance? In order to clarify this issue, there are 35 groups experiment performed with 5 times experiment in each group (175 experiments in total). For each k-classification experiment, we do 5 groups by increasing the size of the training set. Observe changes in accuracy to analyze the effect of data set size on difference. For the same CNN-based network, experimental results of average accuracy illustrate that the larger the training set, the higher the test accuracy. However, when the training data set are insufficient, better results can be obtained. Furthermore, in each group experiment, the more categories that are classified, the more obvious the performance change. Results of this paper not only can guide us to do experiments on image classification, but also have important guiding significance for other experiments based on deep learning.
DOI:10.1109/ICSAI.2018.8599448