USING OBJECT-FOCUSED IMAGES AS AN IMAGE AUGMENTATION TECHNIQUE TO IMPROVE THE ACCURACY OF IMAGE-CLASSIFICATION MODELS WHEN VERY LIMITED DATA SETS ARE AVAILABLE

Anonymous

Abstract

Many of today's machine learning models are extremely data hungry: the accuracy of the algorithms used is very often limited by the amount of training data available, which is, unfortunately, rarely abundant. Fortunately, image augmentation is a powerful technique that computer-vision engineers can use to expand their existing image data sets. This paper presents an innovative way of creating variations of existing images and introduces the idea of an Object-Focused Image (OFI): an image that includes only the labeled object, with everything else made transparent. The objective of the OFI method is to expand the existing image data set and hence improve the accuracy of the model used to classify images. The paper elaborates on the OFI approach and compares the accuracy of five models that share the same network design and settings but differ in the content of their training data sets. The experiments presented show that using OFIs along with the original images can increase the validation accuracy of the model. When the OFI technique is used, the number of images supplied nearly doubles.

1. INTRODUCTION

Convolutional Neural Networks (CNNs) are among the most common tools used for image classification. For machine learning (ML) problems such as image classification, the size of the training image set is very important for building high-accuracy classifiers. As the popularity of CNNs grows, so does the interest in data augmentation. Although augmented data sets are partly artificial, they remain very similar to the original data sets. Augmentation can therefore help a network learn more useful representations, because it increases the amount of training data. This paper proposes a new method for producing new images from existing ones and thereby augmenting the data. Several experiments were conducted to test whether a model benefits from a mixed data set of old and new images.

This paper compares five models with the same design and settings. The only difference is the set of supplied images. The first model uses the original 2,000 images of dogs and cats. The second and the third use the OFI versions of those 2,000 images. The fourth and the fifth use all the images: the original images as well as the OFIs.

There are two methods for obtaining the OFIs. The automated method uses an Application Programming Interface (API) to remove the background of the image and leave only the labeled object (a cat or a dog) in the foreground. The OFIs produced by this method are called automatic OFIs. The other method is executed manually: every image is edited by a human expert, who removes the background and any additional objects in the image. The OFIs produced by this second method are called manual OFIs. The manual method is more accurate than the automated API method and might lead to better results. The only difference between the second and the third models is that the second uses automatic OFIs while the third uses manual OFIs. The same holds for the fourth and the fifth models.
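The OFI idea and the five training-set configurations described above can be illustrated with a minimal sketch. It is not the implementation used in the experiments. It assumes that a binary object mask is already available, whether from a background-removal API or from manual editing, and it represents images as NumPy arrays. The names make_ofi and build_training_sets, and the keys model_1 to model_5, are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_ofi(rgb, mask):
    """Return an RGBA image in which only the masked object stays visible.

    rgb  : (H, W, 3) uint8 array, the original image.
    mask : (H, W) boolean array, True where the labeled object is.
           ASSUMPTION: the mask comes from an external tool or a human editor.
    """
    rgb = np.asarray(rgb, dtype=np.uint8)
    mask = np.asarray(mask, dtype=bool)
    if rgb.ndim != 3 or rgb.shape[2] != 3 or rgb.shape[:2] != mask.shape:
        raise ValueError("rgb must be (H, W, 3) and mask must be (H, W)")
    # Alpha is 255 (opaque) on the object and 0 (transparent) elsewhere.
    alpha = np.where(mask, 255, 0).astype(np.uint8)
    # How transparent pixels are presented to the network is not specified
    # here, so the colour channels are left untouched.
    return np.dstack([rgb, alpha])


def build_training_sets(originals, automatic_ofis, manual_ofis):
    """Compose the five training sets compared in the paper."""
    originals = list(originals)
    automatic_ofis = list(automatic_ofis)
    manual_ofis = list(manual_ofis)
    return {
        "model_1": originals,                   # original images only
        "model_2": automatic_ofis,              # automatic OFIs only
        "model_3": manual_ofis,                 # manual OFIs only
        "model_4": originals + automatic_ofis,  # originals + automatic OFIs
        "model_5": originals + manual_ofis,     # originals + manual OFIs
    }
```

With 2,000 original images and one OFI per image, the fourth and fifth sets would hold up to 4,000 images each, while the first three hold 2,000.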
In this paper, the five models are tested to answer the following questions:

