USING THE TRAINING HISTORY TO DETECT AND PREVENT OVERFITTING IN DEEP LEARNING MODELS

Anonymous authors
Paper under double-blind review

Abstract

Overfitting occurs in deep learning models when, instead of learning from the training data, they memorize it, resulting in poor generalizability. Overfitting can be (1) prevented (e.g., using dropout or early stopping) or (2) detected in a trained model (e.g., using correlation-based methods). We propose a method that can both detect and prevent overfitting based on the training history (i.e., validation losses). Our method first trains a time series classifier on training histories of overfit models. This classifier is then used to detect whether a trained model is overfit. In addition, our trained classifier can be used to prevent overfitting by identifying the optimal point to stop a model's training. We evaluate our method on its ability to identify and prevent overfitting in real-world samples (collected from papers published in the last 5 years at top AI venues). We compare our method against correlation-based detection methods and the most commonly used prevention method (i.e., early stopping). Our method achieves an F1 score of 0.91, which is at least 5% higher than the current best-performing non-intrusive overfitting detection method. In addition, our method can find the optimal epoch and avoid overfitting at least 32% earlier than early stopping, while matching or exceeding early stopping's rate of reaching the optimal epoch.
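To make the core idea concrete, the following is a minimal sketch of detecting overfitting from a training history. The paper's method trains a time series classifier on validation-loss curves; here, as a simplified hypothetical stand-in, we extract two hand-crafted features from a curve and apply an illustrative threshold rule (the feature names and the threshold are assumptions, not part of the paper's method).

```python
def history_features(val_losses, tail=3):
    """Compute simple features of a validation-loss curve.

    Hypothetical features standing in for a learned time-series
    classifier: where the minimum occurs, and how much the loss
    rebounds after it.
    """
    best = min(val_losses)
    best_epoch = val_losses.index(best)
    tail_mean = sum(val_losses[-tail:]) / min(tail, len(val_losses))
    return {
        # fraction of training spent after the best epoch
        "post_best_frac": 1 - best_epoch / (len(val_losses) - 1),
        # relative rise of the final losses above the best loss
        "rebound": (tail_mean - best) / best if best > 0 else 0.0,
    }


def looks_overfit(val_losses, rebound_thresh=0.1):
    """Flag a history as overfit if validation loss rebounds
    noticeably after its minimum (illustrative threshold rule)."""
    return history_features(val_losses)["rebound"] > rebound_thresh
```

For example, a curve whose validation loss bottoms out early and then climbs would be flagged, while a curve that plateaus near its minimum would not. A learned classifier, as used in the paper, replaces such hand-picked features and thresholds with patterns learned from labeled training histories.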


1 Introduction

Overfitting is one of the fundamental issues that plagues the field of machine learning (Nowlan & Hinton, 1992; Ng, 1997; Caruana et al., 2000; Cawley & Talbot, 2007; Erhan et al., 2010; Srivastava et al., 2014; Zhao et al., 2020), and it can also occur when training a deep learning (DL) model. An overfit model increases the risk of inaccurate predictions, misleading feature importance, and wasted resources (Hawkins, 2004). Figure 1 shows example training histories (i.e., the training and validation loss curves) of an overfit and a non-overfit model. The training and validation losses of the overfit model both decrease at the beginning of the training process. After that, the validation loss increases while the training loss continues to decrease, resulting in a large gap between the two. Such a trend indicates that the trained model does not generalize well to new data. Currently, the problem of overfitting is addressed by either (1) preventing it from happening in the first place or (2) detecting it in a trained model. Overfitting prevention methods stop overfitting from happening through techniques such as early stopping (Morgan & Bourlard, 1989), data augmentation (Shorten & Khoshgoftaar, 2019), regularization (Kukačka et al., 2017), and modifying the model by adding dropout layers (Srivastava et al., 2014) or batch normalization (Ioffe & Szegedy, 2015). Many of these methods are intrusive: they require modifying the data or the model structure, as well as expertise to execute correctly. Furthermore, even non-intrusive prevention methods such as early stopping incur a trade-off between model accuracy and training time (Prechelt, 2012). For example, when using the early stopping method, stopping too late may
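The trade-off in early stopping mentioned above can be illustrated with a minimal sketch of the classic patience-based rule: training halts once the validation loss has not improved for a fixed number of consecutive epochs. A large patience wastes epochs training past the optimal point, while a small patience risks halting before it (this is a generic illustration of early stopping, not the paper's classifier-based method).

```python
def early_stop_epoch(val_losses, patience=3):
    """Simulate classic patience-based early stopping on a recorded
    validation-loss history.

    Returns (stop_epoch, best_epoch): the epoch at which training
    would halt, and the epoch with the lowest validation loss
    (whose weights would typically be restored).
    """
    best = float("inf")
    best_epoch = 0
    waited = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch
```

Note that the stop epoch always trails the best epoch by at least `patience` epochs whenever the loss rebounds, which is exactly the extra training cost that a history-based detector could avoid by recognizing the overfitting trend sooner.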



Figure 1: Example training histories of overfit and non-overfit models.

