PROPER SCORING RULES FOR SURVIVAL ANALYSIS

Anonymous

Abstract

Survival analysis is the problem of estimating probability distributions for future events, which can be seen as a problem in uncertainty quantification. Although there are fundamental theories on strictly proper scoring rules for uncertainty quantification, little is known about those for survival analysis. In this paper, we investigate extensions of four major strictly proper scoring rules for survival analysis. Through the extensions, we discuss and clarify the assumptions arising from the discretization of the estimation of probability distributions. We also discuss the relationship between existing algorithms and the extended scoring rules, and we propose new algorithms based on our extensions of the scoring rules for survival analysis.

1. INTRODUCTION

The theory of scoring rules is fundamental to statistical analysis and is widely used in uncertainty quantification (see, e.g., Mura et al. (2008); Parmigiani & Inoue (2009); Benedetti (2010); Schlag et al. (2015)). Suppose that there is a random variable Y whose cumulative distribution function (CDF) is F_Y. Given an estimate \hat{F}_Y of F_Y and a single sample y obtained from Y, a scoring rule S(\hat{F}_Y, y) is a function that returns an evaluation score for \hat{F}_Y based on y. Since \hat{F}_Y is a CDF and y is a single sample of Y, it is not straightforward to choose an appropriate scoring rule S(\hat{F}_Y, y). The theory of scoring rules suggests strictly proper scoring rules, which can be used to recover the true probability distribution F_Y by optimizing them. This theory shows that there are infinitely many strictly proper scoring rules; examples include the pinball loss, the logarithmic score, the Brier score, and the ranked probability score (see, e.g., Gneiting & Raftery (2007) for the definitions of these scoring rules).

Survival analysis, which is also known as time-to-event analysis, can be seen as a problem in uncertainty quantification. Despite the long history of research on survival analysis (see, e.g., Wang et al. (2019) for a comprehensive survey), little is known about strictly proper scoring rules for survival analysis. Therefore, this paper investigates extensions of these scoring rules for survival analysis.

Survival analysis is the problem of estimating probability distributions for future events. In healthcare applications, an event usually corresponds to an undesirable event for a patient (e.g., death or the onset of disease). The time between a well-defined starting point and the occurrence of an event is called the survival time or event time. Survival analysis has important applications in many fields such as credit scoring (Dirick et al., 2017) and fraud detection (Zheng et al., 2019) as well as healthcare.
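To make the notion of a strictly proper scoring rule from the opening of this section concrete, the following minimal Python sketch considers a simplified setting that is our own assumption and not one of the extensions studied in this paper: a single fixed time t, a forecast p standing in for \hat{F}_Y(t), and the binary outcome of whether Y <= t. It computes the exact expected Brier score and logarithmic score under a hypothetical true probability and searches a grid for the forecast that minimizes each. All function names, the value `true_q`, and the grid are illustrative choices.

```python
import math


def brier_score(p, outcome):
    """Brier score of forecast probability p for a binary outcome (0 or 1); lower is better."""
    return (p - outcome) ** 2


def log_score(p, outcome):
    """Logarithmic score (negative log-likelihood) of forecast p; lower is better."""
    return -math.log(p if outcome == 1 else 1.0 - p)


def expected_score(score, p, q):
    """Exact expected score of forecast p when the event truly occurs with probability q."""
    return q * score(p, 1) + (1.0 - q) * score(p, 0)


def best_forecast(score, q, grid_size=99):
    """Grid search over p in {0.01, ..., 0.99} for the forecast minimizing the expected score."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda p: expected_score(score, p, q))


# Hypothetical true value of F_Y(t) at some fixed time t (an assumption for illustration).
true_q = 0.30

print("Brier-optimal forecast:", best_forecast(brier_score, true_q))
print("Log-optimal forecast:  ", best_forecast(log_score, true_q))
```

Under these assumptions, both searches select the grid point equal to `true_q`, which is the defining behavior of a strictly proper scoring rule: the expected score is uniquely optimized by reporting the true probability. The sketch ignores censoring entirely.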
Although we discuss survival analysis in the context of healthcare applications, the extended scoring rules can be used in any other application.

Datasets for survival analysis are censored, which means that events of interest might not be observed for a number of data points. This may be due to either the limited observation time window or missing traces caused by other irrelevant events. In this paper, we consider only right-censored data, which is a widely studied problem setting in survival analysis. The exact event time of a right-censored data point is unknown; we know only that the event had not happened up to a certain time for the data point. The time between a well-defined starting point and the last observation time of a right-censored data point is called the censoring time.

One of the classical methods for survival analysis is the Kaplan-Meier estimator (Kaplan & Meier, 1958). It is a non-parametric method for estimating the probability distribution of survival times as a survival function κ(t), where the value κ(t) represents the survival rate at time t (i.e., the ratio of 1

