TRUSTED MULTI-VIEW CLASSIFICATION

Abstract

Multi-view classification (MVC) generally focuses on improving classification accuracy by using information from different views, typically integrating them into a unified comprehensive representation for downstream tasks. However, it is also crucial to dynamically assess the quality of a view for different samples in order to provide reliable uncertainty estimations, which indicate whether predictions can be trusted. To this end, we propose a novel multi-view classification method, termed trusted multi-view classification, which provides a new paradigm for multi-view learning by dynamically integrating different views at an evidence level. The algorithm jointly utilizes multiple views to promote both classification reliability and robustness by integrating evidence from each view. To achieve this, the Dirichlet distribution is used to model the distribution of the class probabilities, parameterized with evidence from different views and integrated with the Dempster-Shafer theory. The unified learning framework induces accurate uncertainty and accordingly endows the model with both reliability and robustness against out-of-distribution samples. Extensive experimental results validate the effectiveness of the proposed model in terms of accuracy, reliability and robustness.

1. INTRODUCTION

Multi-view data, typically associated with multiple modalities or multiple types of features, often exists in real-world scenarios. State-of-the-art multi-view learning methods achieve tremendous success across a wide range of real-world applications. However, this success typically relies on complex models (Wang et al., 2015a; Tian et al., 2019; Bachman et al., 2019; Zhang et al., 2019; Hassani & Khasahmadi, 2020), which tend to integrate multi-view information with deep neural networks. Although these models can provide accurate classification results, they are prone to yielding unreliable predictions, particularly when presented with views that are not well-represented (e.g., information from abnormal sensors). Consequently, their deployment in safety-critical applications (e.g., computer-aided diagnosis or autonomous driving) is limited. This has inspired us to introduce a new paradigm for multi-view classification that produces trusted decisions.

For multi-view learning, traditional algorithms generally assume an equal value for different views or assign/learn a fixed weight for each view. The underlying assumption is that the qualities or importance of these views are basically stable across all samples. In practice, the quality of a view often varies from sample to sample, and a well-designed model should be aware of this and adapt accordingly. For example, in multi-modal medical diagnosis (Perrin et al., 2009; Sui et al., 2018), a magnetic resonance (MR) image may be sufficient for one subject, while a positron emission tomography (PET) image may be required for another. Therefore, the decision should be well explained according to the multi-view inputs. Typically, we not only need to know the classification result, but should also be able to answer "How confident is the decision?" and "Why is the confidence so high/low for the decision?". To this end, the model should provide an accurate uncertainty for the prediction of each sample, and even for each individual view of each sample.

Uncertainty-based algorithms can be roughly divided into two main categories, i.e., Bayesian and non-Bayesian approaches. Traditional Bayesian approaches estimate uncertainty by inferring a posterior distribution over the parameters (MacKay, 1992a; Bernardo & Smith, 2009; Neal, 2012). A variety of Bayesian methods have been developed, including Laplace approximation (MacKay, 1992b), Markov Chain Monte Carlo (MCMC) (Neal, 2012) and variational techniques (Graves, 2011; Ranganath et al., 2014; Blundell et al., 2015). However, compared with ordinary neural networks, these methods are computationally expensive due to the doubling of model parameters and the difficulty of convergence. A more recent algorithm (Gal & Ghahramani, 2016) estimates uncertainty by keeping dropout (Srivastava et al., 2014) active in the testing phase, thereby reducing the computational cost. Several non-Bayesian algorithms have also been proposed, including deep ensembles (Lakshminarayanan et al., 2017), evidential deep learning (Sensoy et al., 2018) and deterministic uncertainty estimation (van Amersfoort et al., 2020). Unfortunately, all of these methods focus on estimating the uncertainty of single-view data, despite the fact that fusing multiple views through uncertainty can improve both performance and reliability.
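As a rough illustration of the MC-dropout idea mentioned above (a minimal sketch, not the method proposed in this paper; the network architecture, layer sizes, and helper names are hypothetical placeholders), uncertainty can be approximated by keeping dropout active at test time and aggregating several stochastic forward passes:

# Minimal MC-dropout sketch (illustrative only): dropout stays active at
# test time and the spread over several stochastic forward passes serves
# as an uncertainty proxy.
import torch
import torch.nn as nn

class MCDropoutNet(nn.Module):
    def __init__(self, in_dim=20, num_classes=3, p=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Dropout(p),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_predict(model, x, num_samples=20):
    model.train()  # keep dropout active during inference
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(num_samples)]
        )
    mean_probs = probs.mean(dim=0)          # averaged predictive distribution
    uncertainty = probs.var(dim=0).sum(-1)  # variance across passes as a proxy
    return mean_probs, uncertainty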
In this paper, we propose a new multi-view classification algorithm that aims to elegantly integrate multi-view information for trusted decision making (shown in Fig. 1(a)). Our model combines different views at an evidence level instead of at the feature or output level as done previously, which produces a stable and reasonable uncertainty estimation and thus promotes both classification reliability and robustness. The Dirichlet distribution is used to model the distribution of the class probabilities, parameterized with evidence from different views and integrated with the Dempster-Shafer theory. In summary, the specific contributions of this paper are:

(1) We propose a novel multi-view classification model aiming to provide trusted and interpretable (according to the uncertainty of each view) decisions in an effective and efficient way (without any additional computation or changes to the neural network architecture), which introduces a new paradigm in multi-view classification.

(2) The proposed model is a unified framework for sample-adaptive multi-view integration, which integrates multi-view information at an evidence level with the Dempster-Shafer theory in an optimizable (learnable) way.

(3) The uncertainty of each view is accurately estimated, enabling our model to improve classification reliability and robustness.

(4) We conduct extensive experiments that validate the superior accuracy, robustness, and reliability of our model, thanks to the promising uncertainty estimation and multi-view integration strategy.
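To make the evidence-level fusion concrete, the following is a minimal conceptual sketch, not the exact learnable formulation derived later in the paper. It assumes the subjective-logic mapping used in evidential deep learning (Sensoy et al., 2018), where per-view evidence e yields Dirichlet parameters alpha = e + 1, per-class belief masses b_k = e_k / S and an uncertainty mass u = K / S with S = sum_k alpha_k, and it combines two views with a reduced form of Dempster's rule; the helper names (opinion, combine) are illustrative only:

import numpy as np

# Conceptual sketch of evidence-level fusion for two views over K classes.
# alpha = evidence + 1, belief b_k = e_k / S, uncertainty u = K / S,
# with S = sum(alpha); the combination below is one reduced form of
# Dempster's rule for two opinions.
def opinion(evidence):
    evidence = np.asarray(evidence, dtype=float)
    K = evidence.size
    S = evidence.sum() + K          # Dirichlet strength
    belief = evidence / S           # per-class belief mass
    uncertainty = K / S             # overall uncertainty mass
    return belief, uncertainty

def combine(b1, u1, b2, u2):
    # conflict between the two views' belief assignments
    conflict = np.sum(np.outer(b1, b2)) - np.sum(b1 * b2)
    scale = 1.0 / (1.0 - conflict)
    b = scale * (b1 * b2 + b1 * u2 + b2 * u1)
    u = scale * (u1 * u2)
    return b, u

# Example: view 1 is confident about class 0, view 2 is nearly uninformative.
b1, u1 = opinion([9.0, 1.0, 0.5])
b2, u2 = opinion([0.2, 0.3, 0.1])
b, u = combine(b1, u1, b2, u2)
print(b, u)

In this toy example, the confident view dominates the nearly uninformative one, and the fused uncertainty is lower than that of either individual view.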

2. RELATED WORK

Uncertainty-based Learning. Deep neural networks have achieved great success in various tasks. However, since most deep models are essentially deterministic functions, the uncertainty of the model cannot be obtained. Bayesian neural networks (BNNs) (Denker & LeCun, 1991; MacKay, 1992b; Neal, 2012) endow deep models with uncertainty by replacing the deterministic weight parameters with distributions. Since BNNs struggle with inference and usually come with prohibitive computational costs, a more scalable and practical approach, MC-dropout (Gal & Ghahramani, 2016), was proposed; in this model, inference is completed by performing dropout sampling of the weights during both training and testing. Ensemble-based methods (Lakshminarayanan et al., 2017) train and integrate multiple deep networks and also achieve promising performance. Instead of indirectly modeling uncertainty through network weights, evidential deep learning (Sensoy et al., 2018) introduces subjective logic theory to directly model uncertainty without ensembling or Monte Carlo sampling. Building upon RBF networks, the distance between test samples and prototypes can be used as a proxy for deterministic uncertainty (van Amersfoort et al., 2020). Benefiting from task weights learned with homoscedastic uncertainty, Kendall et al. (2018) achieve impressive performance in multi-task learning.

Multi-View Learning. Learning on data with multiple views has proven effective in a variety of tasks. CCA-based multi-view models (Hotelling, 1992; Akaho, 2006; Wang, 2007; Andrew et al.,

