TRUSTED MULTI-VIEW CLASSIFICATION

Abstract

Multi-view classification (MVC) generally focuses on improving classification accuracy by using information from different views, typically integrating them into a unified comprehensive representation for downstream tasks. However, it is also crucial to dynamically assess the quality of a view for different samples in order to provide reliable uncertainty estimations, which indicate whether predictions can be trusted. To this end, we propose a novel multi-view classification method, termed trusted multi-view classification, which provides a new paradigm for multi-view learning by dynamically integrating different views at an evidence level. The algorithm jointly utilizes multiple views to promote both classification reliability and robustness by integrating evidence from each view. To achieve this, the Dirichlet distribution is used to model the distribution of the class probabilities, parameterized with evidence from different views and integrated with the Dempster-Shafer theory. The unified learning framework induces accurate uncertainty and accordingly endows the model with both reliability and robustness for out-of-distribution samples. Extensive experimental results validate the effectiveness of the proposed model in accuracy, reliability and robustness.
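The evidence-level integration described above can be illustrated with a minimal sketch. Under subjective logic, per-view evidence e induces Dirichlet parameters alpha = e + 1, per-class belief masses b_k = e_k / S, and an uncertainty mass u = K / S, where S is the Dirichlet strength (the sum of alpha); opinions from two views are then fused with a reduced Dempster-Shafer combination rule. The function names below are illustrative, not from the paper's released code:

```python
import numpy as np

def dirichlet_opinion(evidence):
    """Map non-negative per-view evidence for K classes to a
    subjective-logic opinion (belief masses + uncertainty mass)."""
    evidence = np.asarray(evidence, dtype=float)
    K = evidence.size
    alpha = evidence + 1.0        # Dirichlet parameters alpha_k = e_k + 1
    S = alpha.sum()               # Dirichlet strength
    belief = evidence / S         # per-class belief masses b_k
    uncertainty = K / S           # overall uncertainty mass u (b.sum() + u == 1)
    return belief, uncertainty

def combine(b1, u1, b2, u2):
    """Reduced Dempster-Shafer combination of two opinions."""
    K = len(b1)
    # Conflict: mass assigned to different classes by the two views
    conflict = sum(b1[i] * b2[j] for i in range(K) for j in range(K) if i != j)
    scale = 1.0 / (1.0 - conflict)
    b = scale * (b1 * b2 + b1 * u2 + b2 * u1)
    u = scale * (u1 * u2)
    return b, u
```

Combining two opinions that agree on the dominant class yields a fused opinion with lower uncertainty than either view alone, while conflicting views keep the uncertainty high; the fused masses still satisfy b.sum() + u = 1.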

1. INTRODUCTION

Multi-view data, typically associated with multiple modalities or multiple types of features, often exist in real-world scenarios. State-of-the-art multi-view learning methods achieve tremendous success across a wide range of real-world applications. However, this success typically relies on complex models (Wang et al., 2015a; Tian et al., 2019; Bachman et al., 2019; Zhang et al., 2019; Hassani & Khasahmadi, 2020), which tend to integrate multi-view information with deep neural networks. Although these models can provide accurate classification results, they are prone to yielding unreliable predictions, particularly when presented with views that are not well-represented (e.g., information from abnormal sensors). Consequently, their deployment in safety-critical applications (e.g., computer-aided diagnosis or autonomous driving) is limited. This has inspired us to introduce a new paradigm for multi-view classification to produce trusted decisions.

For multi-view learning, traditional algorithms generally assume an equal value for different views or assign/learn a fixed weight for each view. The underlying assumption is that the quality or importance of these views is essentially stable across all samples. In practice, the quality of a view often varies for different samples, which the designed models should be aware of for adaptation. For example, in multi-modal medical diagnosis (Perrin et al., 2009; Sui et al., 2018), a magnetic resonance (MR) image may be sufficient for one subject, while a positron emission tomography (PET) image may be required for another. Therefore, the decision should be well explained according to multi-view inputs. Typically, we not only need to know the classification result, but should also be able to answer

