DECOUPLING CONCEPT BOTTLENECK MODEL

Abstract

Concept Bottleneck Model (CBM) is a powerful class of interpretable neural networks that utilize high-level concepts to explain model decisions and interact with humans. However, CBM cannot always work as expected, because high-level concepts are troublesome to collect and commonly insufficient in real-world scenarios. In this paper, we theoretically reveal that insufficient concept information induces a mixture of explicit and implicit information, which in turn leads to an inherent dilemma between concept and label distortions in CBM. Motivated by the proposed theorem, we present Decoupling Concept Bottleneck Model (DCBM), a novel concept-based model that decouples heterogeneous information into explicit and implicit concepts while retaining high prediction performance and interpretability. Extensive experiments confirm the alleviation of concept/label distortions: DCBM achieves state-of-the-art performance in both concept and label learning tasks. Especially in situations where concepts are insufficient, DCBM significantly outperforms other models based on concept bottlenecks. Moreover, to enable effective human-machine interactions with DCBM, we devise two algorithms based on mutual information (MI) estimation, forward intervention and backward rectification, which can automatically correct labels and trace back to wrong concepts. The construction of the interaction regime can be formulated as a light min-max optimization problem solved within minutes. Multiple experiments show that such interactions effectively promote concept/label accuracy.



1. INTRODUCTION

Concept Bottleneck Model (CBM) (Koh et al., 2020; Losch et al., 2021) is an interactive and interpretable AI system that encodes prior expert knowledge into a neural network and makes decisions according to the corresponding concepts. In detail, CBM first maps input images to corresponding high-level concepts and then utilizes these concepts for downstream tasks. Owing to its end-to-end training regime, CBM provides ante-hoc explanations for the model's predictions and has been widely applied to healthcare (Chen et al., 2021; Rong et al., 2022), shift detection (Wijaya et al., 2021), algorithmic reasoning (Xuanyuan et al., 2022), and so on. However, high-level concepts in real-world scenarios are often troublesome to collect and commonly insufficient, since their annotation consumes large amounts of resources. Thus, the concept information is usually not sufficient to recover the label information. In this circumstance, CBM cannot fit the ground-truth labels using only the clean but insufficient concepts. On the contrary, to learn a better classifier, CBM has to inject extra information into the concept layer, i.e., sacrifice concept accuracy (Fig. 1).
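The two-stage pipeline described above can be sketched as follows. The linear maps, dimensions, and activation choices are illustrative stand-ins for the networks g and f in the paper, not its actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
D, d, k = 8, 4, 3  # hypothetical input, concept, and label dimensions

# Linear maps stand in for the neural networks g and f of the paper.
W_g = rng.normal(size=(d, D))
W_f = rng.normal(size=(k, d))

def g(x):
    """Concept predictor g: R^D -> R^d (sigmoid gives concept activations)."""
    return 1.0 / (1.0 + np.exp(-W_g @ x))

def f(c):
    """Label predictor f: R^d -> R^k (softmax over class logits)."""
    logits = W_f @ c
    e = np.exp(logits - logits.max())
    return e / e.sum()

x = rng.normal(size=D)
c_hat = g(x)      # predicted concepts: every bit of label-relevant
y_hat = f(c_hat)  # information must pass through this bottleneck
```

Because the label prediction depends on the input only through `c_hat`, any label-relevant information missing from the d concepts must be smuggled into the concept activations themselves, which is exactly the distortion analyzed next.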

[Figure dialogue: "What kind of bird is this?" "It is a Groove-Billed Ani for its black wings, solid breasts, plain head, …" "But I think its head is not plain."]

Inspired by such observations, we theoretically reveal that CBM cannot avoid an inherent trade-off between concept and label distortions due to the mixture of explicit/implicit information. Motivated by the theoretical results, we propose Decoupling Concept Bottleneck Model (DCBM), a novel concept-based model, to decouple the heterogeneous information into explicit and implicit concepts. In detail, DCBM automatically allocates the implicit concepts to auxiliary neurons, so that the pollution of explicit concepts is avoided. Furthermore, DCBM maximizes the interpretability of the model via a Jensen-Shannon (JS) divergence constraint during training.

Equipped with DCBM, we aim to conduct the corresponding human-machine interactive tasks, regarded as one of the most important components of such concept-based models. However, designing interaction algorithms for DCBM is not straightforward due to the correlation between explicit and implicit concepts. To this end, we propose a phase-two decoupling algorithm for DCBM to peel out the explicit information via mutual information (MI) estimation (Belghazi et al., 2018). We formulate it as a light min-max optimization problem solved in minutes. After the phase-two decoupling stage, we consider two interaction tasks, forward intervention and backward rectification, which correct labels and trace back to wrong concepts automatically via human-machine interaction (Fig. 2). In summary, we make the following contributions:

• To the best of our knowledge, we are the first to theoretically reveal and prove the existence of inherent concept/label distortions in CBM when high-level concepts are insufficient.

• Motivated by the theoretical results, we propose Decoupling Concept Bottleneck Model (DCBM), a novel concept-based framework, to alleviate the distortions by automatically decoupling the concepts into explicit and implicit ones during training.

• Extensive experiments show that DCBM achieves state-of-the-art results in both concept and label learning tasks. In particular, compared with other concept-based models, DCBM conspicuously alleviates concept/label distortions when concepts are inadequate.

• We devise a novel, comprehensive human-machine interaction system, which automatically corrects labels and ascertains wrong concepts via feedback from human experts. The system decouples heterogeneous information by solving a light MI-based optimization problem within minutes.
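The MI estimation underlying the phase-two decoupling follows the style of MINE (Belghazi et al., 2018), which maximizes a Donsker-Varadhan lower bound over a learned statistics network. A minimal sketch of that bound is below; the fixed inner-product critic and the data construction are illustrative assumptions, since the paper trains the critic inside a min-max loop:

```python
import numpy as np

rng = np.random.default_rng(1)
n, dim = 512, 3

def critic(x, z):
    # Fixed inner-product critic; in MINE this is a trained network T.
    # Clipping the scores is a common stabilization trick for the exp term.
    return np.clip(np.sum(x * z, axis=1), -5.0, 5.0)

# Correlated pair: z is a noisy copy of x, so I(X; Z) is large.
x = rng.normal(size=(n, dim))
z = x + 0.1 * rng.normal(size=(n, dim))
z_shuffled = z[rng.permutation(n)]  # samples from the product of marginals

# Donsker-Varadhan bound: I(X; Z) >= E_joint[T] - log E_marginals[exp(T)]
joint_term = critic(x, z).mean()
marginal_term = np.log(np.mean(np.exp(critic(x, z_shuffled))))
mi_lower_bound = joint_term - marginal_term
```

Maximizing this quantity over the critic tightens the MI estimate; alternating that maximization with a minimization over the concept representation yields the light min-max problem the paper solves in minutes.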

2. THEORETICAL RESULTS

We first introduce Concept Bottleneck Model (CBM) (Koh et al., 2020), and then theoretically present the dilemma of concept/label distortions in CBM when the concepts are insufficient.

2.1. CONCEPT BOTTLENECK MODEL (CBM)

Given $N$ training triples $\{x_n, c_n, y_n\}_{n=1}^{N}$, where $x_n \in \mathbb{R}^D$, $c_n \in \mathbb{R}^d$, and $y_n \in \mathbb{R}^k$ respectively denote the input, concept, and label vectors. With $g: \mathbb{R}^D \to \mathbb{R}^d$ mapping the input space into the concept space and $f: \mathbb{R}^d \to \mathbb{R}^k$ mapping the concept space into the label space, CBM defines two loss functions, $\mathcal{L}_Y$ and $\mathcal{L}_C$, which measure the difference between the ground-truth labels (concepts) and the predictions $f(g(x_n))$ ($g(x_n)$).
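A sketch of how the two losses combine in joint CBM training. The concrete loss forms (cross-entropy for the label, binary cross-entropy per concept) and the weight `lam` are standard choices for CBM rather than specifics given in this section:

```python
import numpy as np

def label_loss(y_prob, y_true):
    """L_Y: cross-entropy between softmax output and a one-hot label."""
    return -np.sum(y_true * np.log(y_prob + 1e-12))

def concept_loss(c_prob, c_true):
    """L_C: summed binary cross-entropy over the d concept activations."""
    eps = 1e-12
    return -np.sum(c_true * np.log(c_prob + eps)
                   + (1.0 - c_true) * np.log(1.0 - c_prob + eps))

def joint_loss(y_prob, y_true, c_prob, c_true, lam=0.5):
    """Joint objective L_Y + lam * L_C; lam trades label fit for concept fit."""
    return label_loss(y_prob, y_true) + lam * concept_loss(c_prob, c_true)
```

When the concepts cannot fully determine the label, the two terms pull in opposite directions, which is the distortion trade-off formalized below.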



Figure 1: Concept/label distortions in CBM. (a) The main pipeline of CBM. (b) CBM expects an ideal case with all explicit concepts. (c) Heterogeneous concepts in real-world scenarios. (d) Thus, CBM has to mix explicit/implicit concepts to achieve both concept and label learning.

Figure 2: Two human-machine interactive tasks in DCBM. Forward Intervention: correcting labels according to the more accurate concepts given by humans; Backward Rectification: tracing back to wrong concepts according to the labels updated by humans.

