TRIPLET SIMILARITY LEARNING ON CONCORDANCE CONSTRAINT

Abstract

Triplet-based loss functions have been the paradigm of choice for robust deep metric learning (DML). However, conventional triplet-based losses require carefully tuning a decision boundary, i.e., the violation margin. When performing online triplet mining on each mini-batch, choosing a single good global, constant prior value for the violation margin is challenging and often unprincipled. To circumvent this issue, we propose a novel yet efficient concordance-induced triplet (CIT) loss as an objective function for training DML models. We formulate the similarity of triplet samples as a concordance constraint problem, then directly optimize concordance during DML model learning. Triplet concordance refers to the predicted ordering of intra-class and inter-class similarities being correct, which is invariant to any monotone transformation of the decision boundary of triplet samples. Hence, our CIT loss is free from the need to adopt the violation margin as a prior constraint. In addition, owing to the high training complexity of triplet-based losses, we introduce a partial likelihood term for CIT loss that imposes additional penalties on hard triplet samples, thus enforcing fast convergence. We experiment extensively on a variety of DML tasks to demonstrate the elegance and simplicity of our CIT loss against its counterparts. In particular, on face recognition, person re-identification, and image retrieval datasets, our method achieves performance comparable to state-of-the-art methods without laborious hyper-parameter tuning.
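As a minimal illustration of the concordance idea described above, the NumPy sketch below checks that triplet concordance depends only on the ordering of intra-class and inter-class similarities (and is therefore invariant to any monotone transformation), and shows a margin-free logistic surrogate for discordant triplets. The function names and toy values are ours for illustration; this is not the paper's exact CIT formulation.

```python
import numpy as np

def concordant(s_ap, s_an):
    # A triplet is concordant when the anchor-positive similarity
    # exceeds the anchor-negative similarity.
    return s_ap > s_an

# Toy similarities for three triplets.
s_ap = np.array([0.9, 0.4, 0.7])
s_an = np.array([0.2, 0.6, 0.1])

# Concordance depends only on the ordering of s_ap vs. s_an, so it is
# invariant under any strictly increasing transform of the similarities.
g = lambda s: 5.0 * s + 2.0  # an arbitrary monotone transform
assert np.array_equal(concordant(s_ap, s_an), concordant(g(s_ap), g(s_an)))

def concordance_loss(s_ap, s_an):
    # A smooth, margin-free surrogate (illustrative only): penalize
    # discordant orderings with a logistic term, no margin needed.
    return np.mean(np.log1p(np.exp(s_an - s_ap)))
```

Note that no decision-boundary hyper-parameter appears anywhere in the surrogate; the loss is driven purely by the similarity ordering within each triplet.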



The performance on the same task differs significantly under different violation margins. Moreover, with the same violation margin, Circle loss varies from superior to inferior across tasks. To circumvent this issue, Angular loss was proposed to push the negative point away from the center of the positive cluster and pull the positive points closer to each other by constraining



Deep metric learning (DML) for visual understanding tasks, e.g., face recognition Schroff et al. (2015); Taigman et al. (2014), person re-identification (ReID) Shi et al. (2016); Ustinova & Lempitsky (2016), and image retrieval Fang et al. (2021); Revaud et al. (2019), aims at learning embedding representations of images with class-level labels via a ranking loss function Kaya & Bilge (2019); Sohn (2016); Wang et al. (2017). Two representative ranking loss functions have been developed for DML to minimize between-class similarity and maximize within-class similarity: pair-based losses Sun et al. (2014) and triplet-based losses Zhao et al. (2019). Compared to pairwise constraints, the optimization pattern of triplet-based losses additionally captures relative similarity information, thus yielding impressive performance Liang et al. (2021); Zhuang et al. (2016). Under triplet constraints, images from the same class are projected into neighboring embedding regions, while images with different semantic contexts are mapped apart. However, under such an optimization objective, triplet-based losses suffer from the following two problems when training DML models with the stochastic gradient descent (SGD) algorithm and sampling triplets within a mini-batch.

•Irrational to set an absolute margin. The triplet constraint relies on a decision boundary, i.e., the violation margin, to partition the intra-class and inter-class embedding space and reinforce optimization Wang et al. (2018a;b). However, the violation margin is sensitive to scale change, and choosing an identical absolute value for clusters with different scales of intra-class variation is inappropriate Wang et al. (2017). Hence, triplet-based losses must regulate this hyper-parameter attentively to impose appropriate penalty strength Qian et al. (2019); Sun et al. (2020). The performance of Circle loss Sun et al. (2020) under varying circular decision boundaries supports this claim.
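The margin sensitivity described above can be made concrete with a minimal NumPy sketch of a conventional triplet loss on similarities. The function name and toy similarity values are ours for illustration, not taken from any cited work.

```python
import numpy as np

def triplet_margin_loss(s_ap, s_an, margin):
    # Conventional triplet loss on similarities: a triplet is penalized
    # whenever the anchor-positive similarity s_ap does not exceed the
    # anchor-negative similarity s_an by at least the violation margin.
    return np.mean(np.maximum(0.0, s_an - s_ap + margin))

# Toy similarities for a mini-batch of three triplets.
s_ap = np.array([0.80, 0.55, 0.90])
s_an = np.array([0.30, 0.50, 0.85])

# The penalty (and hence the gradient signal) changes with the margin,
# even though the ordering within every triplet is identical.
losses = {m: triplet_margin_loss(s_ap, s_an, m) for m in (0.1, 0.3, 0.5)}

# Rescaling the similarities (e.g., a cluster with a larger scale of
# intra-class variation) changes the penalty under a fixed absolute
# margin, illustrating the scale sensitivity discussed above.
assert triplet_margin_loss(s_ap, s_an, 0.3) != triplet_margin_loss(2 * s_ap, 2 * s_an, 0.3)
```

Because the hinge threshold is an absolute quantity, any global choice of `margin` implicitly assumes all classes share the same similarity scale, which is exactly the assumption the concordance-based formulation avoids.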

