WEAKLY SUPERVISED KNOWLEDGE TRANSFER WITH PROBABILISTIC LOGICAL REASONING FOR OBJECT DETECTION

Abstract

Training object detection models usually requires instance-level annotations, such as the positions and labels of all objects present in each image. Such supervision is unfortunately not always available and, more often, only image-level information is provided, also known as weak supervision. Recent works have addressed this limitation by leveraging knowledge from a richly annotated domain. However, the scope of weak supervision supported by these approaches has been very restrictive, preventing them from using all available information. In this work, we propose ProbKT, a framework based on probabilistic logical reasoning that allows training object detection models with arbitrary types of weak supervision. We empirically show on different datasets that using all available information is beneficial, as ProbKT leads to significant improvements on the target domain and better generalization compared to existing baselines. We also showcase the ability of our approach to handle complex logical statements as a supervision signal.

1. INTRODUCTION

Object detection is a fundamental ability of numerous high-level machine learning pipelines such as autonomous driving [4; 16], augmented reality [42] or image retrieval [17]. However, training state-of-the-art object detection models generally requires detailed image annotations, such as the box coordinates and labels of every object present in each image. While several large benchmark datasets with detailed annotations are available [26; 15], producing such detailed annotations for new, specific datasets comes at a significant cost that is often not affordable for many applications. More frequently, datasets come with only limited annotation, also referred to as weak supervision. This has sparked research in weakly supervised object detection approaches [25; 6; 40], using techniques such as multiple instance learning [40] or variations of class activation maps [3]. However, these approaches have been shown to significantly underperform their fully supervised counterparts in terms of robustness and accurate object localization [39].

An appealing and intuitive way to improve the performance of weakly supervised object detection is to transfer knowledge from an existing object detection model pre-trained on a fully annotated dataset [14; 46; 43]. This approach, also referred to as transfer learning or domain adaptation, consists in leveraging transferable knowledge from the pre-trained model (such as its bounding box prediction capabilities) to the new, weakly supervised domain. This transfer has been embodied in different ways in the literature. Examples include simply fine-tuning the classifier of bounding box proposals of the pre-trained model [43], or iteratively relabeling the weakly supervised dataset to retrain a new full object detection model on the relabeled data [46].
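To make the contrast between instance-level and image-level supervision concrete, the multiple-instance-learning idea mentioned above can be sketched as follows. This is a minimal, illustrative NumPy sketch (not the paper's method): the function name, the max-pooling aggregation, and the binary cross-entropy objective are our assumptions about a typical MIL-style setup, where per-proposal class scores from a detector are pooled into image-level scores and supervised only with image-level "class present / absent" labels.

```python
import numpy as np

def mil_image_loss(proposal_scores, image_labels):
    """MIL-style loss for image-level (weak) supervision.

    proposal_scores: (num_proposals, num_classes) array of per-proposal
        class probabilities, e.g. from a pre-trained detector's head.
    image_labels: (num_classes,) binary array; 1 if the class appears
        anywhere in the image -- the only supervision available.
    """
    # Pool proposal scores to one image-level score per class: a class
    # counts as "present" if at least one proposal scores high for it.
    image_scores = proposal_scores.max(axis=0)

    # Binary cross-entropy between pooled scores and image-level labels.
    eps = 1e-7
    image_scores = np.clip(image_scores, eps, 1 - eps)
    bce = -(image_labels * np.log(image_scores)
            + (1 - image_labels) * np.log(1 - image_scores))
    return bce.mean()
```

For example, with two proposals scoring `[[0.9, 0.1], [0.2, 0.8]]` over two classes and both classes present in the image, the pooled scores `[0.9, 0.8]` match the labels and the loss is small; flipping a label makes it grow. Note that this objective only constrains which classes appear, not where, which is exactly the localization weakness of purely weak supervision discussed above.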

Code availability: Our code is available at https://github.com/molden/ProbKT

