IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2020
Abstract: Deep neural networks can easily be fooled by an adversary using minuscule perturbations to input images. The existing defense techniques suffer greatly under white-box attack settings, where an adversary has full knowledge about the network and can iterate several times to find strong perturbations. We observe that the main reason for the existence of such vulnerabilities is the close proximity of different class samples in the learned feature space of deep models. This allows the model decisions to be totally changed by adding an imperceptible perturbation in the inputs. To counter this, we propose to disentangle the intermediate feature representations of deep networks class-wise, specifically forcing the features for each class to lie inside a convex polytope that is maximally separated from the polytopes of other classes. In this manner, the network is forced to learn distinct and distant decision regions for each class. We observe that this simple constraint on the features greatly enhances the robustness of learned models, even against the strongest white-box attacks, without degrading the classification performance on clean images. We report extensive evaluations in both black-box and white-box attack scenarios and show significant gains in comparison to state-of-the-art defenses.
International Conference on Computer Vision (ICCV) 2019
Abstract: Deep neural networks are vulnerable to adversarial attacks, which can fool them by adding minuscule perturbations to the input images. The robustness of existing defenses suffers greatly under white-box attack settings, where an adversary has full knowledge about the network and can iterate several times to find strong perturbations. We observe that the main reason for the existence of such perturbations is the close proximity of different class samples in the learned feature space. This allows model decisions to be totally changed by adding an imperceptible perturbation in the inputs. To counter this, we propose to disentangle the intermediate feature representations of deep networks class-wise. Specifically, we force the features for each class to lie inside a convex polytope that is maximally separated from the polytopes of other classes. In this manner, the network is forced to learn distinct and distant decision regions for each class. We observe that this simple constraint on the features greatly enhances the robustness of learned models, even against the strongest white-box attacks, without degrading the classification performance on clean images. We report extensive evaluations in both black-box and white-box attack scenarios and show significant gains in comparison to state-of-the-art defenses.
IEEE Transactions on Image Processing (TIP) 2020
Abstract: Convolutional Neural Networks have achieved significant success across multiple computer vision tasks. However, they are vulnerable to carefully crafted, human-imperceptible adversarial noise patterns, which constrains their deployment in critical security-sensitive systems. This paper proposes a computationally efficient image enhancement approach that provides a strong defense mechanism to effectively mitigate the effect of such adversarial perturbations. We show that deep image restoration networks learn mapping functions that can bring off-the-manifold adversarial samples onto the natural image manifold, thus restoring classifier beliefs towards correct classes. A distinguishing feature of our approach is that, in addition to providing robustness against attacks, it simultaneously enhances image quality and retains the model's performance on clean images. Furthermore, the proposed method does not modify the classifier or require a separate mechanism to detect adversarial images. The effectiveness of the scheme has been demonstrated through extensive experiments, where it proves a strong defense in both white-box and black-box attack settings. The proposed scheme is simple and has the following advantages: (1) it does not require any model training or parameter optimization, (2) it complements other existing defense mechanisms, (3) it is agnostic to the attacked model and attack type, and (4) it provides superior performance across all popular attack algorithms.
Digital Image Computing: Techniques and Applications (DICTA) 2018
Abstract: In this paper, we introduce a new dataset for student engagement detection and localization. The digital revolution has transformed the traditional teaching procedure, and analysis of student engagement in an e-learning environment would facilitate effective task accomplishment and learning. Well-known social cues of engagement/disengagement can be inferred from facial expressions, body movements and gaze patterns. We record students' responses to various stimuli videos and extract important cues to estimate variations in engagement level. We study the association of a subject's behavioral cues with his/her engagement level, as annotated by labelers. We then localize engaging/non-engaging parts in the stimuli videos using a deep multiple instance learning based framework, which can give useful insight into designing Massive Open Online Courses (MOOCs) video material. Recognizing the lack of any publicly available dataset in the domain of user engagement, a new `in the wild' dataset is created to study the subject engagement problem. The dataset contains 195 videos captured from 78 subjects, amounting to about 16.5 hours of recording. We present detailed baseline results using different classifiers, ranging from traditional machine learning to deep learning based approaches. The analysis is performed in a subject-independent manner so that it generalizes to new users. The problem of engagement prediction is modeled as a weakly supervised learning problem. The dataset is manually annotated by different labelers for four levels of engagement independently, and correlation studies between the annotated labels and those predicted by different classifiers are reported. This dataset is created as an effort to facilitate research in various e-learning environments such as intelligent tutoring systems, MOOCs, and others.
International Conference on Affective Computing and Intelligent Interaction (ACII) 2017
Abstract: Automated facial video analysis is useful in numerous health care applications. For example, spatio-temporal analysis of such videos has previously been used to assist clinicians in the diagnosis of depression. Physiological measures, such as an individual's heart rate, provide very important cues to understanding a person's mental health. Unobtrusively estimated heart rate has not previously been used to analyse individuals' mental health. In this paper, we automatically estimate heart rate activity from facial videos. We then study the association of the estimated heart rate activity with the person's mental health, as diagnosed by clinicians. Specifically, from the heart rate activity in response to watching different movies, we classify individuals as either depressed or healthy. The efficacy of the proposed scheme is demonstrated by experimental evaluations on a clinically validated dataset. Our results suggest that unobtrusively estimated heart rate is very effective for depression analysis.
Leveraging semi-supervised learning for image-to-image translation.
Working on making deep neural networks robust against adversarial attacks.
Worked on prediction and localization of student engagement in response to a stimulus video (e-learning environment) from facial expressions using Deep Multiple Instance Learning.
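The weakly supervised setup above can be sketched as follows: each stimulus video is a "bag" of short segments (instances), only the bag carries an engagement label, and max-pooling over per-instance scores both predicts the bag label and localizes the most engaging segment. This is a toy illustration with a linear scorer standing in for a deep network; all names and data are hypothetical, not the paper's actual model.

```python
import numpy as np

# Multiple instance learning (MIL) sketch: a video is a bag of segment
# features; only the bag has a label. Max-pooling over instance scores
# yields a bag prediction and localizes the highest-scoring segment.
rng = np.random.default_rng(42)
w = rng.normal(size=4)  # linear instance scorer (stand-in for a deep net)

def score_bag(instances, w):
    """Return the bag score and the index of the best-scoring instance."""
    scores = instances @ w          # one score per video segment
    best = int(np.argmax(scores))   # localization: most engaging segment
    return scores[best], best       # max-pooled bag prediction + location

bag = rng.normal(size=(10, 4))      # 10 segments x 4 features each
bag_score, engaging_segment = score_bag(bag, w)
```

Max-pooling is one common MIL aggregation choice; attention-weighted or mean pooling are alternatives with the same bag-level supervision.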
Estimated the heart rate of different individuals, and its variation over the span of a video, from their facial videos by extracting plethysmograph (PG) signals from the green channel of the frames. Using the estimated heart rate as a feature, individuals are classified into two categories, healthy controls and depressed patients, with a linear SVM classifier.
I did my bachelor's degree in Electronics and Communication Engineering at the National Institute of Technology (NIT), Srinagar, India. During my undergrad I did a couple of research internships at the University of Canberra, Australia and the Indian Institute of Technology (IIT) Ropar. Later, I worked as a computer vision research intern at the Inception Institute of Artificial Intelligence (IIAI), Abu Dhabi, UAE for a year. Currently I am working as a Research Assistant / PhD candidate at the University of Cambridge, UK on applications of machine/deep learning in computer graphics.