
Ph.D. Thesis Defense (Hybrid): CDS, 30 March 2023: "Self-Supervised Domain Adaptation Frameworks for Computer Vision Tasks"
30 Mar @ 4:30 PM -- 5:30 PM
DEPARTMENT OF COMPUTATIONAL AND DATA SCIENCES
__________________________________________________________________________________________
Speaker : Mr. Jogendra Nath Kundu
S.R. Number : 06-18-02-10-12-16-1-13876
Title : “Self-Supervised Domain Adaptation Frameworks for Computer Vision Tasks”
Research Supervisor: Prof. Venkatesh Babu R
Date & Time : March 30, 2023 (Thursday), 4:30 PM
Venue : The thesis defense will be held in hybrid mode: # 102 CDS Seminar Hall / Microsoft Teams
Please click on the following link to join the Thesis Defense:
MS Teams link:
_____________________________________________________________________________________________________________
ABSTRACT
There is a strong incentive to build intelligent machines that can understand and adapt to changes in the visual world without human supervision. While humans and animals learn to perceive the world on their own, almost all state-of-the-art vision systems rely heavily on external supervision from millions of manually annotated training examples. Gathering such large-scale annotations for structured vision tasks, such as monocular depth estimation, scene segmentation, and human pose estimation, faces several practical limitations. The annotations are usually gathered in two broad ways: (1) via specialized instruments (sensors) or laboratory setups, or (2) via manual annotation. Both processes have drawbacks: human annotations are expensive, scarce, or error-prone, while instrument-based annotations are often noisy or limited to specific laboratory environments. These limitations not only stand as a major bottleneck in our efforts to gather unambiguous ground truth but also limit the diversity of the collected labeled datasets. This motivates us to develop innovative ways of utilizing synthetic environments to create labeled synthetic datasets with noise-free, unambiguous ground truths. However, the performance of models trained on such synthetic data degrades markedly when tested on real-world samples due to input distribution shift (a.k.a. domain shift). Unsupervised domain adaptation (DA) seeks learning techniques that minimize the domain discrepancy between a labeled source and an unlabeled target. However, it remains largely unexplored for challenging structured-prediction vision tasks.
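As a rough illustration (not part of the thesis itself), the "domain discrepancy" that unsupervised DA seeks to minimize can be measured with a simple statistic such as the linear-kernel Maximum Mean Discrepancy, the squared distance between the mean feature vectors of the two domains. The sketch below uses toy NumPy features and hypothetical names:

```python
import numpy as np

def linear_mmd(source, target):
    """Linear-kernel Maximum Mean Discrepancy: squared Euclidean
    distance between the feature means of the two domains."""
    return float(np.sum((source.mean(axis=0) - target.mean(axis=0)) ** 2))

rng = np.random.default_rng(0)
src = rng.normal(loc=0.0, size=(500, 8))          # "source"-domain features
tgt_same = rng.normal(loc=0.0, size=(500, 8))     # same distribution
tgt_shifted = rng.normal(loc=1.0, size=(500, 8))  # simulated domain shift

print(linear_mmd(src, tgt_same))     # small: domains already aligned
print(linear_mmd(src, tgt_shifted))  # large: noticeable domain shift
```

Adversarial DA methods, such as those discussed in the abstract, effectively drive a statistic like this toward zero through a learned feature extractor rather than computing it in closed form.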
Motivated by the above observations, my research focuses on addressing the following key aspects: (1) Developing algorithms that support improved transferability to domain and task shifts, (2) Leveraging inter-entity or cross-modal relationships to develop self-supervised objectives, and (3) Instilling natural priors to constrain the model output within the realm of natural distributions.
First, we present AdaDepth, an unsupervised domain adaptation (DA) strategy for the pixel-wise regression task of monocular depth estimation. Mode collapse is a common phenomenon during adversarial training in the absence of paired supervision; without access to target depth maps, we address this challenge using a novel content-congruent regularization technique. In follow-up work, we introduced UM-Adapt, a unified framework that addresses two distinct objectives in a multi-task adaptation setting: (a) achieving balanced performance across all tasks and (b) performing domain adaptation in an unsupervised setting. This is realized using two novel regularization strategies: contour-based content regularization and the exploitation of inter-task coherency through a novel cross-task distillation module. Moving forward, we identified key issues in existing domain adaptation algorithms that substantially hinder their practical deployability. Existing approaches demand the coexistence of source and target data, which is highly impractical in scenarios where data sharing is restricted by proprietary or privacy concerns. To address this, we propose a new setting, termed Source-Free DA, along with tailored learning protocols for the dense prediction task of semantic segmentation and for image classification in the presence of category shift.
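To give a flavor of the source-free setting (this is a generic illustration, not the thesis's specific protocol): with no source data available, adaptation must rely only on a source-trained model's own predictions on the unlabeled target, for example by keeping confident predictions as pseudo-labels. The sketch below assumes toy logits and hypothetical names:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def entropy(p):
    """Per-sample prediction entropy (uncertainty of the model)."""
    return -(p * np.log(p + 1e-12)).sum(axis=1)

# Toy target-domain logits from a frozen, source-trained model (hypothetical).
logits = np.array([[2.0, 0.1, 0.1],   # confident prediction
                   [0.5, 0.4, 0.6]])  # uncertain prediction
probs = softmax(logits)

# Keep only low-entropy (confident) predictions as pseudo-labels.
conf_mask = entropy(probs) < 0.8
pseudo_labels = probs.argmax(axis=1)[conf_mask]
print(pseudo_labels)  # only the confident sample survives
```

In a full pipeline, the retained pseudo-labels would supervise further training on the target domain; the uncertain samples are discarded rather than propagated as noisy labels.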
==================================================================
ALL ARE WELCOME