M.Tech (Research) Thesis Defense: ONLINE: CDS: 10 October 2022: “Landmark Estimation and Image Synthesis Guidance using Self-Supervised Networks”
10 October 2022, 3:00 PM – 4:00 PM
DEPARTMENT OF COMPUTATIONAL AND DATA SCIENCES
Speaker : Mr. Tejan Naresh Naik Karmali
S.R. Number : 06-18-02-10-22-19-1-16613
Title : “Landmark Estimation and Image Synthesis Guidance using Self-Supervised Networks.”
Research Supervisor: Prof. Venkatesh Babu
Date & Time : October 10, 2022 (Monday), 03:00 PM
Venue : #102, CDS Seminar Hall
In the first part, we demonstrate the emergent correspondence-tracking properties of the non-contrastive self-supervised learning (SSL) framework. Using these correspondences as supervision, we propose LEAD, an approach to discovering landmarks from an unannotated collection of category-specific images. Existing works in self-supervised landmark detection learn dense (pixel-level) feature representations from an image, which are then used to learn landmarks in a semi-supervised manner. While there have been advances in self-supervised learning of image features for instance-level tasks such as classification, these methods do not ensure dense equivariant representations. Equivariance, however, is precisely the property of interest for dense prediction tasks like landmark estimation. We therefore introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion, following a two-stage training procedure: first, we train a network with the BYOL objective, which operates at the instance level; the correspondences obtained through this network are then used to train a dense and compact representation of the image with a lightweight network. We show that such a prior in the feature extractor aids landmark detection even under a drastically limited number of annotations, while also improving generalization across scale variations.
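The correspondence supervision above can be pictured as nearest-neighbour matching between dense feature maps of two views of an image. The sketch below is illustrative only, assuming flattened (H·W, C) feature maps and cosine similarity; the function name and shapes are not from the thesis code.

```python
import numpy as np

def dense_correspondences(feat_a, feat_b):
    """Match each spatial location in feat_a to its nearest neighbour in
    feat_b by cosine similarity. feat_*: (H*W, C) dense feature maps."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    sim = a @ b.T                # (H*W, H*W) cosine similarity matrix
    return sim.argmax(axis=1)    # for each location in a, best match in b

# Toy sanity check: identical feature maps match each location to itself.
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))        # 16 locations, 8-dim features
matches = dense_correspondences(feats, feats)
```

In the two-stage setup described above, such matches between views would serve as pseudo-labels for training the lightweight dense network.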
Next, we utilize the rich feature space of the SSL framework as a “naturalness” prior to alleviate unnatural image generation by Generative Adversarial Networks (GANs), a popular class of generative models. Progress in GANs has enabled the generation of high-resolution photorealistic images of astonishing quality. StyleGANs allow compelling attribute modification on such images via mathematical operations on the latent style vectors in the W/W+ space, which effectively modulate the rich hierarchical representations of the generator. Such operations have recently been generalized beyond the mere attribute swapping of the original StyleGAN paper to include interpolations. Despite many significant improvements, StyleGANs are still seen to generate unnatural images. The quality of the generated images is a function of (a) the richness of the hierarchical representations learned by the generator, and (b) the linearity and smoothness of the style spaces. In this work, we propose the Hierarchical Semantic Regularizer (HSR), which aligns the hierarchical representations learned by the generator with the corresponding powerful features learned by networks pretrained on large amounts of data. HSR improves not only the generator representations but also the linearity and smoothness of the latent style spaces, leading to the generation of more natural-looking style-edited images. To demonstrate the improved linearity, we propose a novel metric, the Attribute Linearity Score (ALS). A significant reduction in the generation of unnatural images is corroborated by a 15% improvement in the Perceptual Path Length (PPL) metric across different standard datasets, while simultaneously improving the linearity of attribute change in attribute-editing tasks.
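The alignment idea behind HSR can be sketched as a per-level feature-matching penalty: at each level of the generator's hierarchy, penalize the distance between the generator's (suitably projected) features and the pretrained network's features. This is a minimal sketch under that assumption; the actual HSR loss, projections, and level weighting may differ.

```python
import numpy as np

def feature_matching_loss(gen_feats, target_feats):
    """Hedged sketch of a hierarchical feature-matching regularizer:
    mean squared distance between generator features and pretrained
    'naturalness' features, averaged over the levels of the hierarchy.
    gen_feats, target_feats: equal-length lists of same-shape arrays."""
    assert len(gen_feats) == len(target_feats)
    per_level = [np.mean((g - t) ** 2) for g, t in zip(gen_feats, target_feats)]
    return float(np.mean(per_level))
```

During training, a term like this would be added to the usual adversarial objective, nudging intermediate generator representations toward the pretrained feature manifold.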