Ph.D. Thesis Defense (Online), CDS, 15 March 2023: “Methods for Improving Data-efficiency and Trustworthiness using Natural Language Supervision”

15 Mar @ 2:00 PM -- 3:00 PM


Ph.D. Thesis Defense (Online)


Speaker: Mr. Sawan Kumar

S.R. Number: 06-18-02-10-12-17-1-15167

Title: “Methods for Improving Data-efficiency and Trustworthiness using Natural Language Supervision”

Research Supervisor: Prof. Partha Pratim Talukdar

Date & Time: 15th March 2023 (Wednesday), 02:00 PM

Venue: The Thesis Defense will be held on Microsoft Teams

Please click on the following link to join the Thesis Defense:

MS Teams link:






Traditional strategies to build machine-learning-based classification systems employ discrete labels as targets. This limits the usefulness of such systems in two ways. First, the generalizability of these systems is limited to labels present and well represented in the training data. Second, with increasingly larger neural network models gaining acceptability, supervision with discrete labels alone does not lead to a straightforward interface for generating explanations for the decisions taken by such systems. Natural Language (NL) Supervision (NLS), in the form of task descriptions, examples, label descriptions and explanations for labelling decisions, provides a way to overcome these bottlenecks. Working in this paradigm, we propose novel methods for improving data-efficiency and trustworthiness:

(1) Data Efficiency using NLS: Word Sense Disambiguation (WSD) using Sense Definition Embeddings


WSD, a long-standing open problem in Natural Language Processing (NLP), typically presents small training corpora with long-tailed label distributions. Existing supervised methods did not generalize well to rare or unseen classes, while NL-supervision-based systems performed worse on overall (standard) evaluation benchmarks. We propose Extended WSD Incorporating Sense Embeddings (EWISE), a supervised model that performs WSD by predicting over a continuous sense embedding space as opposed to a discrete label space. This allows EWISE to generalize over both seen and unseen senses, thus achieving generalized zero-shot learning. To obtain target sense embeddings, EWISE utilizes NL sense definitions along with external knowledge from WordNet relations. EWISE achieved new state-of-the-art WSD performance at the time of publication, specifically by improving zero-shot and few-shot learning.
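
The core idea of predicting over a continuous sense-embedding space, rather than a discrete label set, can be sketched as follows. The embeddings and the `rank_senses` helper below are toy stand-ins for EWISE's learned definition encoder, not the actual model:

```python
def rank_senses(context_vec, sense_embeddings):
    """Score each candidate sense by the dot product of its definition
    embedding with the context representation; return senses ranked
    highest-score first. Unseen senses are handled the same way as seen
    ones, since only their definition embeddings are needed."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    scores = {sense: dot(context_vec, emb)
              for sense, emb in sense_embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy 3-d vectors standing in for learned sense-definition embeddings.
senses = {
    "bank.financial": [0.9, 0.1, 0.0],
    "bank.river":     [0.0, 0.2, 0.9],
}
context = [1.0, 0.0, 0.1]  # stands in for an encoding of "deposit money at the bank"
prediction = rank_senses(context, senses)[0]
```

Because scoring is done against embeddings rather than a fixed output layer, adding a new sense only requires embedding its definition, which is what enables the generalized zero-shot behaviour described above.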

(2) Trustworthiness using NLS: Natural Language Inference (NLI) with Faithful NL Explanations

Generated NL explanations are expected to be faithful, i.e., they should correlate well with the model’s internal decision making. In this work, we focus on the task of NLI and address the following question: can we build NLI systems that produce labels with high accuracy while also generating faithful explanations of their decisions? We propose Natural-language Inference over Label-specific Explanations (NILE), a novel NLI method which utilizes auto-generated label-specific NL explanations to produce a label along with its faithful explanation. Our evaluation of NILE also supports the claim that accurate systems capable of providing testable explanations of their decisions can be designed.
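
The label-specific structure can be sketched as a two-stage pipeline: generate one candidate explanation per possible label, then let a selector score each (label, explanation) pair. The `generate_explanation` and `score` stubs below are illustrative stand-ins for the trained generator and explanation processor, not NILE's actual models:

```python
CANDIDATE_LABELS = ["entailment", "contradiction", "neutral"]

def generate_explanation(premise, hypothesis, label):
    # Stand-in for a label-specific generator (e.g. a fine-tuned LM
    # trained on human explanations for that label).
    return f"Assuming {label}: relates '{premise}' to '{hypothesis}'."

def score(premise, hypothesis, explanation):
    # Stand-in for the explanation processor; here it trivially prefers
    # the entailment explanation so the demo is deterministic.
    return 1.0 if "entailment" in explanation else 0.0

def nile_predict(premise, hypothesis):
    """Return the best label together with the explanation that
    supported it, so the explanation is tied to the decision."""
    candidates = {label: generate_explanation(premise, hypothesis, label)
                  for label in CANDIDATE_LABELS}
    best = max(candidates,
               key=lambda lbl: score(premise, hypothesis, candidates[lbl]))
    return best, candidates[best]
```

Because the label is chosen *via* its explanation, the returned explanation is testable: changing the explanation changes the prediction, which is the sense of faithfulness discussed above.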


(3) Improving the NLS Interface of Large Language Models (LLMs)

LLMs, pre-trained on unsupervised corpora, have proven to be successful as zero-shot and few-shot learners on downstream tasks using only a textual interface. This enables a promising NLS interface. A typical usage involves augmenting an input example with priming text, comprising task descriptions and training examples, and processing the output probabilities to make predictions. In this work, we further explore priming-based few-shot learning and make the following contributions:
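
The typical priming setup described above can be sketched as prompt assembly. The `build_prompt` helper and its `Input:`/`Label:` format are a generic illustration, not any specific paper's template:

```python
def build_prompt(task_description, train_examples, query):
    """Assemble a priming prompt: a task description, a few labelled
    training examples, then the unlabelled query. The LLM's output
    probabilities at the final 'Label:' position would then be read off
    to make a prediction."""
    shots = "\n".join(f"Input: {x}\nLabel: {y}" for x, y in train_examples)
    return f"{task_description}\n\n{shots}\n\nInput: {query}\nLabel:"

prompt = build_prompt(
    "Classify the sentiment as positive or negative.",
    [("I loved it.", "positive"), ("Terrible film.", "negative")],
    "What a great day!",
)
```

No model weights are updated in this setup; all supervision is delivered through the text of the prompt, which is why the two contributions below focus on how that prompt is constructed and how its outputs are read.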

(a) Reordering Examples Helps during Priming-based Few-Shot Learning: We show that presenting training examples in the right order is key for generalization. We introduce PERO (Prompting with Examples in the Right Order), where we formulate few-shot learning as search over the set of permutations of the training examples. We demonstrate the effectiveness of the proposed method on the tasks of sentiment classification, natural language inference and fact retrieval. We show that, in contrast to existing approaches, PERO can learn to generalize efficiently using as few as 10 examples.
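
The search-over-permutations formulation can be sketched as below. PERO does not enumerate all permutations (which quickly becomes infeasible, so the paper uses a more efficient search); exhaustive enumeration is shown only to make the objective concrete, and `evaluate` stands in for scoring a candidate prompt against held-out labelled data:

```python
from itertools import permutations

def pero_search(train_examples, dev_set, evaluate):
    """Try each ordering of the training examples as a priming prompt
    and keep the ordering that scores best on the dev set."""
    best_order, best_score = None, float("-inf")
    for order in permutations(train_examples):
        prompt = "\n".join(order)
        s = evaluate(prompt, dev_set)
        if s > best_score:
            best_order, best_score = list(order), s
    return best_order, best_score

examples = ["a", "b", "c"]

def toy_eval(prompt, dev_set):
    # Hypothetical scorer standing in for LLM accuracy on dev_set;
    # here it simply prefers prompts that start with "b".
    return 1.0 if prompt.startswith("b") else 0.0

order, score = pero_search(examples, [], toy_eval)
```

The key point the sketch preserves is that only the *order* of a fixed, small example set is optimized; the examples themselves and the model are untouched.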

(b) Answer-level Calibration (ALC) Helps Free-form Multiple Choice Question Answering (QA): We consider the QA format where, given a context, we need to choose from a set of free-form textual choices of unspecified lengths. We present ALC, whose main idea is to model context-independent biases in terms of the probability of a choice without the associated context, and to subsequently remove these biases using an unsupervised estimate of similarity with the full context. ALC improves zero-shot and few-shot performance on several benchmarks while also providing a more reliable estimate of performance.
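
The bias-removal idea can be sketched as a simple log-probability correction. This sketch omits ALC's unsupervised context-similarity estimate and uses made-up probabilities purely for illustration:

```python
import math

def calibrated_score(logp_with_context, logp_without_context):
    """Debit each choice's score by its context-free log-probability,
    so choices the model favours regardless of the context (e.g. short
    or frequent answers) stop being unfairly preferred."""
    return logp_with_context - logp_without_context

# Toy log-probabilities for two free-form choices of different lengths.
choices = {
    "Paris": {"with": math.log(0.6), "without": math.log(0.5)},
    "Lyon":  {"with": math.log(0.3), "without": math.log(0.05)},
}
best = max(choices,
           key=lambda c: calibrated_score(choices[c]["with"],
                                          choices[c]["without"]))
```

In the toy numbers above, "Paris" has the higher raw probability, but almost all of it is context-independent bias; after calibration the choice whose probability actually depends on the context wins.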



