BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//wp-events-plugin.com//7.4.0.1//EN
TZID:Asia/Kolkata
X-WR-TIMEZONE:Asia/Kolkata
BEGIN:VEVENT
UID:199@cds.iisc.ac.in
DTSTART;TZID=Asia/Kolkata:20260605T100000
DTEND;TZID=Asia/Kolkata:20260605T110000
DTSTAMP:20260525T144058Z
URL:https://cds.iisc.ac.in/events/ph-d-thesis-colloquium-102-cds-05-june20
 26-towards-reliable-language-model-systems-for-educational-assessment-and-
 adaptive-learning/
SUMMARY:Ph.D. Thesis Colloquium: 102: CDS: 05 June 2026: “Towards Reliab
 le Language Model Systems for Educational Assessment and Adaptive Learning
 ”
DESCRIPTION:DEPARTMENT OF COMPUTATIONAL AND DATA SCIENCES\nPh.D. Thesis Col
 loquium\n\n\n\nSpeaker: Ms. Nicy Scaria\nS.R. Number: 06-18-01-10-12-22-1-
 21645\nTitle: “Towards Reliable Language Model Systems for Educational A
 ssessment and Adaptive Learning”\nResearch Supervisor: Dr. Deepak Subram
 ani\nDate & Time : June 05\, 2026 (Friday)\, 10:00 AM\nVenue : #102\,
  CDS Seminar Hall\n\n\n\nABSTRACT\nDeploying language model systems in edu
 cation requires more than just fluent text generation or answer correctnes
 s. Such systems must support pedagogical alignment\, curriculum relevance\
 , reliable evaluation\, sound reasoning\, diagnosis of learner misconcepti
 ons and skill gaps\, and transparent mechanisms for learner progression. T
 hese requirements become especially important in resource-constrained educ
 ational contexts\, where teachers may have limited time to create high-qua
 lity assessments and learners may lack continuous expert feedback. This th
 esis investigates how language model systems can be designed\, evaluated\,
  and integrated into pedagogically grounded workflows for educational asse
 ssment and adaptive learning. The thesis is structured into three parts. T
 he first part studies automated educational question generation using Larg
 e Language Models (LLMs)\, including curriculum-aligned question
  generation
 \, Bloom’s taxonomy-based prompting\, and structured knowledge-guided MC
 Q generation. The second part examines the reliability and educational uti
 lity of Small Language Models (SLMs) for reasoning and learner assessment.
  The final part develops and positions Learning in Blocks as a structured
  f
 ramework for personalized adaptive learning\, integrating rubric-aligned a
 ssessment\, diagnostic recommendation\, spaced review\, and mastery-based 
 progression.\n\nPART-I\nAutomated Educational Question Generation: The fir
 st part of the thesis focuses on Automated Educational Question Generation
  (AEQG) using LLMs. In many school systems\, including Indian high-school 
 social science education\, assessments often emphasize rote memorization r
 ather than higher-order cognitive skills. To address this limitation\, we 
 examine whether modern LLMs can generate curriculum-relevant and pedagogic
 ally sound questions across Bloom’s taxonomy. The work first studies que
 stion generation for the social science curriculum of an Indian state educ
 ational board and then extends the investigation to prompting strategies f
 or generating questions across cognitive levels more broadly. Expert evalu
 ation shows that LLMs can generate high-quality questions when provided wi
 th adequate context and instructions. However\, the results also reveal va
 riation across models of different sizes and show that automated evaluatio
 n is not yet on par with human expert judgment. These findings demonstrate
  that LLMs can support scalable assessment creation\, but their use in edu
 cational assessment requires careful prompt design\, pedagogical validatio
 n\, and human-grounded evaluation.\n\nStructured Knowledge-Guided MCQ Gene
 ration with Effective Distractors: The thesis further extends AEQG from ge
 neral question generation to structured assessment design. High-quality MC
 Qs must assess conceptual understanding\, target different cognitive level
 s\, and include plausible distractors that reflect common learner misconce
 ptions. Existing automated approaches often struggle to incorporate such d
 omain-specific misconceptions. To address this limitation\, we develop a h
 ierarchical concept map-based framework for generating MCQs in high-school
  physics. The framework represents major physics topics and their intercon
 nections through a structured concept map\, retrieves topic-relevant secti
 ons\, and provides this context to an LLM for question and distractor gene
 ration. Expert and student evaluations show that the concept map-guided ap
 proach outperforms baseline methods and generates questions that more effe
 ctively assess conceptual understanding. These results demonstrate that re
 liable AEQG requires not only powerful language models\, but also structur
 ed representations of domain knowledge and learner misconceptions.\n\nPART
 -II\nReliability of Small Language Models for Educational Reasoning: The s
 econd part of the thesis focuses on the reliability and educational utilit
 y of SLMs. SLMs are attractive for education because they offer efficiency
 \, privacy\, cost\, and deployability advantages\, but their usefulness de
 pends on whether they can reason and evaluate learner performance reliably
 . In learning contexts\, models that produce correct final answers through
  incorrect procedures may reinforce misconceptions and provide misleading 
 feedback. To study this issue\, we introduce PhysBench\, a benchmark of hi
 gh-school and AP-level physics questions with structured reference solutio
 ns\, Bloom’s taxonomy annotations\, and culturally contextualized varian
 ts. Using a stage-wise evaluation rubric\, we assess SLM responses to exam
 ine reasoning reliability\, failure modes\, and robustness under contextua
 l variations. The results show that many solutions with correct final answ
 ers still contain reasoning errors\, demonstrating that answer accuracy al
 one is
  insufficient for evaluating educational AI systems.\n\nSLM-Based CEFR Spe
 aking Assessment: The thesis further examines the potential of SLMs for au
 tomated learner assessment when adapted using high-quality\, criterion-ali
 gned data. In language learning\, human evaluation of CEFR speaking assess
 ments creates scalability challenges in e-learning environments. To addres
 s this problem\, we develop EvalYaks\, a family of instruction-tuned model
 s for automated evaluation of CEFR B2 English speaking assessment transcri
 pts. The work evaluates open-source and commercial language models for CEF
 R-aligned scoring\, creates expert-validated synthetic conversational data
 sets\, and uses parameter-efficient instruction tuning to adapt Mistral fo
 r speaking assessment\, vocabulary-level identification and generation\, a
 nd text-level identification and generation. EvalYaks achieves performance
  competitive with frontier models\, and pilot validation on real-world lea
 rner transcripts verifies its transferability to practical assessment cont
 exts. This work demonstrates that carefully adapted SLMs can support scala
 ble language proficiency evaluation when trained with expert-validated\, c
 riterion-aligned data.\n\nPART-III\nLearning in Blocks for Personalized Ad
 aptive Language Learning: The final part of the thesis develops Learning i
 n Blocks\, an adaptive learning framework that connects learner assessment
  with targeted review and mastery-based progression. In digital language l
 earning\, learners can often advance through quiz-based curricula despite 
 persistent gaps in using grammar and vocabulary during interaction. To add
 ress this limitation\, Learning in Blocks grounds progression in demonstra
 ted conversational competence evaluated through CEFR-aligned rubrics. The 
 framework uses heterogeneous multi-agent debate to evaluate Grammar\, Voca
 bulary\, and Interactive Communication\, resolve conflicting judgments\, a
 nd identify specific grammar skills and vocabulary topics for targeted rev
 iew. Learners progress only after demonstrating mastery\, while spaced rev
 iew targets identified weaknesses to counter skill decay. Expert-annotat
 ed conversation benchmarks and a learner study show that combining rubric-
 aligned scoring\, diagnostic recommendation\, spaced review\, and mastery-
 based progression improves learning outcomes.\n\nLearning in Blocks as a D
 esign Pattern for Adaptive Learning Systems: The thesis concludes by posit
 ioning Learning in Blocks as a broader design pattern for responsible lang
 uage-model-supported learning. Open-ended chatbot interfaces are flexible\
 , but they can make it difficult to constrain system behavior\, align inte
 ractions with curriculum goals\, and connect learner activity to demonstra
 ted skill performance. In contrast\, Learning in Blocks organizes learning
  into blocks of target and prerequisite skills\, where bounded pedagogical
  agents support assessment generation\, assessment evaluation\, diagnostic
  recommendation\, spaced review\, and mastery-based progression. Together\
 , these components define Learning in Blocks as a transparent\, auditable\
 , and pedagogically aligned adaptive learning framework supported by langu
 age model systems.\n\n\n\nALL ARE WELCOME
CATEGORIES:Events,Ph.D. Thesis Colloquium
END:VEVENT
BEGIN:VTIMEZONE
TZID:Asia/Kolkata
X-LIC-LOCATION:Asia/Kolkata
BEGIN:STANDARD
DTSTART:20250605T100000
TZOFFSETFROM:+0530
TZOFFSETTO:+0530
TZNAME:IST
END:STANDARD
END:VTIMEZONE
END:VCALENDAR