BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//wp-events-plugin.com//7.2.3.1//EN
X-WR-TIMEZONE:Asia/Kolkata
BEGIN:VEVENT
UID:170@cds.iisc.ac.in
DTSTART;TZID=Asia/Kolkata:20260105T113000
DTEND;TZID=Asia/Kolkata:20260105T123000
DTSTAMP:20260101T060825Z
URL:https://cds.iisc.ac.in/events/cds-kiac-seminar-cds-102-05th-january-th
 e-three-ps-of-modern-computer-vision-pixels-perception-and-physics/
SUMMARY:CDS-KIAC Seminar @ CDS: #102: 05th January "The Three P's of Mod
 ern Computer Vision: Pixels\, Perception\, and Physics"
DESCRIPTION:We welcome you to the CDS-KIAC talk on 05th January 2026 (Mo
 nday). The details are as follows:\n\nSpeaker: Dr. Anand Bhattad\, Assis
 tant Professor of Computer Science at Johns Hopkins University\nTitle: T
 he Three P's of Modern Computer Vision: Pixels\, Perception\, and Physic
 s\nDate and Time: January 05\, 2026\, 11:30 AM\nVenue: #102\, CDS Semina
 r Hall.\n\nAbstract:\nFor decades\, computer vision has been guided by w
 hat Jite
 ndra Malik and colleagues called the "Three R's": Recognition (what is it?
 )\, Reconstruction (what is its 3D shape?)\, and Reorganization (what belo
 ngs together?). This framework drove extraordinary progress. However\, the
  rise of generative models opens a new frontier: moving from static descri
 ption to dynamic understanding. This talk presents a new paradigm I call t
 he Three P's: Pixels\, Perception\, and Physics.\n\nI will begin with a 
 puzzle: state-of-the-art models generate photorealistic images\, yet t
 heir outputs often contain impossible shadows\, broken perspective\, and o
 bjects that defy gravity\, violating principles established by Galileo fou
 r centuries ago. Are these "world models" or merely sophisticated pixel pa
 rrots?\n\nThe answer is more interesting than yes or no. Through systemati
 c probing\, I will demonstrate that generative models have learned fundame
 ntal scene properties—depth\, surface normals\, albedo\, and shading—w
 ithout explicit supervision\, recovering intrinsic representations that re
 searchers have pursued since Barrow and Tenenbaum's foundational 1978 work
 . These models perceive far more than they appear to. The deeper question 
 is: where exactly does understanding end and imitation begin?\n\nMy resear
 ch maps this boundary. I introduce Visual Jenga\, a new scene understandin
 g task that reveals implicit physical knowledge by testing stability when 
 objects are removed. I demonstrate how representing scenes as 3D primitive
 s enables geometric control that pixel-based editing cannot achieve\, and 
 how physics-based cues\, such as shadows\, can steer generation toward pla
 usibility. I conclude with a vision for the field: The Three R's taught ma
 chines to describe the visual world. The Three P's will teach them to unde
 rstand how it works.\n\nBio of Speaker:\nAnand Bhattad is an Assistant Pro
 fessor of Computer Science at Johns Hopkins University and a member of the
  Data Science and AI Institute. He leads the Pixels\, Perception\, and Phy
 sics (3P) Vision Group\, which focuses on building perception-driven and p
 hysics-aware visual models from raw pixel data. His research spans compute
 r vision\, computer graphics\, and computational photography. Prior to joi
 ning Hopkins\, he was a Research Assistant Professor at the Toyota Technol
 ogical Institute at Chicago and a visiting scholar at UC Berkeley. He earn
 ed his Ph.D. in Computer Science from the University of Illinois Urbana-Ch
 ampaign\, advised by David Forsyth.\n\nHost Faculty: Prof. Venkatesh Ba
 bu\, CDS\n\nALL ARE WELCOME
CATEGORIES:Events,Talks
END:VEVENT
BEGIN:VTIMEZONE
TZID:Asia/Kolkata
X-LIC-LOCATION:Asia/Kolkata
BEGIN:STANDARD
DTSTART:19700101T000000
TZOFFSETFROM:+0530
TZOFFSETTO:+0530
TZNAME:IST
END:STANDARD
END:VTIMEZONE
END:VCALENDAR