Yogesh Simmhan is an Associate Professor in the Department of Computational and Data Sciences and a Swarna Jayanti Fellow at the Indian Institute of Science, Bangalore. His research explores scalable software platforms, algorithms and applications on distributed systems. These span Cloud and Edge Computing, Temporal Graph Processing, and Distributed Storage and Machine Learning to support emerging Big Data and Internet of Things (IoT) applications. He has published over 100 peer-reviewed papers, and won the Best Paper Award at the IEEE International Conference on Cloud Computing (CLOUD) 2019, the IEEE TCSC SCALE Challenge Award in 2019 and 2012, the Distinguished Paper award at EuroPar 2018, and the IEEE/ACM Supercomputing HPC Storage Challenge Award in 2008. He is the recipient of the IEEE TCSC Award for Excellence in Scalable Computing (Mid Career Researcher) in 2020. He is an Associate Editor-in-Chief of the Journal of Parallel and Distributed Computing (JPDC), an Associate Editor of Future Generation Computer Systems (FGCS), and earlier served as an Associate Editor of IEEE Transactions on Cloud Computing and a member of the IEEE Future Directions Initiative on Big Data.

Yogesh has a Ph.D. in Computer Science from Indiana University, Bloomington, and was previously a Research Assistant Professor at the University of Southern California (USC), Los Angeles, and a Postdoc at Microsoft Research, San Francisco. He is a Distinguished Member of ACM, a Distinguished Contributor of the IEEE Computer Society and serves on the ACM India Executive Council.

My research is on distributed and scalable data platforms to support Big Data, Internet of Things (IoT), UAV and computer vision applications on novel computing infrastructure, such as Clouds and Edge devices. I lead the DREAM:Lab - Distributed Research on Emerging Applications and Machines - at CDS.

We have open positions for Ph.D. students, postdocs and Research and Development staff in our group to work on some of these exciting projects! Candidates should have expertise in Big Data platforms, Edge/Cloud Computing and Applied Machine Learning, with strong programming, algorithms and systems skills. Research students need to apply to the research degree admissions at the CDS department at IISc, and choose the DREAM:Lab as one of their lab choices. See here for staff position details.

Some active research areas are:

  • Temporal Graphs: Platforms, Algorithms and Analytics. Graphs whose structure and properties vary over time are common, yet less examined in the literature.
    •   We have developed a novel Interval-centric Computing Model (ICM) [ICDE2020, EuroSys2022] that allows time-respecting and time-independent algorithms to be defined over temporal graphs. Graphite is its scalable implementation over Apache Giraph. Over 10 graph algorithms have been mapped to ICM, and Graphite scaled to graphs with over 130M interval vertices and 5.5B interval edges on an 8-node commodity cluster. We are also examining optimizations that improve performance through windowing approaches and incremental execution.
    •   We have explored low-latency path queries over temporal property graphs, which has been published as the Granite system [CCGRID2020,JPDC2021], with a novel query cost model to optimize distributed execution.
    •   There are several new and ongoing projects related to temporal graphs: scalable training of Graph Neural Networks (GNN), incremental computing over temporal and streaming graph updates, memory-efficient out-of-core and window-based graph processing, streaming partitioning of large graphs to conserve local community structures, and temporal graph centrality methods to identify high-risk populations using COVID-19 contact trace networks as part of the GoCoronaGo project.
    •   We are also exploring high-performance temporal and streaming graph analytics as part of the National Supercomputing Mission, jointly with IIT-H and IIIT-H, with an emphasis on parallel algorithmic patterns and application resiliency.
    •   In the past, we have also examined the use of cloud elasticity to scale graph processing [CLOUD2019] and subgraph-centric processing of temporal graphs [IPDPS2015], besides a survey on scalable graph processing frameworks [CSUR2018].
  • Distributed Analytics and Storage on Edge, Cloud and Drones. Edge and Fog computing resources are an emerging computing paradigm, with their availability growing as part of IoT deployments. Edge devices like Jetson also have on-board GPU accelerators. Our research explores analytics and storage platforms over edge, fog and cloud resources, including fleets of drones/UAVs.
    •   Federated learning over edge devices is an important problem, given the widespread availability of accelerated edge and mobile/smartphone devices, and their collocation with video data sources. Our emphasis is on the systems aspects of federated learning, such as the optimization of individual edge accelerators for training [PAISE2022], and the scheduling and orchestration of deep models to efficiently utilize 100s of edge devices and accelerators in a wide-area network while trading off accuracy, resiliency, performance and privacy.
    •   Anveshak is a domain-specific model and platform for distributed video analytics, which trades off scalability, accuracy and latency when running DNN models on edge, fog and cloud resources [TPDS2021]. It won the IEEE TCSC SCALE challenge in 2019 [SCALE2019].
    •   An active area of interest is computing, data management and scheduling for unmanned aerial vehicles (UAVs), or drones [INFOCOM2021]. Open problems include UAV routing for complex missions; where to schedule machine learning models for execution across UAV and backend; balancing compute, network and energy capacity against application deadlines in the context of 5G communications; and computer vision and tracking algorithms to use drones to assist the visually impaired.
    •   ElfStore is a distributed storage platform for the edge that is designed based on P2P and HDFS concepts [ICWS2019]. We are currently examining consistency models, caching strategies and mobility of edge devices for distributed storage. We are also exploring storing and querying over time-series data using distributed edge devices [EUROPAR2020].
    •   With the growing availability of large-scale video data from city-scale camera networks and drone cameras, and of intelligent deep models to perform inferencing over them, there is a critical need for NoSQL databases to manage large video repositories. We are exploring distributed video storage and querying systems with native query capabilities for inferencing using DNNs and spatio-temporal characteristics, in a privacy-preserving manner. These should also leverage edge accelerators that may be available, with trade-offs between a priori indexing and inferencing at ingest time, and on-demand inferencing at query time.
    •   Platforms for large IoT and edge deployments are difficult to validate due to lack of access to edge clusters with 1000s of devices. We developed the VIoLET container-based emulation environment for deploying large-scale edge and fog testbeds on which to validate these platforms [EUROPAR2018,TCPS2021]. We are extending this to Ultra-VIoLET and CORNET, which will support diverse network configurations, device mobility and energy constraints, and couple the computing and network models with physical system simulators such as Gazebo and SUMO for drone, robot and vehicle mobility [COMSNET2020,COMSNET2022].
    •   I coordinate the IBM-IISc Hybrid Cloud lab, a collaboration between faculty at IISc and researchers at IBM to explore the role of AI and verification in the efficient management of distributed information, data center operations and microservices within hybrid cloud and edge.
    •   In the past, we have also examined dataflow execution engines [ICSOC2017], dataflow scheduling [TCPS2017, CCGRID2018, CCGRID2022] and have a survey article on scheduling on edge, fog and cloud resources [SPE2019].
  • Scalable Data Management and Analytics for Science and Society. We engage with our science and engineering collaborators on multi-disciplinary projects of social and scientific impact.
    •   In this era of COVID-19, our team has developed the GoCoronaGo Contact Tracing App for federated collection of Bluetooth-based proximity data at the institutional scale [JIISC2020]. Various temporal graph techniques are used to assign contact risk scores to users, to help with preventive measures and to perform digital contact tracing if a COVID case is found. This is being deployed at the IISc campus.
    •   SATVAM is an Indo-US project on low-cost air quality monitoring in urban spaces, with IIT-K, IIT-B and Duke University. Our group is examining means for autonomous monitoring and management of the IoT fabric, and machine learning models that improve the calibration of low-cost commodity sensors to enhance their accuracy [ESCIENCE2019,AMT2021].
    •   The Genome India Project is a new pan-India initiative for next generation genome sequencing of 20,000 subjects. We are part of a 20+ member consortium, led by the Center for Brain Research at IISc. We are investigating reliable, scalable and affordable storage and management of the sequencing data, and graph-based analytics over it [HIPCW2019].
    •   EQWATER is a project supported by the IMPRINT program to ensure equitable supply of water in mega-cities. We are exploring network analytics for optimizing supply schedules and management of data from field devices. In the past, we have proposed an IoT software architecture for data-driven smart city utilities [SPE2018].

ORCID: 0000-0003-4140-7774 | Google Scholar | DBLP

Recent publications since 2020 are listed below. See here for all publications

  1. Animesh Baranawal and Yogesh Simmhan, Optimizing the Interval-centric Distributed Computing Model for Temporal Graph Algorithms, European Conference on Computer Systems (EuroSys), 2022, (Artifact Functional Badge)
  2. Prateeksha Varshney, Shriram Ramesh, Shayal Chhabra, Aakash Khochare and Yogesh Simmhan, Resilient Execution of Data-triggered Applications on Edge, Fog and Cloud Resources, IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2022
  3. Prashanthi S K, Aakash Khochare, Sai Anuroop Kesanapalli, Rahul Bhope and Yogesh Simmhan, Don't Miss the Train: A Case for Systems Research into Training on the Edge, Workshop on Parallel AI and Systems for the Edge (PAISE), collocated with IPDPS, 2022
  4. Srikrishna Acharya, Bharadwaj Amrutur, Mukunda Bharathesa and Yogesh Simmhan, CORNET 2.0: A Co-Simulation Middleware for Robot Networks, International Conference on COMmunication Systems & NETworkS (COMSNETS), 2022, 10.1109/COMSNETS53615.2022.9668501
  5. Shriram Ramesh, Animesh Baranawal and Yogesh Simmhan, Granite: A Distributed Engine for Scalable Path Queries over Temporal Property Graphs, Journal of Parallel and Distributed Computing (JPDC), Vol. 151, Pages 94-111, May 2021, 10.1016/j.jpdc.2021.02.004 (CORE A*)
  6. Aakash Khochare, Aravindhan Krishnan and Yogesh Simmhan, A Scalable Platform for Distributed Object Tracking across a Many-camera Network, IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 32, Pages 1479-1493, June 2021, 10.1109/TPDS.2021.3049450 (CORE A*)
  7. Aakash Khochare, Yogesh Simmhan, Francesco Betti Sorbelli and Sajal K. Das, Heuristic Algorithms for Co-scheduling of Edge Analytics and Routes for UAV Fleet Missions, IEEE International Conference on Computer Communications (INFOCOM), 2021, 10.1109/INFOCOM42981.2021.9488740 (CORE A*)
  8. Shrey Baheti, Parwat Singh Anjana, Sathya Peri and Yogesh Simmhan, DiPETrans: A Framework for Distributed Parallel Execution of transactions of Blocks in Blockchain, Concurrency and Computation: Practice and Experience, 2021, 10.1002/cpe.6804
  9. Shrey Baheti, Shreyas Badiger and Yogesh Simmhan, VIoLET: An Emulation Environment for Validating IoT Deployments at Large-Scales, ACM Transactions on Cyber Physical Systems (TCPS), 5(3), 2021, 10.1145/3446346
  10. Amrita Namtirtha, Animesh Dutta, Biswanath Dutta, Amritha Sundararajan and Yogesh Simmhan, Best Influential Spreaders Identification Using Network Global Structural Properties, Nature Scientific Reports, 2021, 10.1038/s41598-021-81614-9
  11. Manoj K Agarwal, Animesh Baranawal, Yogesh Simmhan, Manish Gupta, Event Related Data Collection from Microblog Streams, International Conference on Database and Expert Systems Applications (DEXA), 2021, 10.1007/978-3-030-86475-0_31
  12. Ravi Sahu, Ayush Nagal, Kuldeep Kumar Dixit, Harshavardhan Unnibhavi, Srikanth Mantravadi, Srijith Nair, Yogesh Simmhan, Brijesh Mishra, Rajesh Zele, Ronak Sutaria, Vidyanand Motiram Motghare, Purushottam Kar and Sachchida Nand Tripathi, Robust statistical calibration and characterization of portable low-cost air quality monitoring sensors to quantify real-time O3 and NO2 concentrations in diverse environments, Atmospheric Measurement Techniques (AMT), 14, 37-52, 2021, 10.5194/amt-14-37-2021
  13. Srikrishna Acharya, S Sadgun S Devanahalli, Alok Rawat, Varghese P Kuruvilla, Pratik Sharma, Bharadwaj Amrutur, Ashish Joglekar, Raghu Krishnapuram, Yogesh Simmhan and Himanshu Tyagi, Network Emulation For Tele-driving Application Development, International Conference on COMmunication Systems & NETworkS (COMSNETS), 2021, 10.1109/COMSNETS51098.2021.9352914
  14. Prateeksha Varshney and Yogesh Simmhan, Characterizing application scheduling on edge, fog, and cloud computing resources, Software: Practice and Experience, 50 (5), 2020, pp. 558-595, 10.1002/spe.2699
  15. Yogesh Simmhan, Tarun Rambha, Aakash Khochare, Shriram Ramesh, Animesh Baranawal, John Varghese George, Rahul Atul Bhope, Amrita Namtirtha, Amritha Sundararajan, Sharath Suresh Bhargav, Nihar Thakkar and Raj Kiran, GoCoronaGo: Privacy Respecting Contact Tracing for COVID-19 Management, Journal of the Indian Institute of Science, Vol. 100, 2020, 10.1007/s41745-020-00201-5
  16. Shriram Ramesh, Animesh Baranawal and Yogesh Simmhan, A Distributed Path Query Engine for Temporal Property Graphs, IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), 2020, pp. 499-508, 10.1109/CCGrid49817.2020.00-43 (CORE A)
  17. Swapnil Gandhi and Yogesh Simmhan, An Interval-centric Model for Distributed Computing over Temporal Graphs, IEEE International Conference on Data Engineering (ICDE), pp. 1129-1140, 2020, 10.1109/ICDE48307.2020.00102 (CORE A*)
  18. Srikrishna Acharya, Amrutur Bharadwaj, Yogesh Simmhan, Aditya Gopalan, Parimal Parag and Himanshu Tyagi, CORNET: A Co-Simulation Middleware for Robot Networks, IEEE International Conference on COMmunication Systems & NETworkS (COMSNETS), 2020, pp. 245-251, 10.1109/COMSNETS48256.2020.9027459
  19. Dhruv Garg, Prathik Shirolkar, Anshu Shukla and Yogesh Simmhan, TorqueDB: Distributed Querying of Time-Series Data from Edge-local Storage, International Conference on Parallel and Distributed Computing (Euro-Par), Lecture Notes in Computer Science, vol 12247. Springer, 2020, 10.1007/978-3-030-57675-2_18 (CORE A)
  20. Yogesh Simmhan, Aakash Khochare, and Seshadri K. Ramachandra, Chapter: Computing and storage models for edge computing, Edge Computing: Models, technologies and applications Book, 2020, IET, 10.1049/pbpc033e_ch6


  • IEEE Computer Society Distinguished Contributor, 2021
  • ACM Distinguished Member, 2021 for "Outstanding Scientific Contributions to Computing"
  • IEEE TCSC Award for Excellence in Scalable Computing (Middle Career Researcher), 2020 for contributions to "Big Data Platforms, Programming Models and Dataflow Scheduling on Distributed Systems"
  • Swarna Jayanti Fellowship, 2019-2024. "Scalable Management and Analytics of Temporal Graphs"
  • Best Paper Award, IEEE International Conference on Cloud Computing (CLOUD), 2019. "Adaptive Partition Migration for Irregular Graph Algorithms on Elastic Resources", Dindokar and Simmhan
  • IEEE SCALE Challenge. First Place, 2019. "Dynamic Scaling of Video Analytics for Wide-area Tracking in Urban Spaces", Khochare, et al.
  • EuroPar Distinguished Paper Award, 2018. "VIoLET: A Large-scale Virtual Environment for Internet of Things", Badiger, Baheti and Simmhan
  • IEEE HiPC Best Paper Finalist, 2018. "ARM Wrestling with Big Data: A Study of Commodity ARM64 Server for Big Data Workloads", Jayanth Kalyanasundaram and Yogesh Simmhan
  • IEEE SCALE Challenge. First Place, 2012. "Adaptive Energy Forecasting and Information Diffusion for Smart Power Grids", Simmhan, et al.
  • Microsoft Ship-It Award, 2009. "Microsoft Trident Scientific Workflow Workbench", Barga, et al.
  • IEEE/ACM Supercomputing HPC Storage Challenge. First Place, 2008. "GrayWulf: Scalable Cluster Architecture for Data Intensive Computing", Szalay, et al.

Current Service

Recent Past Service

The primary course I teach is DS256: Scalable Systems for Data Science (3:1), offered in the January semester at the CDS department since 2016. It is a soft-core course for the M.Tech.(CDS) course degree program. The course covers platforms and tools required for developing algorithms, and programming and analyzing Big Data. A major programming project is an essential part of the course, with students working over real-world, large datasets, and using Big Data platforms at scale.

I also teach the DA231: Data Engineering at Scale (3:1) online core course as part of the new M.Tech. in Data Science and Business Analytics (DSBA) program started in Aug, 2021, as part of IISc's push towards online education and degrees targeted at industry professionals. The course trains students in using Big Data platforms to acquire, manage, process and derive insights from large-scale, fast and linked data, while understanding the core distributed systems principles that make these platforms work.

I give lectures on data engineering, Cloud and IoT topics as part of several online certification programs jointly conducted by IISc and TalentSprint, including Computational Data Science, Digital Health and Imaging, and Deep Learning: Foundations and Applications.

Till 2020, I co-taught the DS221: Introduction to Scalable Systems (3:0), jointly with Profs. Sathish Vadhiyar and Matthew Jacob. This is a core course for the M.Tech.(CDS) course degree program. It blends various systems concepts for students with a non-computer science undergraduate major, and introduces architecture, operating systems, data structures, algorithms and programming. It also includes more advanced topics on parallel computing and Big Data platforms.

Earlier, I taught the DS286: Data Structures and Programming (2:1) core course in the Aug semester, sometimes with Prof. Venkatesh Babu. I also co-taught the SE292: High Performance Computing (3:0) core course in the Aug 2014 semester, along with Prof. Govindarajan. Both of these have been discontinued, and their topics absorbed into DS221.

Previously, I offered the SE252: Introduction to Cloud Computing (3:1) as an elective course in the Aug semester. The course covers topics on parallel and distributed computing; IaaS/PaaS/SaaS Clouds; Big Data processing patterns on Clouds; Runtime execution models on Clouds; and Performance evaluation of Cloud applications. Some of these topics are subsumed into DS256.

Current Students

  1. Aakash Khochare Ph.D., CDS (2016 - Present)
  2. Srikrishna Acharya Ph.D., RBCCPS, jointly with Prof. Bharadwaj Amrutur (2017 - Present)
  3. Animesh Baranawal M.Tech.(Research), CDS (2019 - Present)
  4. Bharati Khanijo Ph.D., CDS (2019 - Present)
  5. Prashanthi S.K. Ph.D., CDS, Prime Minister's Research Fellow (PMRF) (2020 - Present)
  6. Suman Raj Ph.D., CDS, Prime Minister's Research Fellow (PMRF) (2020 - Present)
  7. Varad Vinod Kulkarni Ph.D., CDS (2021 - Present)
  8. Akshat Kumar M.Tech., CDS (2021 - Present)
  9. Jeet Ahuja Mukeshkumar M.Tech., CDS (2021 - Present)
  10. Shreeparna Dey M.Tech., CDS, Wells Fargo Fellow (2021 - Present)
  11. Vidushi Dwivedi M.Tech., CDS, Sony India Software Center Fellow, jointly with Prof. Chirag Jain (2021 - Present)

Current Staff

  • Amrita Namtirtha Postdoc Researcher (2020 - Present)
  • Akarsh Chaturvedi Project Staff (2021 - Present)
  • Harshil Gupta Project Staff (2021 - Present)
  • Sai Anuroop Kesanapalli Project Staff (2021 - Present)
  • Tuhin Khare Project Staff (2020 - Present)

Lab Alumni

The last known affiliation of each lab alumnus is listed.
  1. Sunny Anand M.Tech.(CDS), 2021, Uniphore
  2. Swapnil Gandhi M.Tech.(Research), 2020, Microsoft Research
    • CDS Honorable Mention for M.Tech.(Research) Thesis (2020)
  3. Siddharth Jaiswal M.Tech.(Research), 2020, Ph.D. Student, IIT, Kharagpur
  4. Shayal Chhabra M.Tech.(Research), 2020, Microsoft
  5. Shriram Ramesh M.Tech.(CDS), 2020, Wells Fargo
    • IISc Motorola Medal for Best CDS M.Tech.(CDS) Thesis (2020)
  6. Prateeksha Varshney M.Tech.(Research), 2019, Microsoft
    • CDS Honorable Mention for M.Tech.(Research) Thesis (2019)
  7. Shilpa Chaturvedi M.Tech.(Research), 2019, Google
  8. Shrey Baheti M.Tech.(CDS), 2019, Cargill
  9. Nashez Zubair M.Tech.(CDS), 2019, Blaize
  10. Anshu Shukla M.Sc.(Engg.), 2018, Microsoft
    • IISc NetApp Medal for Best CDS M.Sc.(Engg.) Thesis (2019)
  11. Ravikant Dindokar M.Sc.(Engg.), 2018, VMWare
  12. Abhilash Sharma M.Sc.(Engg.), 2018, SkyPoint Cloud
  13. Siva Prakash Reddy Komma M.Tech.(CDS), 2018, Oracle
  14. Rajrup Ghosh M.Tech.(CDS), 2017, Ph.D. Student, USC, Los Angeles
    • IISc Motorola Medal for Best CDS M.Tech.(CDS) Thesis (2017)
  15. Neel Choudhury M.Tech.(CP), 2015, Google
    • IISc Motorola Medal for Best CDS M.Tech.(CP) Thesis (2015)
  16. Tarun Sharma M.Tech.(CP), 2015, Nvidia
  17. Vedsar Kushwaha M.Tech.(CP), 2015, Amazon Web Services

Yogesh has been the recipient of a number of sponsored research grants from agencies of the Government of India, including the Ministry of Electronics and Information Technology (MeitY), Ministry of Education (MOE/MHRD), Department of Science and Technology (DST) and Department of Biotechnology (DBT). He has also received funding from the Indo-US Science and Technology Forum (IUSSTF). He has been an investigator on proposals cumulatively funded for over INR 130 Million (USD 1.75 Million) at IISc. In the past, he has received grants from the US NSF, DARPA and DOE.

He also actively collaborates with industry, and is grateful for the faculty fellowships, unrestricted grants, Corporate Social Responsibility (CSR) awards and Cloud credits received from various corporations such as Microsoft, IBM Research, Facebook, VMWare, Accenture, NetApp ATG, Huawei, AWS and TechMahindra, which have supported his lab's research activities over the years.