DS221: Introduction to Scalable Systems

Department of Computational and Data Sciences

Introduction to Scalable Systems

  • Instructor: Sathish Vadhiyar (www | email) and Yogesh Simmhan (www | email)
  • Teaching Assistant: TBD
  • Course number: DS221
  • Credits: 3:1
  • Semester: Aug, 2023
  • Lecture: Tue/Thu 11:30 AM-1 PM, in-person CDS 202 (First class on Aug 1, 2023)
  • Lab: As announced
  • Microsoft Teams Link: Link (Use IISc Office 365 account to login)
  • Survey form for Students (Please fill this after the first class, by Aug 2, 2023)

[DS221 2022 Website] [DS221 2020 Website] [DS221 2019 Website] [DS221 2018 Website][DS221 2017 Website]


This course covers computer systems topics that are essential for students engaging in computational and data sciences. It introduces topics on architecture, OS and data structures that may be new to students without a degree in Computer Science. Then it moves to more advanced topics on tree/graph data structures, HPC/GPGPU programming and Big Data platforms.

Some of the topics covered are:

  • Architecture: computer organization, single-core optimizations including exploiting cache hierarchy and vectorization, parallel architectures including multi-core, shared memory, distributed memory and GPU architectures;
  • Algorithms and Data Structures: Overview of matrices, trees, queues, graphs, concurrent data structures; Using GitHub Copilot/Prompt Engineering; Performance and scalability analysis; Algorithmic complexity analysis and algorithmic strategies;
  • Parallelization Principles: motivation, challenges, metrics, parallelization steps, data distribution, PRAM model;
  • Parallel Programming Models and Languages: OpenMP, MPI, CUDA;
  • Distributed Computing: Commodity cluster and cloud computing; Distributed Programming: Apache Spark programming model for Big Data processing.

This course is a precursor to more advanced courses like DS 295: Parallel Programming, and DS 256: Scalable Systems for Data Science offered in the January term.


This is an introductory crash-course on computer systems, algorithms, HPC and Big Data platforms. So the pre-requisites are minimal: Basic knowledge of computer systems, data structures and programming, and algorithms. However, the course will have a rapid pace and students are expected to pick up the skills rapidly through self-learning.

Grading Scheme

  • Architecture
    • 1 Quiz (10 points)
  • Algorithms and Data Structures
    • One Programming Assignment (20 points)
  • Parallel Programming
    • Written Assignment – 15 points
    • Programming Assignment – 15 points
  • Final Exam (40 points)
    • 20 (Algos, Data Structures and Big Data) + 20 (Parallel Programming)
  • Sessionals (50 points)
    • Architecture quiz – 10 points, Half of Algorithms and Data Structures programming assignment – 10 points, Parallel programming written assignment – 15 points, Parallel programming programming assignment – 15 points
  • Terminal (50 points)
    • Half of Algorithms and Data Structures programming assignment – 10 points, Final exam – 40 points


  • Data Structures, Algorithms, and Applications in C++, 2nd Edition, Sartaj Sahni
  • Parallel Computing Architecture. A Hardware/Software Approach. David Culler, Jaswant Singh. Publisher: Morgan Kauffman. ISBN: 981-4033-103. 1999.
  • Parallel Computing. Theory and Practice. Michael J. Quinn. Publisher: Tata: McGraw-Hill. ISBN: 0-07-049546-7. 2002.
  • Computer Systems – A Programmer’s Perspective. Bryant and O’Hallaron. Publisher: Pearson Education. ISBN: 81-297-0026-3. 2003.
  • Introduction to Parallel Computing. Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar. Publisher: Addison Wesley. ISBN: 0-201-64865-2. 2003.
  • An Introduction to Parallel Programming. Peter S Pacheco. Publisher: Morgan Kauffman. ISBN: 978-93-80931-75-3. 2011.
  • Online references for OpenMP, MPI, CUDA
  • Lecture slides

Tentative Schedule

  • Architecture, VSS (4 lectures, Aug 1-10)
    • Computer organization including cache hierarchy, locality
    • Quiz (10 points)
  • Algorithms and Data Structures, YS (8 lectures, Aug 17- Sept 12)
    • Algorithmic analysis and complexity
    • Basic Data Structures: Arrays, Stacks, Queues, Trees
    • Prompt Engineering using Copilot to design code
    • Performance Analysis and Scaling studies
    • Advanced Data Structures: Hashmap, Search trees, Graphs
    • Algorithmic Strategies
    • Programming Assignments (20 points)
  • Introduction to Big Data Platforms, YS (3 lectures, Sept 14 – Sep 21)
    • Big Data programming models: Apache Spark
    • One Programming Assignment (10 points)
  • Parallel Computing, VSS (10 lectures, Sep 26 – Nov 26)
    • Parallel architectures pdf
    • Shared and Distributed memory architectures
    • Many-core architectures pdf
    • Parallelization Principles pdf
    • Parallel Programming Models and Languages
      • Shared memory: OpenMP pdf
    • Distributed Memory: MPI – pt2pt, collectives pdf
    • GPUs: CUDA pdf
    • Sample Parallel Algorithms
      • Sorting, Searching pdf
    • Parallel Gaussian Elimination pdf
    • An ideal parallelism model – PRAM pdf
      • Reading: Book “Parallel Computing: Theory and Practice” by Michael J Quinn, Available with me. Pages 25-32, 40-42, 256
    • Prefix computations pdf
    • Assignment and Continuous Assessments (10+20 points)
  • Final Exam (30 points), December 7 Forenoon, 2023

NOTE: Besides these lectures, there will be several tutorials as well. The schedule for these will be announced in class.


  • Parallel architecture – Grama et al. – 2.4, Many-core – Google for NVIDIA A100 white paper
  • Parallelization principles – Grama et al. – 3.1, 3.5, 5.1-5.6; Culler and Singh – 2.2, 2.3
  • OpenMP tutorial: http://www.llnl.gov/computing/tutorials/openMP
  • MPI online tutorial: “MPI complete reference”. Google for it.
  • CUDA: Google for NVIDIA CUDA programming guide.
  • PRAM: Reading: Book “Parallel Computing: Theory and Practice” by Michael J Quinn. Pages 25-32, 40-42, 256
  • Parallel algorithms
    • Parallel Quick sort – the corresponding portion in the text book by Grama et al.
    • BFS
      • Paper: A Scalable Distributed Parallel Breadth-First Search Algorithm on BlueGene/L. Yoo et al. SC 2005. (Pages 1-7)
      • Paper: Accelerating large graph algorithms on the GPU using CUDA. Harish and Narayanan. HiPC 2007. (Page 5-8)
  • For the rest, read the lecture slides.


Teaching cluster notes on OpenMP and MPI for the assignment

Academic integrity

Any assignment or exam you complete in this course is expected to be your own work. You should attempt homework questions and programming assignments on your own, but are allowed to verbally discuss concepts with your peers. However, the solutions must be written in your own words. Direct use of material, ideas, figures, code or data authored by someone else without appropriate acknowledgement is not allowed.
Please note that there will be no tolerance of academic dishonesty. If someone is academically dishonest in this course, they will receive a grade of F, and the misconduct will be brought to the notice of the department Chair. See https://www.iisc.ac.in/about/student-corner/academic-integrity/ for more details.