M.Tech. in Computational and Data Science

Course structure: (effective from Aug 2017 batch)

  • Hard Core: 13 credits
  • Soft Core: 10 credits minimum (at least three courses)
  • Dissertation: 28 credits
  • Electives: 13 credits (Students may credit CDS electives/soft core or other department courses)

Total: 64 credits


Hard Core Courses (13 credits): All are compulsory

  • DS 221 AUG 3:0 Introduction to Scalable Systems (VSS/YS/MJT) 
  • DS 284 AUG 2:1 Numerical Linear Algebra (MV/SA)
  • DS 288 AUG 3:0 Numerical Methods (SG/PY) 
  • DS 294 JAN 3:0 Data Analysis and Visualization (PY/VB) 
  • DS 200 JAN 0:1 Research Methods (Faculty) – SOFT SKILLS COURSE

Soft Core Courses (10 credits): Minimum three courses out of six below

  • DS 211 JAN 3:0 Numerical Optimization (SA/AM) 
  • DS 222 AUG 3:1 Machine Learning with Large Datasets (PPT) 
  • DS 256 JAN 3:1 Scalable Systems for Data Science (YS) 
  • DS 289 JAN 3:1 Numerical Solution of Differential Equations (AM/SA)
  • DS 290 AUG 3:0 Modelling and Simulation (SR)
  • DS 295 JAN 3:1 Parallel Programming (VSS) 

Dissertation Project: DS 299 0:28 (0:4 Summer; 0:8 AUG; 0:16 JAN)


CDS Electives (all existing electives)

  • DS 250 AUG 3:1 Multigrid Methods (SG)
  • DS 252 JAN 3:1 Cloud Computing (YS)
  • DS 255 JAN 3:1 System Virtualization (JL)
  • DS 260 JAN 3:0 Medical Imaging (PY)
  • DS 262 JAN 3:0 Applied and Computational Photonics (MV)
  • DS 263 AUG 3:1 Video Analytics (RVB)
  • DS 270 JAN 3:1 Constructive Approximation Theory for Computational Scientists (SA)
  • DS 291 JAN 3:1 Finite Elements: Theory and Algorithms (SG)
  • DS 293 AUG 3:1 Topics in Grid Computing (VSS)
  • DS 301 AUG 2:0 Bioinformatics (KS/DP)
  • DS 303 AUG 2:0 Chemoinformatics (DP)
  • DS 305 AUG 3:1 Topics in Web-scale Knowledge Harvesting (PPT)
  • DS 360 JAN 3:0 Topics in Medical Imaging (PY)
  • DS 391 JAN 3:0 Data Assimilation to Dynamical Systems (SR)
  • DS 397 JAN 2:1 Topics in Embedded Computing (SKN)


Course Descriptions

DS 200 (JAN) 0:1 Research Methods

Faculty

This course develops the soft skills required of CDS students. Each student must complete the following modules (each spanning 3 hours): seminar attendance, literature review, technical writing (reading, writing, reviewing), technical presentation, CV/resume preparation, grant writing, intellectual property generation (patenting), incubation/start-up opportunities, and academia/industry job search.

Compulsory for all CDS students; every module must be completed by every student.

DS 211 (JAN) 3:0 Numerical Optimization

Sivaram Ambikasaran and Atanu Mohanty

Numerical properties of modified Newton, quasi-Newton, steepest descent, nonlinear conjugate gradient and trust-region methods for unconstrained optimization; line search methods for all problems; simplex, barrier, penalty, sequential quadratic programming, reduced gradient, augmented Lagrangian and sequential linearly constrained methods; convergence and numerical analysis of algorithms for unconstrained problems; methods for solving matrix problems that are relevant to the efficient solution of KKT systems and to solving the sequence of linear problems that arise in optimization algorithms; matrix factorization updating and the linear conjugate gradient algorithm; numerical optimality conditions for smooth optimization problems.

Pre-requisites: Basic knowledge of Numerical Methods, linear algebra, and/or consent from the advisor

* Numerical Optimization, J. Nocedal and S. Wright, Springer Series in Operations Research and Financial Engineering, 2006.

* Linear Programming with MATLAB, M. Ferris, O. Mangasarian, and S. Wright, MPS-SIAM Series on Optimization, 2007.

* Practical Methods of Optimization, R. Fletcher, 2nd edition, Wiley, 1987.

DS 221 (AUG) 3:0 Introduction to Scalable Systems

Sathish Vadhiyar, Mathew Jacob T., Yogesh Simmhan

Architecture: computer organization, single-core optimizations including exploiting cache hierarchy and vectorization, parallel architectures including multi-core, shared memory, distributed memory and GPU architectures; Algorithms and Data Structures: algorithmic analysis, overview of trees and graphs, algorithmic strategies, concurrent data structures; Parallelization Principles: motivation, challenges, metrics, parallelization steps, data distribution, PRAM model; Parallel Programming Models and Languages: OpenMP, MPI, CUDA; Distributed Computing: Commodity cluster and cloud computing; Distributed Programming: MapReduce/Hadoop model.

Pre-requisites: Basic knowledge of system science and/or consent from the advisor

* Parallel Computer Architecture. A Hardware/Software Approach. David Culler, Jaswinder Pal Singh. Publisher: Morgan Kaufmann. ISBN: 981-4033-103. 1999.

* Parallel Computing. Theory and Practice. Michael J. Quinn. Publisher: Tata McGraw-Hill. ISBN: 0-07-049546-7. 2002.

* Computer Systems – A Programmer’s Perspective. Bryant and O’Hallaron. Publisher: Pearson Education. ISBN: 81-297-0026-3. 2003.

* Data Structures, Algorithms, and Applications in C++, 2nd Edition, Sartaj Sahni

* Introduction to Parallel Computing. Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar. Publisher: Addison Wesley. ISBN: 0-201-64865-2. 2003.

* An Introduction to Parallel Programming. Peter S Pacheco. Publisher: Morgan Kaufmann. ISBN: 978-93-80931-75-3. 2011.

* Online references for OpenMP, MPI, CUDA

* Distributed and Cloud Computing: From Parallel Processing to the Internet of Things, Kai Hwang, Jack Dongarra and Geoffrey Fox, Morgan Kaufmann, 2011

* Data-Intensive Text Processing with MapReduce, Jimmy Lin and Chris Dyer, 2010

DS 222 (AUG) 3:1 Machine Learning with Large Datasets

P P Talukdar

Streaming algorithms and Naive Bayes, fast nearest neighbor, parallel perceptrons, parallel SVM, randomized algorithms, hashing, sketching, scalable SGD, parameter servers, graph-based semi-supervised learning, scalable link analysis, large-scale matrix factorization, speeding up topic modeling, big learning and data platforms, learning with GPUs.

Pre-requisites: Prior exposure to machine learning.

* Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, Jeff Ullman

* Scaling up Machine Learning: Parallel and Distributed Approaches. Ron Bekkerman, Mikhail Bilenko, John Langford

* Foundations of Data Science. Avrim Blum, John Hopcroft, Ravi Kannan

* Research literature

DS 250 (AUG) 3:1 Multigrid Methods

Sashikumaar Ganesan

Classical iterative methods, convergence of classical iterative methods, Richardson iteration method, Krylov subspace methods: Generalized minimal residual (GMRES), Conjugate Gradient (CG), Bi-CG method. Geometric Multigrid Method: Grid transfer, Prolongation and restriction operators, two-level method, Convergence of coarse grid approximation, Smoothing analysis. Multigrid Cycles: V-cycle, W-cycle, F-cycle, convergence of multigrid cycles, remarks on computational complexity. Algebraic Multigrid Method: Hierarchy of levels, Algebraic smoother, Coarsening, Interpolation, remarks on parallel implementation.

Pre-requisites: Good knowledge of Linear Algebra and/or consent from the instructor.

* Pieter Wesseling, An Introduction to Multigrid Methods, R.T. Edwards, Inc., 2004.

* William L. Briggs, Van Emden Henson and Steve F. McCormick, A Multigrid Tutorial, SIAM, 2nd edition, 2000.

DS 252 (AUG) 3:1 Cloud Computing

Yogesh Simmhan

Context: Shared/distributed memory computing; Data/task parallel computing; Role of Cloud computing.

Technology: Cloud Virtualization, Elastic computing; Infrastructure/Platform/Software as a Service (IaaS/PaaS/SaaS); Public/Private Clouds; Service oriented architectures; Mobile, Edge and Fog computing; Multi-clouds.

Application Design Patterns: Workflow and dataflow; Batch, transactional and continuous; Scaling, locality and speedup; Cloud, Mobile and Internet of Things (IoT) applications.

Execution Models: Synchronous/asynchronous patterns; Scale up/Scale out; Data marshalling/unmarshalling; Load balancing; stateful/stateless applications; Performance metrics; Consistency, Availability and Partition tolerance (CAP theorem).

Programming project using public Cloud infrastructure, e.g. Amazon AWS or Microsoft Azure; Cloud resources will be provided.

Pre-requisites: Data Structures, Programming and Algorithm concepts. Programming experience.

* Distributed and Cloud Computing: From Parallel Processing to the Internet of Things, Kai Hwang, Jack Dongarra and Geoffrey Fox, Morgan Kaufmann, 2011

* Current literature.

DS 255 (JAN) 3:1 System Virtualization

J. Lakshmi

Virtualization as a construct for resource sharing; Re-emergence of virtualization and its importance for Cloud computing; System abstraction layers and modes of virtualization; Mechanisms for system virtualization – binary translation, emulation, para-virtualization and hardware virtualization; Virtualization using HAL layer – Exposing physical hardware through HAL (example of x86 architecture) from an OS perspective; System bootup process; Virtual Machine Monitor; Processor virtualization; Memory Virtualization; NIC virtualization; Disk virtualization; Graphics card virtualization; OS-level virtualization and the container model; OS resource abstractions and virtualization constructs (Linux Docker example); Virtualization using APIs – JVM example.

Pre-requisites: Basic course on operating systems and consent of the instructor.

* J. Smith, R. Nair, Virtual Machines: Versatile Platforms for Systems and Processes, Morgan Kaufmann, 2005.

* D. Bovet, M. Cesati, Understanding the Linux Kernel, Third Edition, O’Reilly, 2005.

* Wolfgang Mauerer, Linux Kernel Architecture, Wiley India, 2012.

* D. Chisnall, The Definitive Guide to the Xen Hypervisor, Prentice Hall, 2007

* R. Bryant, D. O’Hallaron, Computer Systems: A Programmer’s Perspective (2nd Edition), Addison Wesley, 2010

* Current literature.

DS 256 (JAN) 3:1 Scalable Systems for Data Science

Yogesh Simmhan

Design of distributed program models and abstractions, such as MapReduce, Dataflow and Vertex-centric models, for processing volume, velocity and linked datasets, and for storing and querying over NoSQL datasets.

Approaches and design patterns to translate existing data-intensive algorithms and analytics into these distributed programming abstractions.

Distributed software architectures, runtime and storage strategies used by Big Data platforms such as Apache Hadoop, Spark, Storm, Giraph and Hive to execute applications developed using these models on commodity clusters and Clouds in a scalable manner.

This course has a hands-on project where students will work with real, large datasets and commodity clusters, and use scalable algorithms and platforms to develop a Big Data application.

Pre-requisites: Data Structures, Programming and Algorithm concepts with strong programming experience, and DS 221 (or) DS 222 (or) DS 252 (or) consent from the Instructor

* Data-Intensive Text Processing with MapReduce, Jimmy Lin and Chris Dyer, 2010

* Mining of Massive Datasets, Jure Leskovec, Anand Rajaraman and Jeff Ullman, 2nd Edition (v2.1), 2014.

* Current literature

DS 260 (JAN) 3:0 Medical Imaging

Phaneendra K Yalavarthy

X-ray Physics, interaction of radiation with matter, X-ray production, X-ray tubes, dose, exposure, screen-film radiography, digital radiography, X-ray mammography, X-ray Computed Tomography (CT). Basic principles of CT, single and multi-slice CT. Tomographic image reconstruction, filtering, image quality, contrast resolution, CT artifacts. Magnetic Resonance Imaging (MRI): brief history, MRI major components. Nuclear Magnetic Resonance: basics, localization of MR signal, gradient selection, encoding of MR signal, T1 and T2 relaxation, k-space filling, MR artifacts. Ultrasound basics, interaction of ultrasound with matter, generation and detection of ultrasound, resolution. Doppler ultrasound, nuclear medicine (PET/SPECT), multi-modal imaging, PET/CT, SPECT/CT, oncological imaging, medical image processing and analysis, image fusion, contouring, segmentation, and registration.

Pre-requisites: Basic knowledge of system theory and consent from the instructor.

* Bushberg, J.T., Seibert, J.A., Leidholdt, E.M. Jr., and Boone, J.M., The Essential Physics of Medical Imaging, Second Edn, Lippincott Williams and Wilkins Publishers, Philadelphia, 2002.

* Wolbarst, A.B., Physics of Radiology, Second Edn, Medical Physics Publishing, Madison, WI, 2005.

* Current Literature

DS 263 (AUG) 3:1 Video Analytics

R. Venkatesh Babu

Introduction to Digital Image and Video Processing, Background Modeling, Object Detection and Recognition, Image and Motion Features, Multi Object Tracking, Trajectory Analysis, Activities and Events, Anomaly Detection, Compressed Domain Video Analytics, Multi Camera Surveillance, Camera Coordination, Video Indexing, Mining and Retrieval. Deep learning for Vision and Image Processing: CNN, RNN; Vision and Language: Image captioning, Visual Q & A.

Pre-requisites: Basic knowledge of Image Processing and machine learning.

* Richard Szeliski, Computer Vision: Algorithms and Applications, Springer, 2010.

* Forsyth, D.A., and Ponce, J., Computer Vision: A Modern Approach, Pearson Education, 2003.

* Current Literature

DS 270 (JAN) 3:1 Constructive Approximation Theory for Computational Scientists

Sivaram Ambikasaran

Approximation by Algebraic Polynomials, Weierstrass Theorem, Muentz-Szasz theorem, Orthogonal polynomials, Optimal interpolation nodes, Optimal interpolant, Lebesgue constants, Fourier series, Gibbs phenomenon, Potential theory and approximation, Spectral methods, Clenshaw-Curtis and Gaussian quadrature, fast low rank construction of kernel matrices and other fast matrix computations.

Pre-requisites: Elementary real analysis and Linear algebra.

* Interpolation and Approximation by Polynomials by George M. Phillips, Publisher: Springer, ISBN-13: 978-0387002156

* Approximation Theory and Approximation Practice by Lloyd N. Trefethen, Publisher: Society for Industrial and Applied Mathematics (SIAM), ISBN-13: 978-1611972399

DS 284 (AUG) 2:1 Numerical Linear Algebra

Sivaram Ambikasaran and Murugesan Venkatapathi

Matrix and vector norms, floating point arithmetic, forward and backward stability of algorithms, conditioning of a problem, perturbation analysis, algorithmic efficiency, Structured matrices, Solving linear systems, Gaussian elimination, LU factorization, Pivoting, Cholesky decomposition, Iterative refinement, QR factorization, Gram-Schmidt orthogonalization, Projections, Householder reflectors, Givens rotation, Singular Value Decomposition, Rank and matrix approximations, image compression using SVD, Least squares and least norm solution of linear systems, pseudoinverse, normal equations, Eigenvalue problems, Gershgorin theorem, Similarity transform, Eigenvalue & eigenvector computations and sensitivity, Power method, Schur decomposition, Jordan canonical form, QR iteration with & without shifts, Hessenberg transformation, Rayleigh quotient, Symmetric eigenvalue problem, Jacobi method, Divide and Conquer, Computing the Singular Value Decomposition, Golub-Kahan-Reinsch algorithm, Chan SVD algorithm, Generalized SVD, Generalized and Quadratic eigenvalue problems, generalized Schur decomposition (QZ decomposition), Iterative methods for large linear systems: Jacobi, Gauss-Seidel and SOR, convergence of iterative algorithms, Krylov subspace methods: Lanczos, Arnoldi, MINRES, GMRES, Conjugate Gradient and QMR, Pre-conditioners, Approximating eigenvalues and eigenvectors.

Pre-requisites: Basic knowledge of multivariate calculus and elementary real analysis

* Biswa Nath Datta, Numerical Linear Algebra and Applications, 2nd Edition, 2004

* Lloyd N. Trefethen and David Bau, III, Numerical Linear Algebra, SIAM, 1997.

* C. G. Cullen, An Introduction to Numerical Linear Algebra, PWS Publishing, 1994.

* David C. Lay, Linear Algebra and its Applications, Pearson, 2013.

* Golub, G.H., and Van Loan, C.F., Matrix Computations, Johns Hopkins University Press, 1996.

* Saad, Y., Iterative Methods for Sparse Linear Systems, Second Edition, SIAM, 2003

DS 288 (AUG) 3:0 Numerical Methods

Phaneendra K Yalavarthy and Sashikumaar Ganesan

Root finding: Functions and polynomials, zeros of a function, roots of a nonlinear equation, bracketing, bisection, secant, and Newton-Raphson methods. Interpolation, splines, polynomial fits, Chebyshev approximation. Numerical Integration and Differentiation: Evaluation of integrals, elementary analytical methods, trapezoidal and Simpson’s rules, Romberg integration, Gaussian quadrature and orthogonal polynomials, multidimensional integrals, summation of series, Euler-Maclaurin summation formula, numerical differentiation and estimation of errors. Optimization: Extremization of functions, simple search, Nelder-Mead simplex method, Powell’s method, gradient-based methods, simulated annealing. Complex analysis: Complex numbers, functions of a complex variable, analytic functions, conformal mapping, Cauchy’s theorem. Calculus of residues. Fourier and Laplace Transforms, Discrete Fourier Transform, z transform, Fast Fourier Transform (FFT), multidimensional FFT, basics of numerical optimization.

Pre-requisites: Basic knowledge of multivariate calculus and elementary real analysis

* Richard L. Burden and J. Douglas Faires, Numerical Analysis: Theory and Applications, India Edition, Cengage Brooks-Cole Publishers, 2010.

* Press, W.H., Teukolsky, S.A., Vetterling, W.T., and Flannery, B.P., Numerical Recipes in C/FORTRAN, Prentice Hall of India, New Delhi, 1994.

* Borse, G.J., Numerical Methods with MATLAB: A Resource for Scientists and Engineers, PWS Publishing Co., Boston, 1997.

DS 289 (JAN) 3:1 Numerical Solution of Differential Equations

A Mohanty and Sivaram Ambikasaran

Ordinary differential equations: Lipschitz condition, solutions in closed form, power series method. Numerical methods: error analysis, stability and convergence, Euler and Runge-Kutta methods, multistep methods, Adams-Bashforth and Adams-Moulton methods, Gear’s open and closed methods, predictor-corrector methods. Sturm-Liouville problem: eigenvalue problems, special functions, Legendre, Bessel and Hermite functions. Partial differential equations: classification, elliptic, parabolic and hyperbolic PDEs, Dirichlet, Neumann and mixed boundary value problems, separation of variables, Green’s functions for inhomogeneous problems. Numerical solution of PDEs: relaxation methods for elliptic PDEs, Crank-Nicolson method for parabolic PDEs, Lax-Wendroff method for hyperbolic PDEs. Calculus of variations and variational techniques for PDEs, integral equations. Finite element method and finite difference time domain method, method of weighted residuals, weak and Galerkin forms, ordinary and weighted/general least squares. Fitting models to data, parameter estimation using PDEs.

Pre-requisites: Basic course on numerical methods and consent of the instructor.

* Arfken, G.B., and Weber, H.J., Mathematical Methods for Physicists, Sixth Edition, Academic Press, 2005.

* Press, W.H., Teukolsky, S.A., Vetterling, W.T., and Flannery, B.P., Numerical Recipes in C/FORTRAN – The art of Scientific Computing, Second Edn, Cambridge University Press, 1998.

* Lynch, D.R., Numerical Partial Differential Equations for Environmental Scientists and Engineers – A First Practical Course, Springer, New York, 2005.

DS 290 (AUG) 3:0 Modelling and Simulation

S Raha

Statistical description of data, data-fitting methods, regression analysis, analysis of variance, goodness of fit. Probability and random processes, discrete and continuous distributions, Central Limit theorem, measure of randomness, Monte Carlo methods. Stochastic Processes and Markov Chains, Time Series Models. Modelling and simulation concepts. Discrete-event simulation: Event scheduling/Time advance algorithms, verification and validation of simulation models. Continuous Simulation: Modelling with and Simulation of Stochastic Differential Equations.

Pre-requisites: Basic course on numerical methods and consent of the instructor.

* Banks, J., Carson, J.S., and Nelson, B., Discrete-Event System Simulation, Second Edn, Prentice Hall of India, 1996.

* Francois E. Cellier, Ernesto Kofman, Continuous System Simulation, Springer, 2006, ISBN: 0387261028.

* Peter E. Kloeden, Eckhard Platen, Numerical Solution of Stochastic Differential Equations, Springer-Verlag, 1999.

* Peter E. Kloeden, Eckhard Platen, Henri Schurz, Numerical Solution of SDE through Computer Experiments, Springer-Verlag, 1994.

DS 291 (JAN) 3:1 Finite Elements: Theory and Algorithms

Sashikumaar Ganesan

Generalized (weak) derivatives, Sobolev norms and associated spaces, inner-product spaces, Hilbert spaces, construction of finite element spaces, mapped finite elements, two- and three-dimensional finite elements, interpolation and discretization error, variational formulation of second order elliptic boundary value problems, finite element algorithms and implementation for linear elasticity, Mindlin-Reissner plate problem, systems in fluid mechanics.

Pre-requisites: Good knowledge of numerical analysis along with basic programming background and/or consent from the instructor.

* Sashikumaar Ganesan, Lutz Tobiska, Finite Elements: Theory and Algorithms, Cambridge-IISc Series, Cambridge University Press, 2017.

* Dietrich Braess, Finite Elements: Theory, Fast Solvers, and Applications in Solid Mechanics, Cambridge University Press, 3rd ed., 2007.

* Susanne C. Brenner, Ridgway Scott, The Mathematical Theory of Finite Element Methods, Springer-Verlag, 3rd ed., 2008.

* Current literature

DS 294 (JAN) 3:0 Data Analysis and Visualization

Phaneendra K Yalavarthy and Venkatesh Babu

Data pre-processing, data representation, data reconstruction, machine learning for data processing, convolutional neural networks, visualization pipeline, isosurfaces, volume rendering, vector field visualization, applications to biological and medical data, OpenGL, visualization toolkit, linear models, principal components, clustering, multidimensional scaling, information visualization.

Pre-requisites: Basic knowledge of numerical methods and consent from instructor

* Hansen, C.D., and Johnson, C.R., Visualization Handbook, Academic Press, 2004.

* Ware, C., Information Visualization: Perception for Design, Morgan Kaufmann, Second Edn, 2004.

* Current literature

DS 295 (JAN) 3:1 Parallel Programming

Sathish Vadhiyar

Parallel Algorithms: MPI collective communication algorithms including prefix computations, sorting, graph algorithms, GPU algorithms; Parallel Matrix computations: dense and sparse linear algebra, GPU matrix computations; Algorithm models: Divide-and-conquer, Mesh-based communications, BSP model; Advanced Parallel Programming Models and Languages: advanced MPI including MPI-2 and MPI-3, advanced concepts in CUDA programming; Scientific Applications: sample applications include molecular dynamics, evolutionary studies, N-Body simulations, adaptive mesh refinements, bioinformatics; System Software: sample topics include scheduling, mapping, performance modeling, fault tolerance.

Pre-requisites: Introduction to Scalable Systems course, or familiarity with the slides that will be provided on introduction to parallel computing, OpenMP, MPI and CUDA.

* Parallel Computing. Theory and Practice. Michael J. Quinn. Publisher: Tata McGraw-Hill. ISBN: 0-07-049546-7. 2002.

* Introduction to Parallel Computing. Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar. Publisher: Addison Wesley. ISBN: 0-201-64865-2. 2003.

* An Introduction to Parallel Programming. Peter S Pacheco. Publisher: Morgan Kaufmann. ISBN: 978-93-80931-75-3. 2011.

* Online references for OpenMP, MPI, CUDA

* Literature: relevant conference and journal papers.

DS 299 0:28 Dissertation Project

This includes the analysis, design of hardware/software, construction of an apparatus/instruments, and testing and evaluation of its performance. The project work is usually based on a scientific/engineering problem of current interest. Every student has to complete the work in the specified period and should submit the Project Report for final evaluation. Students will be evaluated at the end of the first-year summer for 4 credits. The term-wise split of credits is as follows: 0:4 Summer, 0:8 AUG, 0:16 JAN.

DS 301 (AUG) 2:0 Bioinformatics

K Sekar

Biological Databases: Organisation, searching and retrieval of information, accessing global bioinformatics resources using internet links. Introduction to Unix operating system and network communication. Nucleic acids sequence assembly, restriction mapping, finding simple sites and transcriptional signals, coding region identification, RNA secondary structure prediction. Similarity and Homology, dot matrix methods, dynamic programming methods, scoring systems, multiple sequence alignments, evolutionary relationships, genome analysis. Protein physical properties, structural properties – secondary structure prediction, hydrophobicity patterns, detection of motifs, structural database (PDB). Genome databases, Cambridge Structural Database, data mining tools and techniques, Structural Bioinformatics. Topics from the current literature will be discussed.

Hands-on experience will be provided.

* Gribskov, M., and Devereux, J. (Eds), Sequence Analysis Primer, Stockton Press, 1991.

* Mount, D.W., Bioinformatics: Sequence and Genome Analysis, Cold Spring Harbor Laboratory Press, 2001.

* Baxevanis, A.D., and Ouellette, B.F.F. (Eds), Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Wiley-Interscience, 1998.

DS 303 (AUG) 2:0 Chemoinformatics

Debnath Pal

Exploring current chemoinformatics resources for synthetic polymers, pigments, pesticides, herbicides, diagnostic markers, biodegradable materials, biomimetics. Primary, secondary and tertiary sources of chemical information. Database search methods: chemical indexing, proximity searching, 2D and 3D structure and substructure searching. Introduction to quantum methods, combinatorial chemistry (library design, synthesis and deconvolution), spectroscopic methods and analytical techniques. Analysis and use of chemical reaction information, chemical property information, spectroscopic information, analytical chemistry information, chemical safety information. Representing intermolecular forces: ab initio potentials, statistical potentials, forcefields, molecular mechanics. Monte Carlo methods, simulated annealing, molecular dynamics. High throughput synthesis of molecules and automated analysis of NMR spectra. Predicting reactivity of biologically important molecules, combining screening and structure ‘SAR by NMR’. Computer storage of chemical information, data formats, OLE, XML, web design and delivery.

* Current Scientific Literature and Web lectures: Lectures posted online.

* Maizell, R.E., How to Find Chemical Information: A Guide for Practicing Chemists, Educators, and Students, John Wiley and Sons, 1998. ISBN 0-471-12579-2.

* Gasteiger, J., and Engel, T., Chemoinformatics. A Textbook, Wiley-VCH, 2003. ISBN: 3-527-30681-1

DS 305 (AUG) 3:1 Topics in Web-scale Knowledge Harvesting

P P Talukdar

Entity extraction, entity normalization, entity categorization, relation extraction, distant supervision, curriculum learning, knowledge base (KB) inference, open information extraction (OpenIE), temporal inference, ontology evolution, bootstrapped learning, learning from limited supervision in KBs, scalable learning and inference over large datasets for KB construction, recent KB construction systems, multilingual knowledge acquisition, knowledge acquisition from multiple modalities, representation learning for knowledge harvesting.

Pre-requisites: Basic knowledge of machine learning and/or natural language processing will be helpful although not mandatory.

* Current literature.

DS 360 (JAN) 3:0 Topics in Medical Imaging

Phaneendra K Yalavarthy

Three-dimensional Medical Image Processing, Medical Image reconstruction using high performance computing, General Purpose Graphics Processing Units (GP-GPU) computing for Medical Image processing, reconstruction, and Analysis, Computer Aided Detection (CAD) systems – Algorithms, Analysis, Medical Image Registration: rigid and non-rigid registration, Volume based image analysis, Medical Image Enhancement: Deblurring techniques, Four-dimensional Medical Imaging, Molecular Imaging, Diffuse Optical Tomography, and Medical Image Informatics.

Pre-requisites: DS 260 or E9 241 or consent from the Instructor.

* Current Literature

DS 391 (JAN) 3:0 Data Assimilation to Dynamical Systems

S. Raha

Quick introduction to nonlinear dynamics: bifurcations, unstable manifolds and attractors, Lyapunov exponents, sensitivity to initial conditions and concept of predictability. Markov chains, evolution of probabilities (Fokker-Planck equation), state estimation problems. An introduction to the problem of data assimilation (with examples). Bayesian viewpoint, discrete and continuous time cases. Kalman filter (linear estimation theory). Least squares formulation (possibly PDE examples). Nonlinear Filtering: Particle filtering and MCMC sampling methods. Introduction to Advanced topics (as and when time permits): Parameter estimation, Relations to control theory, Relations to synchronization.

Pre-requisites: Consent from the Instructor.

* Edward Ott, Chaos in Dynamical Systems, Cambridge University Press, 2nd Edition, 2002 (or one of the many excellent books on dynamical systems).

* Van Leeuwen, Peter Jan, Cheng, Yuan, Reich, Sebastian, Nonlinear Data Assimilation, Springer Verlag, July 2015.

* Sebastian Reich, Colin Cotter, Probabilistic Forecasting and Bayesian Data Assimilation, Cambridge University Press, August 2015

* Law, Kody, and Stuart, Andrew, and Zygalakis, Konstantinos, Data Assimilation, A Mathematical Introduction, Springer Texts in Applied Mathematics, September 2015.

DS 397 (JAN) 2:1 Topics in Embedded Computing

S K Nandy

Introduction to embedded processing, dataflow architectures, architecture of embedded SoC platforms, dataflow process networks, compiling techniques/optimizations for stream processing, architecture of runtime reconfigurable SoC platforms, simulation, design space exploration and synthesis of applications on runtime reconfigurable SoC platforms, additional topics including but not limited to computation models for coarse grain reconfigurable architectures (CGRA), readings and case study of REDEFINE architecture, compiler back-ends for CGRAs.

Pre-requisites: Basic knowledge of digital electronics, computer organization and design, computer architecture, data structures and algorithms, and consent of instructor.

* Current literature.