===========================================
DS221: INTRODUCTION TO SCALABLE SYSTEMS
===========================================
POSTED: 18 SEP, 2018
DUE DATE: 7 OCT, 2018, 11:59PM
POINTS: 150
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1) 2D Matrices [75 points]

a) You will define a matrix interface (mat2d.h) with the following member functions.
-- void init(int N, int M); This initializes an array that will contain N rows and M columns. The values in the initial matrix are assumed to be 0. This method must be called exactly once when creating the matrix, and subsequent calls to this method will should return a "logic_error" exception.
-- void append(float[] row_vals); If append is called for the first time, it sets the values of row 0 to the values passed in row_vals. Else, it sets the values of the next row of the matrix since the last call to append to these values. row_vals must have exact M items in it, and you cannot append more then N rows to the matrix. If either of these are violated, you should throw an "out_of_range" exception.
-- float get(int i, int j); This returns the value present in row i and column j of the matrix, indexed from 0. You should throw an "out_of_range" exception if the row and column values are outside the valid range of rows and columns.
 
b) You will develop two different implementations for this matrix interface: using a regular 2-D array of size NxM (mat_arr.cpp), and a Compressed Sparse Row (CSR) representation (mat_csr.cpp). [15 points]

c) Define a function to load a matrix from an input Tab Separated Value (TSV) file containing N+1 lines. The first line contains two integers separated by a tab, that have the number of rows N and number of columns M in the matrix. The remaining N lines represent the numbers in each row of the matrix, and each line has M numbers separated by tabs that represent the values present in the respective row and columns. E.g., the following is a valid input matrix file for a 3x2 matrix. 

3	2
10.91	-5.25
21.81	0.0
32.71	39.21

Analyze the space complexity, and measure the actual space taken by different runs of this code for different matrix sizes in powers of 2 until you run out of memory. Do the analysis and the experiments match? Discuss. [15 points]

d) Define a function to add two matrices using the given interface, and both implementations. Analyze the time complexity when both input matrices and the output matrix are arrays, and when all three are CSR. Compare the analytical model with experiments of different matrix sizes and discuss the results. [15 points]

e) Define a function to multiply two matrices using the given interface, and both implementations. Analyze their time complexity and experimental results for different matrix sizes. What is the largest matrix size that you can support using each implementation? Discuss. [15 points]

f) Load the adjacency list for a real-world road network [1] into the sparse matrix implementation. Assume the edges have unit weights. Perform a BFS traversal from a given source vertex and store the traversal order and depth in an STL list implementation as List<[depth, List<vid>]>. Print the contents of the list at the end as "depth, vid1, vid2, ...", with each vertex at a depth present in sorted order. What is the time and space complexity for the traversal? Compare this with experimental results. What is the largest graph you can load? Discuss. [15 points]

[1] http://www.dis.uniroma1.it/challenge9/download.shtml