=========================================== DS221: INTRODUCTION TO SCALABLE SYSTEMS =========================================== POSTED: 3 OCT, 2017 DUE DATE: 13 OCT, 2017, 11:l59PM POINTS: 100 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1) You will implement three different approaches for storing integer numbers in a data structure and searching for a given item in it. Implement all you C++ code within the file names provided, and with the following command-line parameters as: search_XYZ input_file search_count search_item_1 search_item_2 ... where "search_XYZ" corresponds to one of the 3 executable files from below; the "input_file" is the name of the file containing integers in the input, one entry per line; "search_count" is the number of items you are searching for; and "search_item_n ..." is the list of items you are searching for, one at a time, over the inputs. Print one line of output for each searched item, of the form "match,search_item_n,true|false". e.g. if the input file numbers10e5.txt did not contain 5 and 264, but contained the other two, your command-line input and output using BST would be: search_bst numbers10e5.txt 3 264 9857 159672 5 match,264,false match,9857,true match,159672,true match,5,false Use additional header files as necessary. A sample template is provided for 1(c) below. Besides the source code, you will prepare plots and discussion as part of a report file, named username.pdf. You will measure the search performance of the three approaches for 5 different inputs with 10e5, 10e6 and 10e7 numbers in each (input files given separately), and perform searches for 100 items in each. You should measure the time taken for each search, performed one at a time, and plot the average time over all the 100 searches. Report one data point for each input file. Also report the peak memory usage for each data structure as observed through top. Discuss the expected time complexity and the observed search time. Points are separately given for the correctness of the code and the outputs, and for the analysis of the results. (a) [10 points] Store the input items in a Linked List and search for each given item over them using a linear scan. You may use existing linked list implementations in the STL. [File: search_llist.cpp] (b) [20 points] Sort the items using STL's Quick Sort (qsort) function and store them in an array. Use binary search over this sorted array to locate the given item. [File: search_bsarray.cpp] (c) [30 points] Implement the insert and search functions of a Binary Search Tree as given in BST.h. Then perform binary search over the BST to locate the given item. [File: search_bst.cpp] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2) Implement a Graph data structure as an adjacency list, with the list holding an array of neighboring vertices. Each vertex will have an integer as vertex ID, while the edge will have a source and sink vertex ID, along with an integer weight. Load the graph from an input CSV file where each row adjacency list that contains "source_vertex_id,sink_vertex_1_id,weight1,sink_vertex_2_id,weight2,...". Assume that the graph is undirected, and the input file contains entries from source to sink and sink to source. e.g., a sample file for directed graph in slide 12 of the lecture is: 1,2,3,4,4,5,2 2,1,2,3,6 3,2,6,4,1 4,1,4,3,1,5,1 5,1,2,3,5,4,1 (a) [20 points] For a given source vertex ID, perform a Breadth First Search (BFS) traversal and print the ID of vertices in the order that are visited. You may use queue data structures available in STL. For each visit, you should print the vertex ID visited, and its edge-count distance from the source. e.g. the result of performing a BFS from vertex 1 in the above sample is: 1,0 2,1 5,1 4,1 3,2 (b) [20 points] For a given source vertex ID, perform Djikstra's Single Source Shortest Path (SSSP) to every other vertex. For each vertex, you should print the vertex ID and its edge-weight distance from the source. You may use priority queue data structures available in STL. e.g. the result of performing a SSSP from vertex 1 in the above sample is: 1,0 5,2 4,3 2,3 3,4 Save you C++ code under the filename graphs.cpp with the command-line parameter as: graphs input.txt [bfs | sssp] source_vertex_id =========================================== SUBMISSION INSTRUCTIONS 1) Your code MUST compile and execute on the turing head node. Use a Makefile with "clean" and "all" as targets to allow all the code to be compiled. 2) Code that fails the compilation or execution will get 0 points. We will test with alternate inputs as well, besides the ones provided as part of the assignment. 2) tar and gzip your Makefile, source files, executables, and username.pdf into a single file called username-a2.tar.gz, where "username" is your turing cluster user name. Follow the naming conventions given above. 3) Email it to simmhan@iisc.ac.in before 11:59PM on Oct 13, 2017. The subject line should be "ds221-a2-username". Only a single submission will be accepted, and late submissions will not be accepted.