Write a sequential program that multiplies a lower triangular matrix, L, by a vector, X, to produce a vector, Y, i.e., LX = Y. Parallelize the loop(s) using OpenMP. Use both the static and the dynamic scheduling options of the OpenMP for construct.
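One possible shape for the three kernels is sketched below. It is only an illustration under stated assumptions: the matrix is assumed to be stored densely in row-major order, and the function names (ltmv_seq, ltmv_static, ltmv_dynamic) and the chunk parameter are hypothetical, not part of the assignment. Because row i needs i + 1 multiply-adds, the work per iteration of the outer loop grows with i, which is why the two scheduling options may behave differently.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential reference: y = L * x. Only entries on or below the
   diagonal (j <= i) are read, so row i costs i + 1 multiply-adds.
   Assumption: L is a dense n x n array in row-major order. */
void ltmv_seq(int n, const double *L, const double *x, double *y)
{
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int j = 0; j <= i; j++)
            sum += L[(size_t)i * n + j] * x[j];
        y[i] = sum;
    }
}

/* Static schedule: rows are divided among the threads up front.
   Each row is written by exactly one thread, so no locking is needed. */
void ltmv_static(int n, const double *L, const double *x, double *y)
{
    assert(n > 0);
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int j = 0; j <= i; j++)
            sum += L[(size_t)i * n + j] * x[j];
        y[i] = sum;
    }
}

/* Dynamic schedule: threads fetch `chunk` rows at a time as they
   become idle. `chunk` is a hypothetical tuning parameter. */
void ltmv_dynamic(int n, const double *L, const double *x, double *y,
                  int chunk)
{
    assert(n > 0 && chunk > 0);
    #pragma omp parallel for schedule(dynamic, chunk)
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int j = 0; j <= i; j++)
            sum += L[(size_t)i * n + j] * x[j];
        y[i] = sum;
    }
}
```

With GCC or Clang the pragmas take effect only when the program is compiled with -fopenmp; without that flag the same source builds as a purely sequential program, which can be handy for checking that all three functions produce identical results.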
Use a lower triangular matrix of size 3000x3000 and a vector X of size 3000, initializing both with random values. Execute your OpenMP program with 2, 4, 8, 16, and 32 threads, and report execution times and speedups. Report the execution times in a table, and report the speedups as a graph with the number of threads on the x-axis and the speedup over the sequential program on the y-axis. For each experiment with a fixed number of threads, and for the sequential program, run the program 5 times and take the average execution time across these 5 runs. Report times and speedups for both the static and the dynamic scheduling options.
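The measurement procedure above could be supported by a few helpers like the ones sketched here. This is a minimal sketch, not a prescribed harness: the names fill_random_lower, wall_seconds, mean_time, speedup, and the kernel_fn signature are all hypothetical, and speedup is taken to mean the average sequential time divided by the average parallel time. The thread count itself can be set outside the code with the standard OMP_NUM_THREADS environment variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical kernel signature: computes y = L * x for an n x n matrix. */
typedef void (*kernel_fn)(int n, const double *L, const double *x, double *y);

/* Fill the lower triangle of L (row-major, n x n) and the vector x with
   pseudo-random values in [0, 1]; entries above the diagonal are zero. */
void fill_random_lower(int n, double *L, double *x, unsigned seed)
{
    srand(seed);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++)
            L[(size_t)i * n + j] =
                (j <= i) ? (double)rand() / RAND_MAX : 0.0;
        x[i] = (double)rand() / RAND_MAX;
    }
}

/* Wall-clock seconds when built with OpenMP. The fallback uses clock(),
   which measures CPU time and is only meaningful for sequential builds. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Run the kernel `runs` times and return the average time per run. */
double mean_time(kernel_fn kernel, int n, const double *L,
                 const double *x, double *y, int runs)
{
    assert(runs > 0);
    double total = 0.0;
    for (int r = 0; r < runs; r++) {
        double start = wall_seconds();
        kernel(n, L, x, y);
        total += wall_seconds() - start;
    }
    return total / runs;
}

/* Speedup of a parallel run relative to the sequential baseline. */
double speedup(double t_sequential, double t_parallel)
{
    return t_sequential / t_parallel;
}
```

For the experiment described above, n would be 3000 and runs would be 5, with L allocated on the heap (a 3000x3000 array of doubles is about 72 MB, too large for a typical stack).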
Write a report discussing the results and your observations.