The matmul folder contains three files: cpu_matmul.cpp, MatMul.cu, and tiled_matmul.cu.
As the names suggest, cpu_matmul.cpp is the CPU implementation of matrix multiplication, MatMul.cu is a simple CUDA kernel for matrix multiplication, and tiled_matmul.cu is a tiled version of MatMul.cu.
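To make the difference between the simple and the tiled approach concrete, here is a minimal CPU-side sketch of both loop structures. It is not the code from cpu_matmul.cpp or the .cu files: the function names matmul_naive and matmul_tiled, the row-major square layout, and the tile parameter are all assumptions made for illustration. On the GPU, the tiled kernel would stage each tile in shared memory; this sketch only mirrors the blocking of the index space.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: naive C = A * B for row-major n x n matrices.
std::vector<float> matmul_naive(const std::vector<float>& A,
                                const std::vector<float>& B,
                                std::size_t n) {
    assert(A.size() == n * n && B.size() == n * n);
    std::vector<float> C(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                acc += A[i * n + k] * B[k * n + j];
            }
            C[i * n + j] = acc;
        }
    }
    return C;
}

// Hypothetical helper: the same product computed tile by tile.
// Each (i0, j0) output tile accumulates partial products over k-tiles,
// which is the loop structure a tiled GPU kernel follows.
std::vector<float> matmul_tiled(const std::vector<float>& A,
                                const std::vector<float>& B,
                                std::size_t n, std::size_t tile) {
    assert(tile > 0 && A.size() == n * n && B.size() == n * n);
    std::vector<float> C(n * n, 0.0f);
    for (std::size_t i0 = 0; i0 < n; i0 += tile) {
        for (std::size_t j0 = 0; j0 < n; j0 += tile) {
            for (std::size_t k0 = 0; k0 < n; k0 += tile) {
                // Clamp tile bounds so n need not be a multiple of tile.
                const std::size_t i1 = std::min(i0 + tile, n);
                const std::size_t j1 = std::min(j0 + tile, n);
                const std::size_t k1 = std::min(k0 + tile, n);
                for (std::size_t i = i0; i < i1; ++i) {
                    for (std::size_t j = j0; j < j1; ++j) {
                        float acc = C[i * n + j];
                        for (std::size_t k = k0; k < k1; ++k) {
                            acc += A[i * n + k] * B[k * n + j];
                        }
                        C[i * n + j] = acc;
                    }
                }
            }
        }
    }
    return C;
}
```

Both functions should return the same matrix for the same inputs, so matmul_naive can serve as a reference when checking a tiled implementation.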
+-----------------+--------------+
| File            | Runtime (us) |
+-----------------+--------------+
| cpu_matmul.cpp  | 2700         |
| MatMul.cu       | 10.30        |
| tiled_matmul.cu | 9.18         |
+-----------------+--------------+
The runtime difference between MatMul.cu and tiled_matmul.cu (10.30 us vs. 9.18 us, roughly a 1.12x speedup) is much smaller than what is theoretically expected.
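One way to quantify the theoretical expectation is to count global-memory loads. The sketch below assumes square n x n matrices, a naive kernel in which every thread reads a full row of A and a full column of B, and a tiled kernel with square tiles whose width divides n. The function names and the example values of n and tile are hypothetical; the actual matrix size and tile width used for the measurements are not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Naive kernel: n * n threads, each loading n elements of A and n of B.
std::uint64_t naive_global_loads(std::uint64_t n) {
    return 2 * n * n * n;
}

// Tiled kernel: each thread loads one element of A and one of B per tile
// step, and there are n / tile steps (tile is assumed to divide n).
std::uint64_t tiled_global_loads(std::uint64_t n, std::uint64_t tile) {
    assert(tile > 0 && n % tile == 0);
    return 2 * n * n * (n / tile);
}

// Speedup observed from two measured runtimes given in the same unit.
double measured_speedup(double naive_us, double tiled_us) {
    return naive_us / tiled_us;
}
```

Under these assumptions, naive_global_loads(n) / tiled_global_loads(n, tile) equals tile, so a 16-wide tile would cut global-memory traffic by a factor of 16, while measured_speedup(10.30, 9.18) is only about 1.12. The traffic ratio translates into runtime only if the kernel is memory-bound; at runtimes near 10 us, fixed costs such as kernel launch overhead may account for a large share of the measurement, which could explain part of the gap.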