This is my work on the assignments in this repository.
I used intrinsics to work with SSE registers instead of writing a whole function in assembly. See: RWTHmoodle: Aufgabenstellung SSE-Instruktionen
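To illustrate what working with SSE registers through intrinsics looks like, here is a minimal sketch in C. It is not the assignment code: the function `add_floats_sse` and its element-wise addition are hypothetical, chosen only to show the load / operate / store pattern. It assumes an x86 target with SSE support.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, ... */

/* Hypothetical helper (not from the assignment): out[i] = a[i] + b[i]. */
void add_floats_sse(const float *a, const float *b, float *out, size_t n)
{
    assert(a && b && out);

    size_t i = 0;

    /* Main loop: one 128-bit SSE register holds four floats. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);            /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb)); /* add and store 4 results */
    }

    /* Scalar tail for the remaining n % 4 elements. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

The compiler handles register allocation and the calling convention, which is the main convenience over writing the whole function in assembly.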
I also compared the speedup of optimized (-O2) binaries to make the comparison more representative of real-world builds. See: RWTHmoodle: Compileroptimierungsstufen
I also compared the hand-optimized versions to versions parallelized with OpenMP. The source is in mkroening/gi4_uebung07/impl07-openmp.
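For comparison, an OpenMP variant of a simple loop might look like the sketch below. This is an assumption for illustration only, not the code in impl07-openmp: the function `add_floats_omp` is hypothetical and uses an element-wise addition as a stand-in workload.

```c
#include <assert.h>

/* Hypothetical helper (not from the repository): out[i] = a[i] + b[i]. */
void add_floats_omp(const float *a, const float *b, float *out, long n)
{
    assert(a && b && out);

    /* Split the loop iterations across the available threads.
       Each iteration writes a distinct out[i], so no synchronization is needed. */
    #pragma omp parallel for
    for (long i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

With GCC or Clang the pragma only takes effect when compiling with `-fopenmp`; without that flag it is ignored and the loop runs serially.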