Do you Assembly?

MULPS — Packed Single-Precision Floating-Point Multiply

I have started looking at assembly to see how Single-Instruction Multiple-Data (SIMD) works, and ended up with PPC’s AltiVec and Intel’s SSE. I have managed to optimize some C++ source code using SIMD techniques.
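To give a feel for what MULPS does, here is a minimal sketch using the SSE intrinsic `_mm_mul_ps`, which compilers typically turn into a MULPS instruction. The helper name `multiply4` and the sample values are my own, made up for illustration, and the scalar fallback is only there so the snippet still builds on a machine without SSE.

```cpp
#include <array>
#include <cstddef>

// Use the SSE header only when the compiler targets a CPU with SSE.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // __m128, _mm_loadu_ps, _mm_mul_ps, _mm_storeu_ps
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

// Hypothetical helper: multiplies two groups of four floats lane by lane.
std::array<float, 4> multiply4(const std::array<float, 4>& a,
                               const std::array<float, 4>& b) {
    std::array<float, 4> out{};
#if HAVE_SSE
    __m128 va = _mm_loadu_ps(a.data());  // load 4 floats (no alignment required)
    __m128 vb = _mm_loadu_ps(b.data());
    __m128 vr = _mm_mul_ps(va, vb);      // one packed multiply for all 4 lanes
    _mm_storeu_ps(out.data(), vr);       // store the 4 products
#else
    // Scalar fallback: the same result, one lane at a time.
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = a[i] * b[i];
    }
#endif
    return out;
}
```

Calling `multiply4` with {1, 2, 3, 4} and {2, 2, 0.5, -1} multiplies each pair of lanes independently, which is the whole point of SIMD: four multiplications for the price of one instruction.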

I will look into SSE4.1 next, because I am expecting to receive a new E8400 CPU, which supports it.