Also, this is just the beginning of adding CPU-specific SSE optimizations. Before going further, we need to add code to configure.in that checks whether the architecture supports SSE and sets defines accordingly in config.h.
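
To make the idea concrete, here is a rough sketch of how C code might consume such a define once configure writes it to config.h. The macro name HAVE_SSE and the helper add_floats are assumptions made up for illustration, not names taken from the existing tree. The scalar loop doubles as the fallback when SSE is not available.

```c
#include <stddef.h>

/* Standard autoconf convention: pull in config.h only if the build provides it. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

/* HAVE_SSE is a hypothetical macro that the configure check would define. */
#ifdef HAVE_SSE
#include <xmmintrin.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for i in [0, n). */
static void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;

#ifdef HAVE_SSE
    /* SSE path: process four floats per iteration using unaligned loads. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
#endif

    /* Scalar path: handles the tail, or everything when HAVE_SSE is unset. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Note that a compile-time define only says the compiler and target can build SSE code; if the binary has to run on CPUs without SSE, a runtime check would still be needed on top of this.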