[ODE] Looking for developers for committing patches to UNSTABLE
Tanguy Fautre
tanguy.fautre at spaceapplications.com
Thu Mar 24 17:34:36 MST 2005
Hi,
Rodrigo Hernandez wrote:
>
>
> I've been thinking about adding some SIMD to the multiplication routines
> for x86 CPUs. I have not started any work in that area; however, I'd
> gladly commit the changes when I am done (my idea was to submit a patch
> once done).
I think it ought to be done if it really provides a performance boost.
But be aware that many compilers have an option to use SSE/SSE2
instructions and registers instead of the FPU for regular floating-point
computations.
So the modifications would have to provide something more, i.e. really
use the parallelism of SIMD, not just the extra registers.
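To illustrate the difference, here is a rough sketch, not ODE code. The function names are made up for this example, and it assumes a compiler that ships <xmmintrin.h>. The first routine is what a compiler emits scalar SSE instructions for (e.g. with GCC's -mfpmath=sse): one float per operation. The second processes four floats per instruction, which is where a hand-written routine could actually gain something.

```c
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Scalar loop: even if the compiler uses SSE registers here,
   each multiply-add still handles a single float. */
static void scale_add_scalar(float *out, const float *a, const float *b,
                             float s, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = a[i] + s * b[i];
}

/* Packed loop: four floats per instruction when SSE is available. */
static void scale_add_packed(float *out, const float *a, const float *b,
                             float s, int n)
{
    int i = 0;
#if HAVE_SSE
    __m128 vs = _mm_set1_ps(s);               /* broadcast s to 4 lanes */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);      /* unaligned loads */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, _mm_mul_ps(vs, vb)));
    }
#endif
    for (; i < n; i++)                        /* remainder, or no SSE at all */
        out[i] = a[i] + s * b[i];
}
```

Both routines compute the same result; whether the packed one is faster in practice would have to be measured against what the compiler already produces.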
> My only concern is the SIMD detection routines: a new function
> ("dDetectSIMD"?) would be needed to find which SIMD sets the CPU
> supports (MMX, SSE, SSE2, 3DNow) and then redirect the function calls
> to the properly optimized routine.
>
> Of course, to keep backward compatibility, if the function is never
> called, the library defaults to the non-SIMD routines.
>
Runtime detection is one possibility, although you can also use the user
configuration file to set this at compilation time.
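A minimal sketch of the runtime option, assuming GCC or Clang on x86 (the __builtin_cpu_supports builtin is theirs, not ODE's). dDetectSIMD is the name proposed above and does not exist in ODE; mul_plain, mul_sse and g_mul are placeholders invented for this example.

```c
#include <stddef.h>

typedef void (*mul_fn)(float *out, const float *a, const float *b, size_t n);

/* Portable fallback: element-wise product. */
static void mul_plain(float *out, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * b[i];
}

/* Stand-in for an SSE routine; a real one would use packed instructions. */
static void mul_sse(float *out, const float *a, const float *b, size_t n)
{
    mul_plain(out, a, b, n);
}

/* Starts on the plain routine, so code that never calls
   dDetectSIMD() behaves exactly as before. */
static mul_fn g_mul = mul_plain;

/* Hypothetical: returns 1 if the SSE path was selected, 0 otherwise. */
static int dDetectSIMD(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse")) {
        g_mul = mul_sse;
        return 1;
    }
#endif
    return 0;   /* unknown compiler or CPU: keep the fallback */
}
```

Callers would go through g_mul rather than calling a routine directly. The compile-time variant would replace the function pointer with an #ifdef on a macro set in the user configuration file.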
But the real problem is that you must provide both a single- and a
double-precision path to follow the current ODE philosophy. If ODE is
compiled with single precision you have to use SSE, while for double
precision you have to use SSE2. So, ideally, you have to write the same
routines twice :-(.
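Roughly, the two paths would look like the sketch below. It assumes the build defines dSINGLE or dDOUBLE as ODE's configuration does, and defaults to dSINGLE only so the sketch compiles on its own. mul_block, mul_real and SIMD_LANES are names invented for this example, and intrinsics are used instead of assembly to keep it short.

```c
/* Assumption: the build defines dSINGLE or dDOUBLE. */
#if !defined(dSINGLE) && !defined(dDOUBLE)
#define dSINGLE 1
#endif

#if defined(dSINGLE)
typedef float dReal;
#else
typedef double dReal;
#endif

#if defined(dSINGLE) && defined(__SSE__)
#include <xmmintrin.h>
#define SIMD_LANES 4                      /* SSE: 4 floats per register */
static void mul_block(dReal *out, const dReal *a, const dReal *b)
{
    _mm_storeu_ps(out, _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
#elif defined(dDOUBLE) && defined(__SSE2__)
#include <emmintrin.h>
#define SIMD_LANES 2                      /* SSE2: 2 doubles per register */
static void mul_block(dReal *out, const dReal *a, const dReal *b)
{
    _mm_storeu_pd(out, _mm_mul_pd(_mm_loadu_pd(a), _mm_loadu_pd(b)));
}
#else
#define SIMD_LANES 1                      /* no SIMD: plain C */
static void mul_block(dReal *out, const dReal *a, const dReal *b)
{
    out[0] = a[0] * b[0];
}
#endif

/* Element-wise product over n values, SIMD_LANES at a time. */
static void mul_real(dReal *out, const dReal *a, const dReal *b, int n)
{
    int i = 0;
    for (; i + SIMD_LANES <= n; i += SIMD_LANES)
        mul_block(out + i, a + i, b + i);
    for (; i < n; i++)                    /* leftover elements */
        out[i] = a[i] * b[i];
}
```

Only mul_block differs between the two precisions here, but for anything less trivial than an element-wise product the lane count changes the whole loop structure, hence the duplicated work.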
Another possibility is to target only x86-64, which features 64-bit
registers and SSE2. That way you could use ODE's configure.c to detect
that the platform is x86-64 (a patch for this was already posted) and
then compile the assembly routines you wrote in that case.
That way you do not need to detect at runtime whether the CPU supports
SSE/SSE2, nor do you need the user to explicitly request SSE at
compile time.
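As a sketch of that idea, the check below uses the compiler's predefined macros rather than configure.c, and an SSE2 intrinsic rather than hand-written assembly; both are substitutions made for brevity. TARGET_X86_64 and add2 are names invented for this example.

```c
/* __x86_64__ is predefined by GCC/Clang, _M_X64 by MSVC. */
#if defined(__x86_64__) || defined(_M_X64)
#define TARGET_X86_64 1
#include <emmintrin.h>     /* SSE2 is part of the x86-64 baseline */
#else
#define TARGET_X86_64 0
#endif

/* Adds two pairs of doubles; no runtime check and no user option needed. */
static void add2(double *out, const double *a, const double *b)
{
#if TARGET_X86_64
    _mm_storeu_pd(out, _mm_add_pd(_mm_loadu_pd(a), _mm_loadu_pd(b)));
#else
    out[0] = a[0] + b[0];  /* every other platform: plain C */
    out[1] = a[1] + b[1];
#endif
}
```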
This last option may seem extreme (leaving all x86 CPUs unable to use the
SSE routines), but in fact providing SSE for x86 would keep all x86-64
CPUs from using it (x86 and x86-64 assembly not being compatible).
Cheers,
Tanguy