[ODE] Looking for developers for committing patches to UNSTABLE

Tanguy Fautre tanguy.fautre at spaceapplications.com
Thu Mar 24 17:34:36 MST 2005


Hi,


Rodrigo Hernandez wrote:
> 
> 
> I've been thinking about adding some SIMD to the multiplication routines 
> for x86 cpus,
> I have not started any work on that area, however I'd gladly commit the 
> changes
> (my Idea was to submit a patch once done) when I am done.

I think it ought to be done if it really provides a performance boost.

But be aware that a lot of compilers have an option to use SSE/SSE2 
instructions and registers instead of the FPU for regular floating point 
computations.

So the modifications would have to provide something more, i.e. really 
exploit the parallelism of SIMD, not just the extra registers.
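To make the distinction concrete, here is a minimal sketch in C. It is 
not ODE code: scale_scalar, scale_packed and SKETCH_HAVE_SSE are 
hypothetical names, and I use compiler intrinsics rather than assembly 
only to keep it short. The first routine is what a compiler set to "use 
SSE for floating point" would typically still treat as one multiply at a 
time (unless it auto-vectorizes); the second does four multiplies per 
instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: these macros are how the compiler signals SSE support. */
#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

/* Scalar reference: one multiply per loop iteration. */
static void scale_scalar(float *dst, const float *src, float k, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * k;
}

/* Packed version: four multiplies per instruction where SSE exists. */
static void scale_packed(float *dst, const float *src, float k, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && src != NULL);
#if SKETCH_HAVE_SSE
    const __m128 kk = _mm_set1_ps(k);          /* k in all four lanes */
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(src + i);      /* unaligned load of 4 floats */
        _mm_storeu_ps(dst + i, _mm_mul_ps(v, kk));
    }
#endif
    for (; i < n; ++i)                         /* tail, or whole loop without SSE */
        dst[i] = src[i] * k;
}
```

Both routines should give identical results; only the packed one has a 
chance of beating what the compiler already does on its own, and that 
would still have to be measured.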

> my only concern is the SIMD detecting routines, a new function 
> ("dDetectSIMD"?) would
> be needed in order to find which SIMD set the CPU supports 
> (MMX,SSE,SSE2,3DNow) and
> then redirect the function calls to the properly optimized routine.
> 
> Of course, To keep back compatibility, if the function is never called, 
> the library defaults to non
> SIMD optimized routines.
> 

Runtime detection is one possibility, although you could also use the 
user configuration file to select this at compile time.
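A rough sketch of the runtime variant, in C, might look like the 
following. Everything here is hypothetical: dDetectSIMD_sketch stands in 
for the proposed "dDetectSIMD", SKETCH_NO_SIMD stands in for a 
compile-time switch from the user configuration, and mul_sse is only a 
placeholder with a plain C body. It also assumes a GCC-compatible 
compiler recent enough to provide __builtin_cpu_supports; other 
compilers would need their own CPUID query.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*mul_fn)(float *dst, const float *a, const float *b, size_t n);

/* Plain C routine: the default if detection is never called. */
static void mul_plain(float *dst, const float *a, const float *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] * b[i];
}

/* Placeholder for an SSE routine; a real one would use SIMD code. */
static void mul_sse(float *dst, const float *a, const float *b, size_t n)
{
    mul_plain(dst, a, b, n);
}

enum { SIMD_NONE = 0, SIMD_SSE = 1, SIMD_SSE2 = 2 };

/* Calls go through this pointer, so old code keeps the plain path. */
static mul_fn g_mul = mul_plain;

/* Hypothetical detection entry point; returns the detected flag set. */
int dDetectSIMD_sketch(void)
{
    int flags = SIMD_NONE;
#if !defined(SKETCH_NO_SIMD) && defined(__GNUC__) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse"))
        flags |= SIMD_SSE;
    if (__builtin_cpu_supports("sse2"))
        flags |= SIMD_SSE2;
#endif
    if (flags & SIMD_SSE)
        g_mul = mul_sse;    /* redirect to the optimized routine */
    return flags;
}
```

Building with -DSKETCH_NO_SIMD removes the detection entirely, which is 
roughly what a configuration-file switch would amount to.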

But the real problem is that you must provide both a single and a double 
precision path to follow the current ODE philosophy. If ODE is compiled 
with single precision you have to use SSE, while for double precision 
you have to use SSE2 (SSE only handles packed single precision floats). 
So, ideally, you have to write the same routines twice :-(.
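As an illustration of the duplication, here is a minimal C sketch with 
one routine written twice. sk_real and sk_mul are hypothetical names; 
the dSINGLE macro mirrors ODE's precision switch, but treating "not 
dSINGLE" as double is an assumption made only for this sketch. 
Intrinsics are used instead of assembly to keep it short.

```c
#include <assert.h>
#include <stddef.h>

#if defined(dSINGLE)
typedef float sk_real;      /* single precision build */
#else
typedef double sk_real;     /* assumed default for this sketch */
#endif

#if defined(dSINGLE) && defined(__SSE__)
#include <xmmintrin.h>
/* SSE path: 4 floats per instruction. */
static void sk_mul(sk_real *dst, const sk_real *a, const sk_real *b, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && a != NULL && b != NULL);
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(dst + i,
                      _mm_mul_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    for (; i < n; ++i)
        dst[i] = a[i] * b[i];
}
#elif !defined(dSINGLE) && defined(__SSE2__)
#include <emmintrin.h>
/* SSE2 path: 2 doubles per instruction. */
static void sk_mul(sk_real *dst, const sk_real *a, const sk_real *b, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && a != NULL && b != NULL);
    for (; i + 2 <= n; i += 2)
        _mm_storeu_pd(dst + i,
                      _mm_mul_pd(_mm_loadu_pd(a + i), _mm_loadu_pd(b + i)));
    for (; i < n; ++i)
        dst[i] = a[i] * b[i];
}
#else
/* Portable fallback when the matching instruction set is unavailable. */
static void sk_mul(sk_real *dst, const sk_real *a, const sk_real *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] * b[i];
}
#endif
```

The two SIMD bodies differ in lane count, load/store and multiply 
instructions, so they cannot easily share code.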


Another possibility is to target only x86-64, which features 64-bit 
registers and always includes SSE2. That way you could use ODE's 
configure.c to detect that the platform is an x86-64 (a patch for this 
has already been posted) and then compile the assembly routines you 
wrote only in that case.
That way you neither need to detect at runtime whether the CPU supports 
SSE/SSE2, nor do you need the user to explicitly request SSE at 
compile time.
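For what it is worth, the same decision can also be sketched with the 
preprocessor instead of configure.c. This is only an illustration: 
SK_TARGET_X86_64 and sk_add2 are hypothetical names, and it assumes the 
compiler defines __x86_64__ (GCC-compatible) or _M_X64 (MSVC) on that 
platform. Intrinsics again stand in for the assembly routines.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(_M_X64)
#define SK_TARGET_X86_64 1
#include <emmintrin.h>      /* SSE2 is part of the x86-64 baseline */
#else
#define SK_TARGET_X86_64 0
#endif

/* Adds two pairs of doubles: dst[0..1] = a[0..1] + b[0..1]. */
static void sk_add2(double *dst, const double *a, const double *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
#if SK_TARGET_X86_64
    /* No runtime check and no user switch: SSE2 is always present. */
    _mm_storeu_pd(dst, _mm_add_pd(_mm_loadu_pd(a), _mm_loadu_pd(b)));
#else
    /* Every other platform keeps the plain C code. */
    dst[0] = a[0] + b[0];
    dst[1] = a[1] + b[1];
#endif
}
```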

This last option may seem extreme (it leaves all x86 CPUs unable to use 
the SSE routines), but providing SSE routines for x86 would in turn keep 
all x86-64 builds from using them, since x86 and x86-64 assembly are not 
compatible.


Cheers,

Tanguy


