[ODE] Looking for developers for committing patches to UNSTABLE

Rodrigo Hernandez kwizatz at aeongames.com
Thu Mar 24 11:23:41 MST 2005


At 10:34 AM 24/03/2005, Tanguy Fautre wrote:

>I think it ought to be done if it really provides a performance boost.
>
>But be aware that a lot of compilers have the option of using SSE/SSE2 
>instructions and registers instead of the FPU for regular floating-point 
>computations.
>
>So the modifications would have to provide something more; i.e. really 
>using the parallelism of SIMD, not just the extra registers.

Indeed. I already have a matrix multiplication routine, and I noticed that 
the boost comes from doing many multiplications per frame; doing only one 
actually reduces performance because of all the preparation that needs to 
be done before calling the actual SSE2 operation. In any case, we will find 
out once there is some code in place.
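
A minimal sketch of the batching idea, assuming row-major 4x4 
single-precision matrices and plain SSE intrinsics (not the SSE2 routine 
mentioned above, which is not shown in this thread). The names 
sketch_mat4_mul and sketch_mat4_mul_batch are hypothetical and not part of 
ODE. The point is only that one call covers many products, so the per-call 
preparation is paid once per batch; whether that is actually faster has to 
be measured.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics (single precision) */
#endif

/* Hypothetical helper, not an ODE function: out = a * b for one row-major
   4x4 float matrix. out must not overlap a or b. */
static void sketch_mat4_mul(const float *a, const float *b, float *out)
{
#if defined(__SSE__)
    /* Load the four rows of b once; unaligned loads keep the sketch safe
       for arbitrary pointers. */
    __m128 b0 = _mm_loadu_ps(b + 0);
    __m128 b1 = _mm_loadu_ps(b + 4);
    __m128 b2 = _mm_loadu_ps(b + 8);
    __m128 b3 = _mm_loadu_ps(b + 12);
    for (int i = 0; i < 4; ++i) {
        /* Row i of the result is a linear combination of the rows of b,
           weighted by the entries of row i of a. */
        __m128 r = _mm_mul_ps(_mm_set1_ps(a[4 * i + 0]), b0);
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a[4 * i + 1]), b1));
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a[4 * i + 2]), b2));
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a[4 * i + 3]), b3));
        _mm_storeu_ps(out + 4 * i, r);
    }
#else
    /* Plain C fallback when the compiler is not targeting SSE. */
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            float s = 0.0f;
            for (int k = 0; k < 4; ++k)
                s += a[4 * i + k] * b[4 * k + j];
            out[4 * i + j] = s;
        }
    }
#endif
}

/* Multiply `count` matrix pairs stored back to back (16 floats each). */
void sketch_mat4_mul_batch(const float *a, const float *b, float *out,
                           size_t count)
{
    for (size_t n = 0; n < count; ++n)
        sketch_mat4_mul(a + 16 * n, b + 16 * n, out + 16 * n);
}
```

A caller would collect the frame's matrix pairs into contiguous arrays and 
call sketch_mat4_mul_batch once, instead of calling a single-product 
routine once per pair.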

>>My only concern is the SIMD detection routines. A new function 
>>("dDetectSIMD"?) would be needed in order to find which SIMD sets the 
>>CPU supports (MMX, SSE, SSE2, 3DNow) and then redirect the function 
>>calls to the properly optimized routine.
>>Of course, to keep backward compatibility, if the function is never 
>>called, the library defaults to non-SIMD-optimized routines.
>
>Runtime detection is one possibility, although you can also use the user 
>configuration file to set this at compilation time.
>
>But the real problem is that you must have a single and a double precision 
>path, to follow current ODE philosophy. Ideally, if ODE is compiled with 
>single precision you have to use SSE, while for double precision you have 
>to use SSE2. So, ideally, you have to write the same routines twice :-(.

Well, I consider the runtime approach better because the end user is 
usually not the one compiling the library. To allow the developer to reach 
a wide audience, detection should be done at runtime. This has the drawback 
that you lose the speed increase from inline functions :( but you may get a 
boost from naked function calls.

I have already written runtime CPU and SIMD detection routines for my 
engine, so that job is already done.
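
For illustration only (this is not the detection code from my engine), a 
rough sketch of runtime detection plus redirection through a function 
pointer. It assumes GCC or Clang on x86, where the __builtin_cpu_supports 
builtin is available; on other compilers the sketch reports nothing and 
stays on the plain C routine. All sketch_ names and the dot-product example 
are hypothetical. 3DNow detection is left out.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical capability flags, not ODE constants. */
enum { SKETCH_SIMD_MMX = 1, SKETCH_SIMD_SSE = 2, SKETCH_SIMD_SSE2 = 4 };

/* Default routine: plain C, works on every CPU. */
static float dot_scalar(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

#if defined(__SSE__)
/* SSE routine: four products per step, then a scalar tail.
   A real library would build this in a separate file compiled with SSE
   enabled, so the rest of the code stays safe on pre-SSE CPUs. */
static float dot_sse(const float *a, const float *b, size_t n)
{
    __m128 acc = _mm_setzero_ps();
    float lanes[4];
    float sum;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_loadu_ps(a + i),
                                         _mm_loadu_ps(b + i)));
    _mm_storeu_ps(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
#endif

/* Starts at the scalar routine, so a caller that never runs detection
   still gets correct (non-SIMD) behaviour. */
static float (*dot_impl)(const float *, const float *, size_t) = dot_scalar;

/* Detect SIMD support and redirect dot_impl; returns the flags found. */
unsigned sketch_detect_simd(void)
{
    unsigned caps = 0;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("mmx"))  caps |= SKETCH_SIMD_MMX;
    if (__builtin_cpu_supports("sse"))  caps |= SKETCH_SIMD_SSE;
    if (__builtin_cpu_supports("sse2")) caps |= SKETCH_SIMD_SSE2;
#endif
#if defined(__SSE__)
    if (caps & SKETCH_SIMD_SSE)
        dot_impl = dot_sse;
#endif
    return caps;
}

/* Public entry point: one indirect call, so no inlining across it. */
float sketch_dot(const float *a, const float *b, size_t n)
{
    return dot_impl(a, b, n);
}
```

The indirect call through dot_impl is exactly where the inlining is lost, 
which is why this style pays off mostly for routines that do a fair amount 
of work per call.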


>Another possibility is to target only x86-64, which features 64-bit 
>registers and SSE2. That way you could use ODE's configure.c to detect 
>that the platform is an x86-64 (patch already posted before) and then 
>compile the assembly routines you wrote in that case.
>That way you do not need to detect at runtime if the CPU supports SSE/SSE2 
>nor do you need the user explicitly stating he wants SSE at compilation.
>
>This last option may seem extreme (leaving all x86 CPUs unable to use the 
>SSE routines), but in fact providing SSE for x86 will keep all x86-64 from 
>using it (x86 and x86-64 assembly not being compatible).

Initially there would be a lot of "Place your CPU asm here" holes. I don't 
own an x86-64 processor yet, nor do I own a Mac, so I think the 
infrastructure of the code should allow extensibility to other 
architectures, falling back to a default if no optimized code is in. In 
other words, expect a lot of #ifndef's :).
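
Something along these lines, as a rough sketch only: each architecture gets 
a slot that may define a "have it" macro, and the default C routine is 
compiled only if no slot did. The macro SKETCH_HAVE_OPT_SCALE and the 
function sketch_scale are made-up names for illustration, and the x86-64 
slot uses intrinsics rather than assembly to keep the example short.

```c
#include <stddef.h>

/* x86-64 slot: SSE and SSE2 are part of the baseline on this platform. */
#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAVE_OPT_SCALE 1
void sketch_scale(float *v, size_t n, float s)
{
    __m128 vs = _mm_set1_ps(s);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)   /* four floats per step */
        _mm_storeu_ps(v + i, _mm_mul_ps(_mm_loadu_ps(v + i), vs));
    for (; i < n; ++i)           /* leftover elements */
        v[i] *= s;
}
#endif

/* PowerPC slot: "Place your CPU asm here". Nothing optimized yet, so it
   does not define SKETCH_HAVE_OPT_SCALE and the default below is used. */
#if defined(__ppc__) || defined(__powerpc__)
#endif

/* Default: compiled only when no slot above supplied an optimized
   version. */
#ifndef SKETCH_HAVE_OPT_SCALE
void sketch_scale(float *v, size_t n, float s)
{
    for (size_t i = 0; i < n; ++i)
        v[i] *= s;
}
#endif
```

Callers just use sketch_scale(v, n, s) and never see which version was 
built.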

Cheers!


>Cheers,
>
>Tanguy
>
>_______________________________________________
>ODE mailing list
>ODE at q12.org
>http://q12.org/mailman/listinfo/ode

Rodrigo Hernandez, lonewolf programmer
Aeon Games
http://www.aeongames.com 


