[ODE] Some SSE in Quick step

Pierre Terdiman pierre.terdiman at novodex.com
Fri May 28 15:06:40 MST 2004


I'm not sure all of this is really worth it.

Usually you have to optimize your algorithms first. Then, once it's done,
you often discover that you're now memory-bound anyway. At this point, SSE
or no SSE doesn't make a difference anymore, et voilà.

FYI we only use plain C++ in the Novodex SDK, for example.

BTW, for the guy that previously asked about OPCODE's capsule query.... He
was right, the code wasn't "optimized". I modified it, the whole
capsule/AABB overlap code is now 12 lines long. The profiler tells me it's
faster, but not that much.

- Pierre

PS: in case you care, the code is below. Conservative test, with obvious
precomputed data I'll let you figure out.

inline_ BOOL LSSCollider::LSSAABBOverlap(const Point& center, const Point& extents)
{
 // Stats
 mNbVolumeBVTests++;

 float dcx = mSCen.x - center.x;
 float ex = extents.x + mRadius;
 if(fabsf(dcx)>ex + mFDir.x) return FALSE;

 float dcy = mSCen.y - center.y;
 float ey = extents.y + mRadius;
 if(fabsf(dcy)>ey + mFDir.y) return FALSE;

 float dcz = mSCen.z - center.z;
 float ez = extents.z + mRadius;
 if(fabsf(dcz)>ez + mFDir.z) return FALSE;

 if(fabsf(mSDir.y * dcz - mSDir.z * dcy) > ey*mFDir.z + ez*mFDir.y) return FALSE;
 if(fabsf(mSDir.z * dcx - mSDir.x * dcz) > ex*mFDir.z + ez*mFDir.x) return FALSE;
 if(fabsf(mSDir.x * dcy - mSDir.y * dcx) > ex*mFDir.y + ey*mFDir.x) return FALSE;

 return TRUE;
}
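
For readers who want to run the test outside OPCODE, here is a minimal self-contained sketch. It assumes the "obvious precomputed data" is the segment midpoint (mSCen), half of the segment vector (mSDir) and its componentwise absolute value (mFDir). That is a guess from how the values are used above, not something stated in the post. Vec3, CapsuleQuery, makeCapsuleQuery and capsuleAABBOverlap are hypothetical names, not OPCODE API.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical stand-in for the collider's precomputed members.
struct CapsuleQuery {
    Vec3  sCen;    // assumed: midpoint of the capsule segment
    Vec3  sDir;    // assumed: half of (p1 - p0)
    Vec3  fDir;    // assumed: componentwise |sDir|
    float radius;
};

// Precompute the per-query data once, before traversing the tree.
inline CapsuleQuery makeCapsuleQuery(const Vec3& p0, const Vec3& p1, float radius)
{
    assert(radius >= 0.0f);
    CapsuleQuery q;
    q.sCen = { 0.5f * (p0.x + p1.x), 0.5f * (p0.y + p1.y), 0.5f * (p0.z + p1.z) };
    q.sDir = { 0.5f * (p1.x - p0.x), 0.5f * (p1.y - p0.y), 0.5f * (p1.z - p0.z) };
    q.fDir = { std::fabs(q.sDir.x), std::fabs(q.sDir.y), std::fabs(q.sDir.z) };
    q.radius = radius;
    return q;
}

// Conservative capsule/AABB test: a segment/AABB separating-axis test
// against the box inflated by the capsule radius on each axis.
inline bool capsuleAABBOverlap(const CapsuleQuery& q, const Vec3& center, const Vec3& extents)
{
    // Box face normals.
    const float dcx = q.sCen.x - center.x;
    const float ex  = extents.x + q.radius;
    if (std::fabs(dcx) > ex + q.fDir.x) return false;

    const float dcy = q.sCen.y - center.y;
    const float ey  = extents.y + q.radius;
    if (std::fabs(dcy) > ey + q.fDir.y) return false;

    const float dcz = q.sCen.z - center.z;
    const float ez  = extents.z + q.radius;
    if (std::fabs(dcz) > ez + q.fDir.z) return false;

    // Cross products of the segment direction with the box axes.
    if (std::fabs(q.sDir.y * dcz - q.sDir.z * dcy) > ey * q.fDir.z + ez * q.fDir.y) return false;
    if (std::fabs(q.sDir.z * dcx - q.sDir.x * dcz) > ex * q.fDir.z + ez * q.fDir.x) return false;
    if (std::fabs(q.sDir.x * dcy - q.sDir.y * dcx) > ex * q.fDir.y + ey * q.fDir.x) return false;

    return true;
}
```

Inflating the box by the radius treats the capsule's rounded ends as square, so the test can report an overlap near box edges and corners when there is none. It should never miss a real overlap, which is what a conservative bounding-volume test needs.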


----- Original Message ----- 
From: "tbp" <ode at ompf.org>
To: "GARY VANSICKLE" <g.r.vansickle at worldnet.att.net>
Cc: <ode at q12.org>
Sent: Friday, May 28, 2004 2:26 PM
Subject: Re: RE : [ODE] Some SSE in Quick step


GARY VANSICKLE wrote:
> AFAIK, nobody's actually ever verified that D3DX's mathematical
> manipulations are slower than ODE's built-in.  Which is part of why I
> referred to this as a "non-argument".
Convenient.

>>Then there's no need to do some jumptable chachacha à la DirectX when
>>you're dynamically linking Ode: link against the proper dll (ie
>>ode-sse2.dll) at run time.
>
>
> As a wise man once sang, "We'd all love to see the plan" ;-).  I'm guessing
> D3DX does at least as well as that.
Guessing is the key word. Sing with me.

>  It's certainly not doing anything like:
>
> If(SSE2)
> {
> SSE2MatrixMultiply();
> }
> Else if(SSE)
> {
> SSEMatrixMultiply();
> }
> Else if...
>
> if that's what you're getting at.
No. Read what i've said again. "jumptable chachacha" has nothing to do
with the code you've posted.

Picking the proper dll at runtime may be equivalent (more or less) to
some jumptable adjustment (ie on windows), but it puts some of the
burden on the toolchain as it relies on compiler/linker mechanisms.

and
>>That just requires some rather thin per platform gluing.
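
To make the distinction in this exchange concrete, here is a rough sketch of per-call branching versus a jump table that is resolved once. It is not how D3DX or ODE actually does it. CpuLevel, dotPlain, dotSse, dotSse2, dotBranchy and selectDot are all hypothetical names, and the "SSE" variants are plain C++ placeholders so the example runs anywhere.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical feature levels; real code would fill this in from CPUID.
enum CpuLevel { CPU_PLAIN = 0, CPU_SSE = 1, CPU_SSE2 = 2, CPU_LEVEL_COUNT = 3 };

typedef float (*DotFn)(const float*, const float*, std::size_t);

// Plain C++ reference implementation.
static float dotPlain(const float* a, const float* b, std::size_t n)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}

// Placeholders: a real build would put SSE/SSE2 intrinsics here.
static float dotSse(const float* a, const float* b, std::size_t n)  { return dotPlain(a, b, n); }
static float dotSse2(const float* a, const float* b, std::size_t n) { return dotPlain(a, b, n); }

// Per-call branching, as in the quoted If(SSE2)/Else if(SSE) snippet:
// the feature check is paid on every call.
inline float dotBranchy(CpuLevel level, const float* a, const float* b, std::size_t n)
{
    if (level == CPU_SSE2) return dotSse2(a, b, n);
    if (level == CPU_SSE)  return dotSse(a, b, n);
    return dotPlain(a, b, n);
}

// Jump table: one entry per feature level, indexed once at startup.
static const DotFn kDotTable[CPU_LEVEL_COUNT] = { dotPlain, dotSse, dotSse2 };

inline DotFn selectDot(CpuLevel level)
{
    assert(level >= 0 && level < CPU_LEVEL_COUNT);
    return kDotTable[level];
}
```

A caller would run selectDot once at startup and keep the returned pointer, so later calls cost one indirect call and no feature checks. Loading a per-CPU DLL pushes the same choice to load time and lets the linker fill in the table, which is the "burden on the toolchain" point above.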

> And a huge number of DLL's/.so's,
You can't have the cake and eat it.

> which will need to keep growing on an
> almost monthly basis as new processors come out with new features.
Yeah sure.

> I say let Bill do it; he already is anyway.
Again those 2 solutions (linking against different ode libs vs D3D)
aren't equivalent.
In the former, every path is optimized for one combo (cpu/features).

And you're still conveniently forgetting that Bill only cares about a
subset of the platforms Ode supports (i know i know, it's one of your
famous non-arguments).


_______________________________________________
ODE mailing list
ODE at q12.org
http://q12.org/mailman/listinfo/ode





