[ODE] GCC Stack Size, stepfast disabling bodies, and SIMD idea
david@csworkbench.com
david at csworkbench.com
Fri Apr 4 03:35:01 2003
Does anyone know how big a stack GCC allocates, and how to change that?
Reading through the man pages, -mstack-size doesn't seem to be an option
for i386 processors. I'm fairly sure my latest test_crash is segfaulting
because of a stack overflow. I'm playing with ~800 bodies and ~10,000
joints; with the body disabling code it still runs at >1 fps, which is
not real-time, but viable if someone wanted to use ODE for a movie scene.
The problem is that ODE segfaults (some body or geom points to
inaccessible memory) with a scene this big, but not with a scene half the
size, which is why I suspect the stack.
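
For what it's worth, on Linux and most Unix-like systems the main
thread's stack limit is an OS resource limit (the value `ulimit -s`
reports) rather than something GCC chooses at compile time. Below is a
minimal sketch, assuming a POSIX system, of querying that limit and
trying to raise the soft limit from inside the program. The helper names
query_stack_limit and raise_stack_limit are made up for illustration and
are not part of ODE.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: read the soft and hard stack limits in bytes.
 * Either value may be RLIM_INFINITY. Returns 0 on success, -1 on error. */
int query_stack_limit(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;
    assert(soft != 0 && hard != 0);
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}

/* Hypothetical helper: raise the soft stack limit to `wanted` bytes,
 * clamped to the hard limit. Does nothing if the current soft limit is
 * already large enough. Returns 0 on success, -1 on error. */
int raise_stack_limit(rlim_t wanted)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return 0;                       /* already big enough */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;           /* cannot exceed the hard limit */
    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_STACK, &rl);
}
```

Calling something like raise_stack_limit(64UL * 1024 * 1024) at the very
top of main() is one option, but raising the limit in the shell before
launching the test (for example with `ulimit -s`) is probably the more
dependable route, since how an in-process change affects an already
running main thread can vary between systems.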
I've got the body enabling code set up to spread the enabled island by a
configurable parameter every step. Disabled bodies act like geom-only,
infinite-mass objects if they are connected to an enabled body by a
joint. Even with that parameter set to 1, the block wall crash still
looks physically correct, and the enabled bodies spread out in a wave
from the point of contact as expected. With the re-disabling code (still
in test_crash; I'll get it into the source eventually), the wave of
enabled bodies is followed by a wave of disabled bodies, so, depending on
the size of the wall, only a percentage of the wall is enabled at any
time. With the mu parameter for the contact joints connecting wall blocks
set fairly low, you can "punch" a hole in the wall, and the rest of the
wall remains largely disabled. The only problem I've seen is that if you
are too aggressive with your re-disabling parameters, the bottom of the
wall can fall out from under the top, and the top just hangs there. But
that behavior can be tweaked away.
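
To make the "wave" concrete, here is a rough sketch of one way the
spreading could work. It assumes the parameter counts joint hops per
step, which is only one possible reading. This is not the test_crash or
stepfast code; the Joint struct, the enabled array and the function name
spread_enabled are invented for the example.

```c
#include <assert.h>

/* Hypothetical joint record: indices of the two bodies it connects. */
typedef struct { int a, b; } Joint;

/* Grow the enabled set by up to `spread` joint hops.
 * enabled[i] is 1 for an enabled body and 0 for a disabled one.
 * Returns the number of bodies newly enabled by this call. */
int spread_enabled(int *enabled, int nbodies,
                   const Joint *joints, int njoints, int spread)
{
    int newly = 0, pass, i, j;
    assert(spread >= 0);
    for (pass = 0; pass < spread; pass++) {
        int changed = 0;
        for (j = 0; j < njoints; j++) {
            int a = joints[j].a, b = joints[j].b;
            /* Mark with 2 so a body enabled in this pass does not
             * propagate further until the next pass. */
            if (enabled[a] == 1 && enabled[b] == 0) {
                enabled[b] = 2; changed++;
            } else if (enabled[b] == 1 && enabled[a] == 0) {
                enabled[a] = 2; changed++;
            }
        }
        for (i = 0; i < nbodies; i++)
            if (enabled[i] == 2) enabled[i] = 1;
        if (changed == 0)
            break;                      /* the wave has nowhere to go */
        newly += changed;
    }
    return newly;
}
```

With a spread of 1 the enabled region grows by one layer of neighbours
per call, which would give the wave-like behaviour described above. A
re-disabling pass would need its own criteria (for example how long a
body has been nearly at rest), and those are not shown here.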
Be on the lookout for another beta release in the next couple of days. I
should have auto(dis|en)abling of bodies all in the stepfast source, with
API calls to change the parameters.
I thought of a way to implement the math functions so that SIMD support
could be detected at runtime. Use a function table, like the joints do
internally, to define a dMath "class". Then when a world is created (or
maybe in a new dMathInit function), run some detection code to decide
which kind of SSE/SSE2/3DNow!/387 FP math to use, and point the math
class's pointers at the correct function set for the processor (these
functions may be defined in inline asm, an external asm file, or C). Then
all through the code, make calls such as dMath->multiplyAdd0_331(A,B,C),
and the correct function gets called for your processor. That gives us
runtime support for all processors in a single executable. The downside
is that no math code will be inlined or macro'd, but linked-in SIMD code
should be faster than inlined 387 code anyway, and I know of no way to
have both run-time code choice and compile-time inlining. Only people
without SIMD-capable processors would be (slightly) slower with this
change than without. What do you think of that approach? (I'm assuming
it's what the DirectX classes do; maybe they use derived classes and
virtual functions, but it's the same idea.)
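
A minimal sketch of the function-table idea might look like the
following. Everything in it is an assumption made for illustration: the
dMathTable struct, a single-precision dReal, the row stride of 4 for the
3x3 matrix, and the use of GCC's __builtin_cpu_supports for detection.
Only one table entry is shown, and the SSE variant is deliberately naive.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>
#endif

typedef float dReal;   /* assumption: single-precision build */

/* Hypothetical math "class": a table of function pointers. */
typedef struct dMathTable {
    const char *name;
    /* A += B * C, where B is 3x3 stored with a row stride of 4 and
     * A and C are 3-vectors stored in 4 slots. */
    void (*multiplyAdd0_331)(dReal *A, const dReal *B, const dReal *C);
} dMathTable;

/* Plain C fallback, usable on any processor. */
static void multiplyAdd0_331_c(dReal *A, const dReal *B, const dReal *C)
{
    int i;
    for (i = 0; i < 3; i++)
        A[i] += B[4*i]*C[0] + B[4*i+1]*C[1] + B[4*i+2]*C[2];
}

#if defined(__SSE__)
/* Naive SSE variant: one packed multiply per row, then a scalar sum of
 * the first three lanes. In a real build this would live in its own
 * file compiled with SSE enabled, so the rest of the library stays
 * runnable on processors without it. */
static void multiplyAdd0_331_sse(dReal *A, const dReal *B, const dReal *C)
{
    __m128 c = _mm_loadu_ps(C);
    float t[4];
    int i;
    for (i = 0; i < 3; i++) {
        _mm_storeu_ps(t, _mm_mul_ps(_mm_loadu_ps(B + 4*i), c));
        A[i] += t[0] + t[1] + t[2];   /* lane 3 is padding, ignored */
    }
}
#endif

static const dMathTable math_c = { "c", multiplyAdd0_331_c };

/* The table every caller goes through; defaults to the C version. */
static const dMathTable *dMath = &math_c;

/* Pick the best available function set once, at startup. */
void dMathInit(void)
{
    dMath = &math_c;
#if defined(__SSE__) && defined(__GNUC__) && \
    (defined(__i386__) || defined(__x86_64__))
    {
        static const dMathTable math_sse = { "sse", multiplyAdd0_331_sse };
        __builtin_cpu_init();
        if (__builtin_cpu_supports("sse"))
            dMath = &math_sse;
    }
#endif
    assert(dMath->multiplyAdd0_331 != NULL);
}
```

After dMathInit() has run, a call site is just
dMath->multiplyAdd0_331(A, B, C). Each call costs one indirect jump and
cannot be inlined, which is the trade-off mentioned above.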
David