[ODE] PosR - a better way?
Jon Watte
hplus-ode at mindcontrol.org
Thu Nov 11 14:22:51 MST 2004
> Actually, the whole PosR structure is one aspect I dislike about ode.
> If geoms kept their PosR structure it would be easy to have geoms offset
> from their bodies (rather than needing the extra GeomTransform), and it
> would also save on allocs. In summary, these are my thoughts on it:
But you'd have to copy the matrix around, every single time, and
your cache traffic would be much higher. Paying the cost of the
GeomTransform for the things that actually offset is, to me, a
valid trade-off to gain the cache, storage, and copy winnings of
sharing the matrix.
Allocating matrices should be very efficient with most allocators.
If you really want to use a dumb malloc, then perhaps the best way
to go is to wrap ODE allocations in a shell where you can write a
custom block-slicing allocator, or whatever.
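To illustrate the block-slicing idea: a minimal sketch of a fixed-size block pool, assuming every allocation has the same size (as a position-plus-rotation record would). `BlockPool`, `acquire`, and `release` are hypothetical names for illustration and are not part of ODE; wiring such a pool into ODE's allocation path is left out, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical fixed-size block pool (illustration only, not ODE code).
// One malloc up front; afterwards acquire/release are pointer pops/pushes.
class BlockPool {
public:
    BlockPool(std::size_t blockSize, std::size_t blockCount)
        : blockSize_(roundUp(blockSize)),
          storage_(static_cast<unsigned char*>(
              std::malloc(blockSize_ * blockCount))),
          freeList_(nullptr) {
        if (!storage_) throw std::bad_alloc();
        // Slice the buffer into blocks and thread them onto the free list.
        for (std::size_t i = 0; i < blockCount; ++i)
            push(storage_ + i * blockSize_);
    }
    ~BlockPool() { std::free(storage_); }
    BlockPool(const BlockPool&) = delete;
    BlockPool& operator=(const BlockPool&) = delete;

    // Returns a block, or nullptr when the pool is exhausted.
    void* acquire() {
        if (!freeList_) return nullptr;
        Node* n = freeList_;
        freeList_ = n->next;
        return n;
    }

    // Returns a block to the pool; p must have come from acquire().
    void release(void* p) {
        assert(p != nullptr);
        push(p);
    }

private:
    struct Node { Node* next; };

    // Blocks must be able to hold a free-list link and stay aligned.
    static std::size_t roundUp(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + a - 1) / a * a;
    }

    void push(void* p) {
        Node* n = static_cast<Node*>(p);
        n->next = freeList_;
        freeList_ = n;
    }

    std::size_t blockSize_;
    unsigned char* storage_;
    Node* freeList_;
};
```

With a pool sized for the expected number of geoms, each attach or detach becomes a pointer pop or push instead of a trip through the general-purpose heap, and the blocks stay in one contiguous buffer, which also limits fragmentation.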
Also note that if a geom has a local offset, it's local to the body,
so you have to do a matrix multiply to get the world-space position,
just like the GeomTransform does. Meanwhile, for the common case
where the geom's frame coincides with the body's (typical for
characters, or those brave enough to use trimesh/trimesh), there are
significant computational savings in not having to multiply by an
identity matrix.
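The difference between the two cases can be sketched as follows. This is an illustration under simplifying assumptions, not ODE's actual data layout: `Pose`, `Geom`, `OffsetGeom`, `compose`, and `worldPose` are hypothetical names, and the rotation is stored as a plain row-major 3x3 array.

```cpp
#include <cassert>

// Hypothetical simplified pose: position plus row-major 3x3 rotation.
struct Pose {
    double pos[3];
    double R[9];
};

// world = body * local: rotate the local offset into world space,
// add the body position, and concatenate the rotations.
inline Pose compose(const Pose& body, const Pose& local) {
    Pose out;
    for (int i = 0; i < 3; ++i) {
        out.pos[i] = body.pos[i];
        for (int k = 0; k < 3; ++k)
            out.pos[i] += body.R[3 * i + k] * local.pos[k];
        for (int j = 0; j < 3; ++j) {
            double s = 0.0;
            for (int k = 0; k < 3; ++k)
                s += body.R[3 * i + k] * local.R[3 * k + j];
            out.R[3 * i + j] = s;
        }
    }
    return out;
}

// Common case: the geom shares the body's pose by pointer.
struct Geom {
    const Pose* pose;
};

// Offset case: a wrapper that stores a body-local offset.
struct OffsetGeom {
    const Pose* bodyPose;
    Pose local;
};

// No copy and no multiply: the body's pose is the geom's pose.
inline const Pose& worldPose(const Geom& g) {
    assert(g.pose != nullptr);
    return *g.pose;
}

// The offset geom pays for a full compose on every query.
inline Pose worldPose(const OffsetGeom& g) {
    assert(g.bodyPose != nullptr);
    return compose(*g.bodyPose, g.local);
}
```

Because the two cases are separate types, the shared-pose path needs neither a multiply nor a per-geom "has offset?" test; only geoms that really are offset pay for `compose`.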
> - In particular the terrain's ray test has a PosR alloc and free every
> single time it's called, which can be multiple times per frame.
That could easily be avoided.
> - There is an alloc/free every time a geom is attached or detached to a
> body.
This typically happens only at entity creation time, which is very
seldom. Getting cache-traffic winnings every world step is
a much bigger win. If you create geoms a lot, then a custom allocator
would be necessary anyway (because the geom itself is also an allocation).
> - It's by far the most common alloc I have in the game.
Great, so you know what to optimize, and how to optimize it --
assuming your allocs are even close to being a problem, that is. I've
found that, unless you're degenerate, they seldom are.
> - For previous projects we actually had to write a custom PosR
> alloc/free handler to pull these things off a static buffer to avoid
> fragmentation.
I would be happy with a patch for ODE that pulled everything out of
a custom allocator, to avoid fragmentation and better manage memory.
> 2.) Keeping a local PosR per geom may take up more memory.
> - This structure could be reduced to 3 floats for position and 4 floats
> for a quat.
At a significant additional computational cost. Memories aren't
slow enough, yet, to make that trade-off worth it -- I've tried.
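One plausible source of that extra cost, assuming the collision and stepping code still wants a rotation matrix, is rebuilding the matrix from the quaternion each time it is needed. The sketch below shows the standard unit-quaternion to rotation-matrix conversion; `quatToMatrix` and the row-major 3x3 layout are hypothetical choices for illustration, not ODE's API.

```cpp
#include <cassert>
#include <cmath>

// Convert a unit quaternion (w, x, y, z) to a row-major 3x3 rotation
// matrix. Hypothetical helper for illustration only.
inline void quatToMatrix(double w, double x, double y, double z,
                         double R[9]) {
    // The formula below is only a rotation for unit-length input.
    assert(std::fabs(w * w + x * x + y * y + z * z - 1.0) < 1e-9);

    const double x2 = x + x, y2 = y + y, z2 = z + z;
    const double xx = x * x2, yy = y * y2, zz = z * z2;
    const double xy = x * y2, xz = x * z2, yz = y * z2;
    const double wx = w * x2, wy = w * y2, wz = w * z2;

    R[0] = 1.0 - (yy + zz); R[1] = xy - wz;         R[2] = xz + wy;
    R[3] = xy + wz;         R[4] = 1.0 - (xx + zz); R[5] = yz - wx;
    R[6] = xz - wy;         R[7] = yz + wx;         R[8] = 1.0 - (xx + yy);
}
```

Storing 7 floats instead of a full matrix saves memory per geom, but this conversion then runs wherever the matrix form is required, which is the trade-off being weighed here.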
> - Local data would save on cache misses.
But storing the same data in multiple places increases cache load and
decreases residency. Typically, you run collision first, which pulls
the matrix into the cache; then when you step, it's still there.
> - If we allowed every geom to have a transform from its body it may be
> slower (an early out may avoid this).
Mis-predicted branches cost 40 cycles. Better to just always do the
add, if that's the way you want to go. However, it's still a multiply-
add, because the local offset has to be in local space to be any good.
> - Basically the current limitation sucks. GeomTransform is clunky.
That's a matter of opinion. You're giving some reasons for why you think
so, and I'm giving some reasons for why I think Russ made a pretty good
choice.
Cheers,
/ h+