[ODE] Choleski factorization
Antonio_Martini@scee.net
Antonio_Martini at scee.net
Tue Mar 25 09:12:02 2003
still on the iterative solution, i remember from the work of Mirtich that
if we have only one constraint(one contact normal)
the problem is reduced to the solution of only a linear system. So no sub
LCP problem must be solved, but just a linear
set of equations for each single constraint. I tried it end i end up with
something like f = fabs(a/m), im simulationg friction
using a method described by Shoemer, otherwise for more accurate static
friction the scalar equation will became
a linear system of 3 equations in 3 unknowns.
Sergio Valverde <svalverde@barcelona.ubisoft.es>@q12.org on 25/03/2003
15:58:12
Sent by: ode-admin@q12.org
To: david@csworkbench.com
cc: ode@q12.org
Subject: RE: [ODE] Choleski factorization
Very good job, David!
I can't wait to play with your new dWorldStepFast()!
Why don't you post the corresponding five files
while the final (and "clean") version is
integrated within ODE?
Sergi
-----Original Message-----
From: david@csworkbench.com [mailto:david@csworkbench.com]
Sent: domingo, 23 de marzo de 2003 9:33
To: ode@q12.org
Subject: Re: [ODE] Choleski factorization
I know this is a long message, so I'll summarize my questions here:
- Should I release the iterated solution in the same files as their
current equivalents or separate them out?
- After the iterated solution is done, would you rather see SIMD
extensions or a verlet integrator for particles (cloth, water, etc) next?
- If SIMD (slightly ahead of the verlet integrator in my mind), then
what's the best way to get cross-platform/compiler assembly into the
project?
- Plus a few internals questions below.
And now the long version.....
I'm not sure how close what I came up with is to PP, but it's working
great! I switched over to LDLT and got immediate improvements from just
that switch, I'm guessing because of all the loop unrolling and not having
to do Sqrt's. Then I clamped x's that are outside of lo and hi
(calculated dynamically if findex > -1), remove their rows and columns
from A with a slightly modified dRemoveLDLT (A didn't use row pointers),
update rhs, and resolve until no more constraints go out of range (usually
only once), and voila! a 150% speed increase over dSolveLCP (and should
give the exact same results 95% of the time). I know this algorithm will
most likely slow to a crawl with larger systems where it may have to
resolve several times, but it works great when it doesn't, and since I'm
doing only one joint at a time, it will usually be either impossible or
really weird to do more than one resolve. One thing I need to do is
change the contact joint so the hard contact constraint is at the end of J
instead of the beginning, since it seems more likely that a contact would
be limiting penetration rather than friction, and limits at the end are a
lot easier to remove than limits at the beginning. Would this do anything
to mess up the current algorithm?
Now the major bottleneck (takes 4 times as long as any other operation) is
computing A. I did find an optimization for that, but I was surprised it
wasn't already implemented, since it would help the old version too. So I
wanted to ask about it to see if I should make this a change to the
original function or make a new one. Here it is:
Correct me if I'm wrong, but I believe A is always symmetric (as well as
positive definite). This means that A's top right triangle mirrors it's
lower left, and possibly (I don't know about this one) every block has
mirrored top and bottom triangles...? However, in the step where A is
calculated (JinvM * J') a call is made to Multiply2_p8r for each block.
Multiply2_p8r calculates every entry in the block, top and bottom half,
separately. Shouldn't it calculate the top half and copy the results to
the bottom half as it goes (or when it's done)? I know that would work for
the diagonal blocks in the current matrix, but I'm not sure about the
others. Should I change the current function, or write another version
that exploits the symmetry, and change the current algorithm to use that
one for the diagonal? The real question is whether every block in A is
symmetric or only the ones along the diagonal.
Another question: how do you want this integrated into ODE? Right now I
have it integrated into the current source files, so you just replace 4 or
5 files and you can start calling dWorldStepFast(world, stepsize,
iterations). I've made a few changes to the body class, to store it's
inertial tensor and Mass matrix in the global frame. That way I don't
have to recalculate it for each joint that is connected to it, only once
each iteration. So that file will have to be replaced (or at least those
2 lines). Other than that, I can take the rest of the code out and put it
in separate files (and figure out where to change the makefile) if you'd
rather it be separated. Just let me know.
I guess my next bought of hacking will be SIMD :) What's the most
cross-platform/compiler way to get assembly into the project? I guess I
could just write it once for gcc inline and once for vc++ inline, but that
leaves out the compilers I don't have or care to learn. So what does that
leave, GAS and NASM? Anyone have a preference (I'll have to learn either
one.... which is "best")?
Then I could do a verlet integrator.... anybody interested in cloth and
other particle effects (seen some cool water demos)? There's only two
constraints: distance (also known as stick constraints) and the angle
between 2 sticks. It shouldn't be too difficult to implement them as
inequality constraints. The integrator and the collision detection are
really integrated, so a simloop would call normal collision and step
routines, then the particle engine, and the particles would "flow" around
the rest of the bodies. There's a paper on the Hitman game and an article
on Gammasutra that explain the concept better, I'll find the links if
you're interested. The basic concept is that you just make a particle at
each of the vertices of a mesh, and then the mesh would act like cloth and
drape over anything it came in contact with, or you could make it
weightless and move one particle up and down to make a rippling water
effect. You could attach particles to a point on a rigid body. The hard
part is going to be getting the rigid bodies to interact with the
particles (i.e. if you have a couple boxes sitting on a table cloth, they
should move with it if you start dragging the cloth around). The
algorithm itself is an iterated one, but the less general constraints make
the computations required to integrate a step much faster.
So which do y'all want first, more optimization or more functionality?
Thanks for all the help,
David
>
> hi,
>
> it sounds like you're getting close to the "principal pivoting"
> algorithm for LCP solving. the PP algorithm works like this:
>
<snip>
>
>> My question is, is a Cholesky Factorization also a factorization for
>> all of its sub-matrices?
>
> if you have the factorization A=L*L', then you also have the
> factorizations A(1:n,1:n)=L(1:n,1:n)*L(1:n,1:n)' for all n (in matlab
> notation). other submatrix factorizations can be gotten incrementally,
> see the ODE source for an incremantal factorization updater.
>
> russ.
>
> --
> Russell Smith
> http://www.q12.org
> _______________________________________________
> ODE mailing list
> ODE@q12.org
> http://q12.org/mailman/listinfo/ode
_______________________________________________
ODE mailing list
ODE@q12.org
http://q12.org/mailman/listinfo/ode
_______________________________________________
ODE mailing list
ODE@q12.org
http://q12.org/mailman/listinfo/ode
**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
postmaster@scee.net
This footnote also confirms that this email message has been checked
for all known viruses.
**********************************************************************
SCEE 2003