[ODE] More speed???

Wed Nov 5 10:42:45 MST 2003

Nguyen Binh wrote:

> MCM> That's surprising, since they're different time complexities.
>      Yes! You mean O(n*n) and O(n*log(n))???

Actually, O(n*n*n) vs. O(n).  Check out the stepfast section in the ODE 
docs.

> MCM>   What's the size of your islands?  How many bodies are you using?
> MCM>  Did you use some of the stepfast tests for comparisons, the ones
> MCM> with hundreds of bodies?
> 
>      Yes, my test scene has about 200 bodies (sphere, box, cylinder)
>      and I "feel" that original Step is not too slow compare to
>      StepFast.(Not try to profile yet so I don't have any stats here)

Wow, that really is surprising.  Can you post a test case and your code? 
  The "n" above, that determines the speed, is the number of DOF 
removed.  Are these bodies connected together?  Are they colliding?  The 
stepfast tests kind of maximize that.  Have you tried them?

> MCM> What exactly is surprising?  That it takes a long time to solve a big
> MCM> set of linear equations?
>      Yes! It takes almost a half of CPU power to solve that linear
>      problem. That surprises me.

Well, that is the heart of the constraint satisfaction, i.e. it's doing 
all the hard work of balancing all the constraints, so it doesn't 
surprise me.  What does surprise me is that the LU decomposition of the 
matrix (O(n^3)) takes less time than the substitution steps.

>>>           - Or we can solve them using "iterative" method
> 
> MCM> That's what Stepfast does, as you point out below.
 >
>      Actually, no. StepFast use "iterative" method to interate all
>      joints but not use "iterative" method to solve linear problem.

True, now that I think about it, while they're inspired by the same 
idea, the details are quite different.  But solving the linear system 
iteratively would be at best O(n^2), since the matrix A has O(n^2) 
elements, and you need to touch each one.  Is that true?  Is A sparse? 
Maybe that explains why the LU decomposition is so fast.

>      I've read that the best (efficient and easy to implement) way to
>      solve linear problem A.x = b when A is PSD is using interative
>      method "Conjugate Gradient".

Conjugate gradient is for finding the minimum of an arbitrary function 
that has a single output.  It more or less assumes that the function 
you're trying to minimize is locally quadratic.  It doesn't directly 
apply to solving linear equations, and if you modify the problem to fit 
(e.g. by minimizing ||A.x - b||^2), it's guaranteed to be slower that 
directly solving it, I think.

>>>   b) Stepfast code path:
>>>      ... one way to optimize is using specialized LCP for small matrix
> 
> 
> MCM> How exactly would you specialize it for a 6x6 matrix?  For 2x2 or 3x3 
> MCM> you could write out a single formula, which would save the 
> MCM> storage/retreval of a few intermediate values.  But I doubt that would 
> MCM> save you much with 6x6, and would just complicate the code, and might 
> MCM> blow the instruction cache for example.  What did you have in mind here? 
> MCM>   Just unrolling loops?
> 
>        We need more experiments here.I didn't try this path yet but I
>        have an impression that when using StepFast, we need to solve
>        linear problem A.x = b (A is nxn) with 2<= n <=6. Am I wrong?
>        And with my test scene of 200 primitive bodies, I observed that
>        in most case n is 2.

This will be true if most constraints are contact, and I think you need 
friction of zero in one direction to boot.  If your constraints are 
regular joints, n is 3 for ball and socket, 4 for slider & universal & 
hinge2, 5 for hinge, 6 for fixed.  If you use the joint motor or joint 
limits, the numbers go up.

At least one of the stepfast tests has a bunch of cars.  Cars use hinges 
and hinge2 joints.  You should really test your code with that as well.

You make a good point about the varying sizes though; there are 
certainly cases where the individual matrices are small, and it would be 
great to handle at least 2x2 and 3x3 separately, and maybe the others as 
well.  That could speed things up quite a bit!

>        So I have an idea to write specialized
>        routine for StepFast. Actually, using LCP things of original
>        Step for StepFast is not good as the data input is hugely
>        diferrent.

Very well put.

Cheers,
Martin

#################################################################
#################################################################
#################################################################
#####
#####
#####
#################################################################
#################################################################
#################################################################