[ODE] Bug or feature?
Rodrigo Hernandez
kwizatz at aeongames.com
Fri Apr 8 08:54:35 MST 2005
Yeah, I looked at the FAQ after I posted and felt kind of silly for not
RTFM, though I did try to find some info in the programming guide.
The FAQ reads:
Q Why is dVector3 the same size as dVector4? Why is a dMatrix not 3x3 ?
A For making it SIMD friendly. i.e 16 byte aligned (there is no SIMD
code in there right now, but there may be in the future).
To which I thought: isn't it rather that the variable itself must sit at
a 16-byte-aligned memory location? (i.e. float x;
if((uintptr_t)&x%16==0) fprintf(stdout,"X is 16 byte aligned\n"); )
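For illustration, the check above can be written out as a small
compilable sketch. The names is_aligned16, report_alignment and
alignment_self_check are hypothetical helpers, not part of ODE, and the
code assumes a compiler that provides uintptr_t in <stdint.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise.
   The pointer is converted to an integer first, because the
   % operator cannot be applied to a pointer directly. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Prints the result of the check for a named variable. */
static void report_alignment(const char *name, const void *p)
{
    printf("%s is %s16 byte aligned\n", name,
           is_aligned16(p) ? "" : "not ");
}

/* Sanity check: find a 16-byte boundary inside a local buffer,
   then confirm that an address 4 bytes past it is rejected. */
static void alignment_self_check(void)
{
    unsigned char buf[32];
    unsigned char *p = buf + (16 - (uintptr_t)buf % 16) % 16;

    assert(is_aligned16(p));
    assert(!is_aligned16(p + 4));
    report_alignment("p", p);
}
```

Calling report_alignment("x", &x) for a plain float x shows whether that
particular variable happened to land on a 16-byte boundary.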
Like I said before, I have written an inline asm matrix multiplication
function using SSE, and I was under the impression that all it really
took was this:
#ifdef __GNUC__
#define ALIGN16 __attribute__ ((aligned (16)))
#elif defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))
#else
#define ALIGN16
#endif
ALIGN16 float vector[3];
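To make the difference between alignment and size concrete, here is a
minimal sketch. It is not ODE code: vec3_t, vec4_t, count_aligned16 and
layout_self_check are hypothetical names, and it assumes 4-byte floats
and a GCC, Clang or MSVC compiler so that ALIGN16 is not empty. The
macro aligns the start of each array, but it does not change the
12-byte size of a float[3], so elements inside a contiguous array of
3-float vectors do not all start on 16-byte boundaries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__)
#define ALIGN16 __attribute__ ((aligned (16)))
#elif defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))
#else
#define ALIGN16
#endif

typedef float vec3_t[3];   /* 12 bytes with 4-byte floats */
typedef float vec4_t[4];   /* 16 bytes with 4-byte floats */

/* Counts how many of the n elements, spaced stride bytes apart
   starting at base, begin on a 16-byte boundary. */
static size_t count_aligned16(const void *base, size_t stride, size_t n)
{
    const unsigned char *p = (const unsigned char *)base;
    size_t i, count = 0;

    for (i = 0; i < n; ++i)
        if ((uintptr_t)(p + i * stride) % 16 == 0)
            ++count;
    return count;
}

/* Both arrays start on a 16-byte boundary (where ALIGN16 is supported). */
static ALIGN16 vec3_t packed[8];
static ALIGN16 vec4_t padded[8];

static void layout_self_check(void)
{
    /* Offsets 0, 12, 24, ... 84: only 0 and 48 are multiples of 16. */
    assert(count_aligned16(packed, sizeof(vec3_t), 8) == 2);
    /* Offsets 0, 16, 32, ... 112: every element is aligned. */
    assert(count_aligned16(padded, sizeof(vec4_t), 8) == 8);
}
```

Whether this matters depends on how the vectors are stored and loaded;
a single ALIGN16 float vector[3] on its own is aligned at its start.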
Now, I did try turning dVector3 into a real Vector[3], and it seems that
the extra element is being used for something other than padding. I
find this to be a bug: if you're going to use that extra element, why
not use dVector4, if that is what you mean? Using dVector3 is
misleading, I think.
So what do you guys think?
Charlie Hui wrote:
> From the docs:
>
> "Matrix operations like factorization are expensive, so we must store
> the data in a way that is most useful to the matrix code. I want to do
> 4-way SIMD optimizations later, so the format is this: store the
> matrix by rows, and each row is rounded up to a multiple of 4
> elements. The extra "padding" elements at the end of each row/column
> must be set to 0. This is called the "standard format". Hopefully this
> decision will remain good in the future, as more and more processors
> have 4-way SIMD (especially for fast 3D graphics).
>
> The exception: matrices that have only one column or row (vectors),
> are always stored as consecutive elements in standard row format, i.e.
> there is no interior padding, only padding at the end.
>
> Thus: all 3x1 floating point vectors are stored as 4x1 vectors:
> (x,x,x,0)."
>
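
The "standard format" quoted above can be sketched as index arithmetic.
This is only an illustration of the rule as the docs describe it, not
ODE's actual code; pad4, padded_count, padded_index and
standard_format_self_check are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds n up to the next multiple of 4. */
static size_t pad4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Number of stored elements for a rows x cols matrix.
   Matrices are stored by rows, each row padded to a multiple of 4.
   Vectors (one row or one column) are stored consecutively and
   padded only at the end. */
static size_t padded_count(size_t rows, size_t cols)
{
    if (rows == 1 || cols == 1)
        return pad4(rows * cols);
    return rows * pad4(cols);
}

/* Storage index of element (i, j), both counted from 0. */
static size_t padded_index(size_t rows, size_t cols, size_t i, size_t j)
{
    if (rows == 1 || cols == 1)
        return i + j;              /* one of i, j is always 0 here */
    return i * pad4(cols) + j;
}

static void standard_format_self_check(void)
{
    assert(padded_count(3, 1) == 4);        /* 3x1 vector: (x,x,x,0)  */
    assert(padded_count(3, 3) == 12);       /* 3 rows of 4 elements   */
    assert(padded_index(3, 3, 1, 0) == 4);  /* row 1 starts at slot 4 */
    assert(padded_index(3, 1, 2, 0) == 2);  /* no interior padding    */
}
```

Under this rule a 3x1 vector occupies 4 elements and a 3x3 matrix
occupies 12, with the padding slots expected to hold 0.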