[ODE] Some SSE in Quick step

devicezero devicezero at hotpop.com
Tue May 25 13:37:57 MST 2004

IMHO, the D3D matrix stuff is faster because it is very well coded C++. The 
whole D3D math part is good and fast.

Also, miracles can happen if you compile with /Ox, which means full 
optimization. Premature hand optimizations sometimes don't make any sense, 
because they can get in the way of the compiler's global optimizations.

DirectX (D3D) has D3DXMATRIXA16, which derives publicly from D3DXMATRIX.


where D3DX_ALIGN16 is defined as:

#if _MSC_VER >= 1300  // VC7
#define D3DX_ALIGN16 __declspec(align(16))
#else
#define D3DX_ALIGN16  // Earlier compiler may not understand this, do nothing.
#endif
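
For anyone who wants the same trick outside of D3DX, here is a minimal sketch of a portable alignment macro. The names MY_ALIGN16, Vec4 and is_aligned16 are made up for illustration and are not part of D3DX or ODE. It assumes a C++11 compiler, an MSVC 7 or later compiler, or GCC; on anything else the macro expands to nothing, just like the D3DX one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical macro (not from D3DX): choose whichever 16-byte alignment
// specifier the compiler understands, and fall back to nothing otherwise.
#if defined(__cplusplus) && __cplusplus >= 201103L
#define MY_ALIGN16 alignas(16)
#elif defined(_MSC_VER) && _MSC_VER >= 1300
#define MY_ALIGN16 __declspec(align(16))
#elif defined(__GNUC__)
#define MY_ALIGN16 __attribute__((aligned(16)))
#else
#define MY_ALIGN16  // Unknown compiler: no alignment guarantee.
#endif

// Hypothetical 4-float vector that SSE code could load with aligned moves.
struct MY_ALIGN16 Vec4 {
    float x, y, z, w;
};

// Returns true when p sits on a 16-byte boundary.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

Note that this only covers objects on the stack, in global scope, or embedded in other structs. Heap allocations need extra care, which is why the D3DX class also overloads operator new.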

And from the headers:

// Aligned Matrices
// This class helps keep matrices 16-byte aligned as preferred by P4 cpus.
// It aligns matrices on the stack and on the heap or in global scope.
// It does this using __declspec(align(16)) which works on VC7 and on VC 6
// with the processor pack. Unfortunately there is no way to detect the
// latter so this is turned on only on VC7. On other compilers this is the
// same as D3DXMATRIX.
// Using this class on a compiler that does not actually do the alignment
// can be dangerous since it will not expose bugs that ignore alignment.
// E.g. if an object of this class is inside a struct or class, and some code
// memcopys data in it assuming tight packing. This could break on a compiler
// that eventually starts aligning the matrix.
#ifdef __cplusplus
typedef struct _D3DXMATRIXA16 : public D3DXMATRIX
{
    _D3DXMATRIXA16() {}
    _D3DXMATRIXA16( FLOAT _11, FLOAT _12, FLOAT _13, FLOAT _14,
                    FLOAT _21, FLOAT _22, FLOAT _23, FLOAT _24,
                    FLOAT _31, FLOAT _32, FLOAT _33, FLOAT _34,
                    FLOAT _41, FLOAT _42, FLOAT _43, FLOAT _44 );

    // new operators
    void* operator new   ( size_t );
    void* operator new[] ( size_t );

    // delete operators
    void operator delete   ( void* );   // These are NOT virtual; Do not
    void operator delete[] ( void* );   // cast to D3DXMATRIX and delete.

    // assignment operators
    _D3DXMATRIXA16& operator = ( CONST D3DXMATRIX& );

} _D3DXMATRIXA16;

#else //!__cplusplus
#endif //!__cplusplus
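
To show what the overloaded new and delete operators in that class are for, here is a rough sketch of the same idea without D3DX. It is not the D3DX implementation, which I have not seen. Matrix4, Matrix4A16, aligned_malloc16 and aligned_free16 are hypothetical names, and the code assumes a C++11 compiler for alignas. The allocator over-allocates with malloc and stores the raw pointer just below the aligned block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical helper: returns a 16-byte aligned block of at least size bytes.
inline void* aligned_malloc16(std::size_t size) {
    // Room for the payload, worst-case padding, and the saved raw pointer.
    void* raw = std::malloc(size + 15 + sizeof(void*));
    if (!raw) throw std::bad_alloc();
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned = (start + 15) & ~static_cast<std::uintptr_t>(15);
    // Remember the raw pointer in the slot just below the aligned block.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Hypothetical helper: frees a block obtained from aligned_malloc16.
inline void aligned_free16(void* p) {
    if (p) std::free(static_cast<void**>(p)[-1]);
}

// Plain matrix with no alignment guarantee (stand-in for D3DXMATRIX).
struct Matrix4 {
    float m[4][4];
};

// Aligned variant (stand-in for D3DXMATRIXA16): alignas covers stack and
// global objects, the operators below cover heap objects.
struct alignas(16) Matrix4A16 : public Matrix4 {
    void* operator new(std::size_t n)   { return aligned_malloc16(n); }
    void* operator new[](std::size_t n) { return aligned_malloc16(n); }
    // These are not virtual, so never delete through a Matrix4 pointer.
    void operator delete(void* p)   { aligned_free16(p); }
    void operator delete[](void* p) { aligned_free16(p); }
};
```

This is also why the header comment warns against casting to the base type and deleting: the base type would hand the pointer to the ordinary global delete, which knows nothing about the padding added by the aligned allocator.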
