[ODE] Some SSE in Quick step

devicezero devicezero at hotpop.com
Tue May 25 13:37:57 MST 2004


IMHO, the D3D matrix code is faster because it is very well written C++. 
The rest of the D3D math library is well written and fast too.

Also, miracles can happen if you compile with /Ox, which means FULL 
OPTIMIZATION.
Premature hand optimizations sometimes make no sense, because they can 
get in the way of the compiler's global optimizations.



DirectX (D3D) has D3DXMATRIXA16, which derives publicly from D3DXMATRIX.

typedef D3DX_ALIGN16 _D3DXMATRIXA16 D3DXMATRIXA16;

where D3DX_ALIGN16 is defined as:

#if _MSC_VER >= 1300  // VC7
#define D3DX_ALIGN16 __declspec(align(16))
#else
#define D3DX_ALIGN16  // Earlier compiler may not understand this, do nothing.
#endif
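
To make the effect of a 16-byte alignment request concrete, here is a 
minimal sketch. It is not the D3DX code: Matrix4, Matrix4A16, 
HolderPlain, HolderAligned and the two helper functions are hypothetical 
names, and it uses the C++11 alignas keyword in place of 
__declspec(align(16)). It also assumes a 4-byte float, as on mainstream 
platforms. The static_asserts show how alignment changes the layout of an 
enclosing struct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins, not the D3DX types: a plain 4x4 float matrix
// and a variant that requests 16-byte alignment via C++11 alignas.
struct Matrix4 { float m[4][4]; };
struct alignas(16) Matrix4A16 { float m[4][4]; };

// Enclosing structs: one float followed by a matrix.
struct HolderPlain   { float scale; Matrix4    mat; };
struct HolderAligned { float scale; Matrix4A16 mat; };

// True when p sits on a 16-byte boundary.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Checks one object in static storage and one on the stack.
inline bool stack_and_static_aligned() {
    static Matrix4A16 in_static_storage;
    Matrix4A16 on_stack;
    return is_aligned16(&in_static_storage) && is_aligned16(&on_stack);
}

// Assumption: sizeof(float) == 4.
static_assert(alignof(Matrix4A16) == 16, "alignment request honoured");
static_assert(offsetof(HolderPlain, mat) == 4, "tightly packed");
static_assert(offsetof(HolderAligned, mat) == 16, "12 bytes of padding");
static_assert(sizeof(HolderPlain) == 68, "4 + 64 bytes");
static_assert(sizeof(HolderAligned) == 80, "16 + 64 bytes");
```

Code that copies a HolderPlain-sized block of bytes and assumes the 
matrix starts at offset 4 would silently break once the member becomes 
Matrix4A16.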


And from the headers:

//---------------------------------------------------------------------------
// Aligned Matrices
//
// This class helps keep matrices 16-byte aligned as preferred by P4 cpus.
// It aligns matrices on the stack and on the heap or in global scope.
// It does this using __declspec(align(16)) which works on VC7 and on VC 6
// with the processor pack. Unfortunately there is no way to detect the
// latter so this is turned on only on VC7. On other compilers this is the
// the same as D3DXMATRIX.
//
// Using this class on a compiler that does not actually do the alignment
// can be dangerous since it will not expose bugs that ignore alignment.
// E.g if an object of this class in inside a struct or class, and some code
// memcopys data in it assuming tight packing. This could break on a compiler
// that eventually start aligning the matrix.
//---------------------------------------------------------------------------
#ifdef __cplusplus
typedef struct _D3DXMATRIXA16 : public D3DXMATRIX
{
    _D3DXMATRIXA16() {}
    _D3DXMATRIXA16( CONST FLOAT * );
    _D3DXMATRIXA16( CONST D3DMATRIX& );
    _D3DXMATRIXA16( CONST D3DXFLOAT16 * );
    _D3DXMATRIXA16( FLOAT _11, FLOAT _12, FLOAT _13, FLOAT _14,
                    FLOAT _21, FLOAT _22, FLOAT _23, FLOAT _24,
                    FLOAT _31, FLOAT _32, FLOAT _33, FLOAT _34,
                    FLOAT _41, FLOAT _42, FLOAT _43, FLOAT _44 );

    // new operators
    void* operator new   ( size_t );
    void* operator new[] ( size_t );

    // delete operators
    void operator delete   ( void* );   // These are NOT virtual; Do not
    void operator delete[] ( void* );   // cast to D3DXMATRIX and delete.
   
    // assignment operators
    _D3DXMATRIXA16& operator = ( CONST D3DXMATRIX& );

} _D3DXMATRIXA16;

#else //!__cplusplus
typedef D3DXMATRIX  _D3DXMATRIXA16;
#endif //!__cplusplus
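
The header only declares the new/delete operators, so how D3DX implements 
them is not shown here. As a rough sketch of one common way to get 
16-byte aligned heap storage from class-level operators, the following 
over-allocates with malloc and stashes the raw pointer just before the 
aligned block. AlignedMatrix, allocate and release are hypothetical names, 
and this is an assumption about the technique, not the actual D3DX code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical matrix type with class-level aligned new/delete.
struct alignas(16) AlignedMatrix {
    float m[4][4];

    static void* operator new(std::size_t size)   { return allocate(size); }
    static void* operator new[](std::size_t size) { return allocate(size); }
    static void operator delete(void* p) noexcept   { release(p); }
    static void operator delete[](void* p) noexcept { release(p); }

private:
    // Over-allocate, round up to a 16-byte boundary, and remember the
    // pointer malloc returned in the slot just before the aligned block.
    static void* allocate(std::size_t size) {
        void* raw = std::malloc(size + 16 + sizeof(void*));
        if (!raw) throw std::bad_alloc();
        std::uintptr_t start =
            reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        std::uintptr_t aligned =
            (start + 15) & ~static_cast<std::uintptr_t>(15);
        reinterpret_cast<void**>(aligned)[-1] = raw;
        return reinterpret_cast<void*>(aligned);
    }

    // Recover the raw pointer and hand it back to free.
    static void release(void* p) noexcept {
        if (p) std::free(static_cast<void**>(p)[-1]);
    }
};
```

Because these operators are not virtual, an object created through them 
has to be deleted through the same type, which is the point of the 
"do not cast to D3DXMATRIX and delete" remark in the header.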


