[ODE] Some SSE in Quick step
devicezero
devicezero at hotpop.com
Tue May 25 13:37:57 MST 2004
IMHO, the D3D matrix code is faster because it is very well-written C++.
The whole D3D math part is good and fast.
Also, miracles can happen if you compile with /Ox, which means FULL
OPTIMIZATION.
Premature code optimizations sometimes don't make sense, because they
can get in the way of the compiler's global optimizations.
DirectX (D3D) has D3DXMATRIXA16, which derives publicly from D3DXMATRIX:
typedef D3DX_ALIGN16 _D3DXMATRIXA16 D3DXMATRIXA16;
where D3DX_ALIGN16 is defined as:
#if _MSC_VER >= 1300 // VC7
#define D3DX_ALIGN16 __declspec(align(16))
#else
#define D3DX_ALIGN16 // Earlier compiler may not understand this, do nothing.
#endif
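
For anyone not on VC7, here is a minimal sketch of the same idea using the standard C++11 alignas keyword instead of __declspec(align(16)). Matrix4A16, is_aligned16 and the other names are hypothetical and are not part of D3DX or ODE. It only shows that the compiler places such a type on a 16-byte boundary on the stack and in static storage, and it assumes a 4-byte float.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for D3DXMATRIXA16: a 4x4 float matrix that the
// compiler is asked to place on a 16-byte boundary.
struct alignas(16) Matrix4A16 {
    float m[4][4];
};

// The alignment requirement is part of the type and is checked at compile time.
static_assert(alignof(Matrix4A16) == 16, "type must require 16-byte alignment");
// Assumes a 4-byte float: 16 floats, no padding.
static_assert(sizeof(Matrix4A16) == 64, "expected 16 tightly packed floats");

// Returns true when the pointer sits on a 16-byte boundary.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Checks one object with static storage and one on the stack.
inline bool stack_and_static_aligned() {
    static Matrix4A16 static_matrix{};
    Matrix4A16 local_matrix{};
    return is_aligned16(&static_matrix) && is_aligned16(&local_matrix);
}

// Convenience wrapper that aborts (in debug builds) if alignment is wrong.
inline void run_alignment_checks() {
    assert(stack_and_static_aligned());
}
```

Calling run_alignment_checks() from your own main() is enough to try it. Heap allocations are a separate matter and are not covered by this sketch.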
and from headers:
//---------------------------------------------------------------------------
// Aligned Matrices
//
// This class helps keep matrices 16-byte aligned as preferred by P4 cpus.
// It aligns matrices on the stack and on the heap or in global scope.
// It does this using __declspec(align(16)) which works on VC7 and on VC 6
// with the processor pack. Unfortunately there is no way to detect the
// latter so this is turned on only on VC7. On other compilers this is the
// the same as D3DXMATRIX.
//
// Using this class on a compiler that does not actually do the alignment
// can be dangerous since it will not expose bugs that ignore alignment.
// E.g if an object of this class in inside a struct or class, and some code
// memcopys data in it assuming tight packing. This could break on a compiler
// that eventually start aligning the matrix.
//---------------------------------------------------------------------------
#ifdef __cplusplus
typedef struct _D3DXMATRIXA16 : public D3DXMATRIX
{
    _D3DXMATRIXA16() {}
    _D3DXMATRIXA16( CONST FLOAT * );
    _D3DXMATRIXA16( CONST D3DMATRIX& );
    _D3DXMATRIXA16( CONST D3DXFLOAT16 * );
    _D3DXMATRIXA16( FLOAT _11, FLOAT _12, FLOAT _13, FLOAT _14,
                    FLOAT _21, FLOAT _22, FLOAT _23, FLOAT _24,
                    FLOAT _31, FLOAT _32, FLOAT _33, FLOAT _34,
                    FLOAT _41, FLOAT _42, FLOAT _43, FLOAT _44 );
    // new operators
    void* operator new ( size_t );
    void* operator new[] ( size_t );
    // delete operators
    void operator delete ( void* );   // These are NOT virtual; Do not
    void operator delete[] ( void* ); // cast to D3DXMATRIX and delete.
    // assignment operators
    _D3DXMATRIXA16& operator = ( CONST D3DXMATRIX& );
} _D3DXMATRIXA16;
#else //!__cplusplus
typedef D3DXMATRIX _D3DXMATRIXA16;
#endif //!__cplusplus
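
The header above only declares the new/delete operators, so how D3DX implements them is not shown here. As a rough sketch of one common way such class-specific operators can hand out 16-byte-aligned heap memory, the code below over-allocates with malloc and stores the raw pointer just in front of the aligned block. AlignedMatrix, alloc16 and free16 are hypothetical names, and this is an assumption about the technique, not the actual D3DX code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical matrix type whose heap allocations are 16-byte aligned.
struct AlignedMatrix {
    float m[4][4];

    static void* operator new(std::size_t size)   { return alloc16(size); }
    static void* operator new[](std::size_t size) { return alloc16(size); }
    static void operator delete(void* p) noexcept   { free16(p); }
    static void operator delete[](void* p) noexcept { free16(p); }

private:
    static void* alloc16(std::size_t size) {
        // Room for the payload, up to 15 bytes of padding, and the raw pointer.
        void* raw = std::malloc(size + 15 + sizeof(void*));
        if (!raw) throw std::bad_alloc();
        std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        std::uintptr_t aligned = (start + 15) & ~static_cast<std::uintptr_t>(15);
        // Remember the malloc pointer in the slot just before the aligned block.
        reinterpret_cast<void**>(aligned)[-1] = raw;
        return reinterpret_cast<void*>(aligned);
    }

    static void free16(void* p) noexcept {
        if (p) std::free(static_cast<void**>(p)[-1]);
    }
};

// Returns true when the pointer sits on a 16-byte boundary.
inline bool is_heap_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Allocates one matrix and a small array, checks both, then frees them.
inline bool heap_matrices_are_aligned() {
    AlignedMatrix* one = new AlignedMatrix();
    AlignedMatrix* many = new AlignedMatrix[3];
    bool ok = is_heap_aligned16(one) && is_heap_aligned16(many);
    delete one;
    delete[] many;
    return ok;
}
```

As the comment in the header warns, these operators are not virtual, so an object allocated this way has to be deleted through its own type and not through a base-class pointer.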