[ODE] Some SSE in Quick step
tbp
ode at ompf.org
Fri May 28 16:06:00 MST 2004
Pierre Terdiman wrote:
> I'm not sure all of this is really worth it.
Indeed: on a balanced architecture (i.e. Opteron), SSE doesn't bring much
over optimal 387 code to begin with. But it also depends on how well
your problem, data, etc. lend themselves to SSE.
> Usually you have to optimize your algorithms first. Then, once it's done,
> you often discover that you're now memory-bound anyway. At this point, SSE
> or no SSE doesn't make a difference anymore, et voilà.
SSE brings more than just vector operations: it also asks for a
constrained layout.
And at that point you turn myriads of random 32-bit accesses into nicely
aligned 128-bit memory accesses with explicit loads/stores (add better code
density on top of that).
That surely helps with memory-boundedness, even if it's no silver bullet
(but then a PC wouldn't be a PC without sucky memory bandwidth).
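
To make the layout point concrete, here is a minimal sketch of what I mean, not code from ODE or the Quick step patch. The vec3x4 type and integrate_blocks() are made-up names for illustration: four bodies are packed per block so that every component sits in its own 16-byte-aligned slot, and each slot moves with one explicit aligned load or store instead of four scattered 32-bit accesses. It assumes a C11 compiler (for _Alignas) and an x86 target with SSE enabled.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_load_ps, _mm_store_ps, ... */

/* Hypothetical structure-of-arrays block: four bodies per block, one
   16-byte-aligned 128-bit slot per component. */
typedef struct {
    _Alignas(16) float x[4];
    _Alignas(16) float y[4];
    _Alignas(16) float z[4];
} vec3x4;

/* pos += vel * dt for n blocks of four bodies each.
   Every memory access is an explicit, aligned 128-bit load or store. */
static void integrate_blocks(vec3x4 *pos, const vec3x4 *vel, float dt, size_t n)
{
    const __m128 vdt = _mm_set1_ps(dt);  /* broadcast dt to all four lanes */

    for (size_t i = 0; i < n; ++i) {
        __m128 px = _mm_load_ps(pos[i].x);
        __m128 py = _mm_load_ps(pos[i].y);
        __m128 pz = _mm_load_ps(pos[i].z);

        px = _mm_add_ps(px, _mm_mul_ps(_mm_load_ps(vel[i].x), vdt));
        py = _mm_add_ps(py, _mm_mul_ps(_mm_load_ps(vel[i].y), vdt));
        pz = _mm_add_ps(pz, _mm_mul_ps(_mm_load_ps(vel[i].z), vdt));

        _mm_store_ps(pos[i].x, px);
        _mm_store_ps(pos[i].y, py);
        _mm_store_ps(pos[i].z, pz);
    }
}
```

Note that _mm_load_ps and _mm_store_ps require 16-byte-aligned addresses, which is exactly the constraint on layout mentioned above: the alignment has to be designed into the data structure, it doesn't come for free.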