[ODE] Record and Playback
Marty Rabens
marty at rabens.com
Tue Jan 27 12:57:32 MST 2004
> Why? All Intel CPUs and clones need to have 100% same behaviour, down
> to the roundoff of the floating point. RTS games typically only send
> user commands between computers, and rely on everything happening in
> lockstep.
Ahh, if only that were true. As a game developer, I absolutely love the
idea of fixed interval updates and reproducible behavior. The benefits
are numerous. Gameplay journaling and playback is a big one. It allows
cool features like ghost racing, instant replay, and downloading other
players' replays from the internet. And it's invaluable when working
with your QA team to reproduce bugs (have the beta build set to always
record). Like you mentioned, it's also critical for network games so
you only need to send player commands between machines.
Unfortunately, different Intel CPUs and compatibles do NOT always give
you identical results with floating point operations. I learned this
the hard way when I relied on it for a 3D RTS. I assumed fp operations
would always give identical results, and most of the time they did. But
very occasionally there would be slightly different results. I tracked
down a specific case I could reproduce in a test. The "test" was a
simple C app that multiplied two float constants together. The last
couple bits of the result were different on an Athlon 850 and an Athlon
1000 (I wish I could find the app now so I knew what the constants were
- I'm hoping I still have a copy somewhere). Anyway, we were in serious
crunch mode to ship the RTS, so I hacked in a "fix" that periodically
truncated the various float values of all entities to a certain
precision. This prevented the fp differences from accumulating so the
networked machines didn't get out of sync. It was really, really ugly,
but it worked well enough for the publisher to go ahead and ship.
Out-of-sync errors still happened about once every 10 hours of network
play (which I wasn't happy with, but we had to ship).
I've done a little research, and it turns out that fp operations on
different CPUs are IEEE compliant, but the IEEE standard allows CPUs to
differ in the precision of their internal representation. They're
required to carry at least a certain number of bits internally, but
many (or most) CPUs carry more than the minimum, and how MANY more
varies from chip to chip. The upshot of all this is that you can
occasionally get slightly different results on fp operations across
different CPUs.
So what's the solution to all this? I've done some net searching, and
haven't come up with anything conclusive. Different people have
different answers, many of which contradict each other (e.g., some
people say setting the fp precision via processor instructions solves
the problem, others say it doesn't). I've seen it suggested that you
simply can't use floating point math on a PC if you want reproducible
behavior (see
http://www.gamasutra.com/features/20010713/dickinson_03.htm ).
Someday, I intend to consolidate and condense all the information, do
some thorough testing on proposed solutions, see if I can get some
straight answers on how various commercial games deal with it, and
publish it on a web page. If anyone has any pertinent info, let me
know.
I'm currently using ODE in a 3D mini-golf game we're developing. It
will have turn-based network play. My plan is to hope everything plays
out the same on all players' machines, but send the entire world state
to all machines after each player's stroke to sync up any discrepancies.
Sorry for the long, semi-off-topic rant, but this issue has bugged me
for a long time, and I've never had a clear solution.
Marty Rabens