Reproducing Box2D simulations is hard

Requirement

In Neon Made I let the user build whatever they want and then click a button to see what happens. An important requirement here is that if the user builds something and clicks “play” many times, the game needs to do exactly the same thing every time. Any divergence is unacceptable, because users might want to modify whatever they built in a somewhat controlled way.

What already worked for me

My physics simulation runs on Liquidfun, which is based on Box2D, and which can reproduce results exactly if the exact same input is recreated. At least on the same device. I haven’t tested how reproducibility fares across different browsers or platforms, but since I don’t expect people to switch back and forth while playing, that isn’t as big a problem.

It’s a pretty fragile concept, however. Even tiny changes in the input quickly yield completely different results. The key points that, for me, get Box2D to reproduce simulations as closely as possible are:

  • When the user clicks “play” I save the state and directly reload it. My loader code reorders objects slightly compared to the order in which they were placed, so saving and reloading means every run goes through the same ordering. A quick and dirty solution, but it works, and I want to save the game on play anyway.
  • Always use a new b2World for each play. Free the old one.
  • Standard things like using a fixed time step and making all gameplay code use only that game time (a minimal sketch of such a loop follows right after this list).
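
For the last point, the loop looks roughly like the sketch below. This is not my actual game code: world stands for the current b2World, the exact Step signature depends on your Box2D/Liquidfun binding, and 8 velocity / 3 position iterations are just the values the Box2D manual suggests.

// A minimal fixed-time-step loop (a sketch, not my actual game code).
var TIME_STEP = 1 / 60;
var accumulator = 0;
var lastTime = performance.now();

function tick(now) {
	accumulator += (now - lastTime) / 1000;
	lastTime = now;
	while (accumulator >= TIME_STEP) {
		// Gameplay code advances by exactly TIME_STEP here, never by wall-clock time.
		world.Step(TIME_STEP, 8, 3);
		accumulator -= TIME_STEP;
	}
	requestAnimationFrame(tick);
}
requestAnimationFrame(tick);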

How I broke it

The data I store about the constructions players are building can get big. It’s JSON and it ends up in localStorage. A big construction can take up as much as 1 megabyte. A few weeks ago I was implementing some compression for it (which was a nice success, cutting the size down to 30% of the original), and I had the stupid idea that I might be able to save even more space by cutting away part of my floating point numbers. They’re quite ugly; often they look like this:

23.476537456365

Obviously I do not need that much precision. 23.4765 would be totally fine for what I do, so all those extra digits just eat storage space. A quick bit of code later I had this kind of “optimization”:

// using underscore.js

// Recursively walks a value and rounds every number it finds to the given
// amount of decimal places. Arrays are mapped to new arrays, plain objects
// are modified in place.
var reduceNumbersAccuracy = function(o, decimals) {
	if (_.isNumber(o)) {
		return Number(o.toFixed(decimals));
	} else if (_.isArray(o)) {
		return o.map(function(entry) {
			return reduceNumbersAccuracy(entry, decimals);
		});
	} else if (_.isObject(o)) {
		Object.keys(o).forEach(function(k) {
			o[k] = reduceNumbersAccuracy(o[k], decimals);
		});
		return o;
	} else {
		return o;
	}
};

For example, Number(23.476537456365.toFixed(5)) yields 23.47654, so the resulting objects were indeed smaller.
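
Applied to a small save object before it goes into localStorage, it looks roughly like this (a made-up, much simplified shape; my real data is a lot bigger):

// Hypothetical, simplified save object just for illustration.
var save = {
	position: { x: 23.476537456365, y: 0.000123456789 },
	angles: [1.2345678901, 3.1415926535]
};
JSON.stringify(reduceNumbersAccuracy(save, 5));
// '{"position":{"x":23.47654,"y":0.00012},"angles":[1.23457,3.14159]}'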

Great, this saved me 20% on my already compressed data.

… Except that this completely breaks reproducibility.

The main issue here is that I am using Emscripten to get Liquidfun into the browser. Liquidfun, being basically Box2D, uses float32 internally, and Emscripten seems to honor this.
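
You can see what float32 storage does to a regular JavaScript number right in the console (Math.fround gives the same result as storing the value in a Float32Array):

var buf = new Float32Array(1);
buf[0] = 23.476537456365;                  // stored as the nearest 32-bit float
buf[0] === 23.476537456365;                // false, some precision is lost
buf[0] === Math.fround(23.476537456365);   // true, Math.fround mimics that cast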

So this is what happens to each number X with my optimization in place:

  1. X is created by the user as a number inside Box2D, for example the position of a b2Body.
  2. X is saved by my code. That means Number(X.toFixed(5)), which is a JavaScript 64-bit Number, is stored.
  3. X is loaded again. Or rather, what I saved is loaded, which is Number(X.toFixed(5)). JavaScript takes that 64-bit Number and throws it at Box2D, and Box2D puts it into a Float32Array, casting it to float32 in the process.
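
Spelled out for a single value, with Math.fround standing in for Box2D’s float32 storage, the round trip looks like this:

// 1. The value as it lives inside Box2D, i.e. as a 32-bit float.
var inBox2d = Math.fround(23.476537456365);
// 2. What my save code writes out: rounded to 5 decimals, stored as a 64-bit Number.
var stored = Number(inBox2d.toFixed(5));
// 3. What Box2D ends up with after loading: the stored Number cast back to float32.
var reloaded = Math.fround(stored);
// For many values, reloaded is no longer equal to inBox2d.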

Really bad. It turns out a lot of numbers are not stable this way. You can find them like this in the console:

var failCnt = 0;
var testCnt = 10000;
for (var i = 0; i < testCnt; i++) {
	// Math.fround makes initialNumber a value that is a valid 32-bit float,
	// just like Box2D does with all of its values.
	var initialNumber = Math.fround(Math.random() * 42);
	var savedNumber = Number(initialNumber.toFixed(5));
	var loadedNumber = Math.fround(savedNumber);
	if (loadedNumber !== initialNumber) {
		failCnt++;
		console.log("A problem number is: " + initialNumber);
	}
}
console.log("Overall it seems ~" + ((failCnt / testCnt) * 100) + "% of all numbers are a problem");

This spams your console with ~8100 numbers, as around 81% of them seem to be unstable. At least that is what I get right now in Chrome.

For my simulation this meant random changes in the outcome. Interestingly enough, it sometimes stabilized, so it seems repeating this process a few times can yield stable numbers that get through it unchanged (a sketch of that idea follows below). But for now I’ve decided to put the whole idea to rest and removed the code.
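
For the curious, here is a rough sketch of what that repeated round trip could look like; this stabilizeNumber helper is hypothetical and not something I actually use:

// Repeats the round-and-cast round trip until the value stops changing
// (with a small iteration cap as a safety net).
var stabilizeNumber = function(x, decimals) {
	var value = Math.fround(x);
	for (var i = 0; i < 10; i++) {
		var roundTripped = Math.fround(Number(value.toFixed(decimals)));
		if (roundTripped === value) {
			return value; // stable: survives save and load unchanged
		}
		value = roundTripped;
	}
	return value; // give up after a few tries
};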

Removing floating point accuracy wasn’t the greatest idea I’ve had, I guess. In future posts I’ll write about ideas that worked out better :)

Written on May 4, 2016