Flotsam: Insanely Fast Floating-Point Number Serialization for Java, Javascript

[+] bhouston|12 years ago|reply

Sort of similar to this base64 encoder I wrote to speed up ThreeJS float data streams. This seems more complex than what I did, though.

I had good performance gains versus JSON, as evidenced by this popular jsPerf:

http://jsperf.com/json-vs-base64
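
For readers who want to see the general idea, here is a minimal sketch of packing float data as base64. It is not the encoder linked above: the function names `floatsToBase64` and `base64ToFloats` are hypothetical, it assumes Node.js (it relies on the built-in `Buffer`), and it writes bytes in the platform's native byte order.

```javascript
// Sketch only: pack a Float32Array into a base64 string and back.
// Assumes Node.js; function names are made up for illustration.
function floatsToBase64(floats) {
  // Wrap the existing bytes (no copy), then base64-encode them.
  return Buffer.from(floats.buffer, floats.byteOffset, floats.byteLength)
    .toString('base64');
}

function base64ToFloats(text) {
  const bytes = Buffer.from(text, 'base64');
  // Copy into a fresh ArrayBuffer so the float view starts at offset 0.
  const copy = new Uint8Array(bytes);
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}

const packed = floatsToBase64(new Float32Array([1.5, -2.25, 0, 1024]));
const unpacked = base64ToFloats(packed);
```

Four 32-bit floats are 16 bytes, which base64 turns into 24 characters including padding.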

[+] StefanKarpinski|12 years ago|reply

This is a problem that only "walled garden" languages like Java and JavaScript would ever even have in the first place. If you're not trapped in a gilded cage, you just print raw bytes directly to a socket and then read data back the same way. The only concern is byte ordering, which is easy.

Regarding this specific approach, "only 20% overhead" sounds pretty good, but base 64 encoding has "only 33% overhead", is completely general, and likely already has faster implementations than this.
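
As a rough illustration of the "raw bytes plus byte ordering" point, the sketch below writes doubles into a byte array with an explicit endianness using the standard `DataView` API. The helper names `writeDoubles` and `readDoubles` are hypothetical, and sending the bytes over a socket is left out.

```javascript
// Sketch only: serialize doubles as raw bytes with an explicit byte order.
function writeDoubles(values, littleEndian = true) {
  const view = new DataView(new ArrayBuffer(values.length * 8));
  values.forEach((v, i) => view.setFloat64(i * 8, v, littleEndian));
  return new Uint8Array(view.buffer);
}

function readDoubles(bytes, littleEndian = true) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = [];
  for (let i = 0; i < bytes.byteLength; i += 8) {
    out.push(view.getFloat64(i, littleEndian));
  }
  return out;
}

// Writer and reader only need to agree on the flag.
const wire = writeDoubles([1, Math.PI, -0.5], false); // big-endian
const back = readDoubles(wire, false);
```

Each double costs exactly 8 bytes this way, so there is no encoding overhead as long as the transport is binary-safe.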

[+] starmole|12 years ago|reply

Hah, I did almost the same thing recently! Precision and rounding were not that important though, so I just went with log2 to get the exponent, then normalized and cast to int to get the mantissa. Then I realized I could do it properly by just using a typed array :)
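
A minimal sketch of the typed-array trick mentioned here, assuming an engine with `BigUint64Array` (ES2020, so newer than this thread). Two views over one buffer expose the exact IEEE 754 bits of a double, so no log2 or rounding is involved. The function name `decompose` is hypothetical.

```javascript
// Sketch only: read the exact sign, exponent and mantissa bits of a double
// by viewing the same 8 bytes as a float and as a 64-bit unsigned integer.
function decompose(x) {
  const f64 = new Float64Array(1);
  const u64 = new BigUint64Array(f64.buffer); // shares memory with f64
  f64[0] = x;
  const bits = u64[0];
  return {
    sign: Number(bits >> 63n),                // 1 bit
    exponent: Number((bits >> 52n) & 0x7ffn), // 11 bits, biased by 1023
    mantissa: bits & ((1n << 52n) - 1n),      // 52 bits, kept as a BigInt
  };
}

const parts = decompose(1.5);
// 1.5 = 1.1 (binary) * 2^0, so the biased exponent is 1023
// and only the top mantissa bit is set.
```

Because both views use the platform's native byte order, reinterpreting the full 64 bits this way does not depend on endianness.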

[+] unknown|12 years ago|reply

[deleted]

[+] spencertipping|12 years ago|reply

That's a definite possibility. We went with base-94 because it avoids UTF-8 overhead in situations where the browser will treat your data as a string (e.g. no typed-array support). We also wanted copy/paste to work, and for Flotsam data to be embeddable into JSON with minimal expansion.

[+] frugalfirbolg|12 years ago|reply

This could open up some nice possibilities for browser based and Node.js driven distributed computing.

[+] aa0|12 years ago|reply

Could you elaborate? I don't see the application, JS is still ghastly slow in comparison to C.

[+] derricki|12 years ago|reply

Do the performance gains depend on how many digits before or after the decimal point there are?

[+] spencertipping|12 years ago|reply

No, this encoding uses a bitwise encoding for each float, so they are fixed at 10 characters each and encoded with constant speed (except possibly for subnormal numbers, which I've heard are slower in some environments).
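
To see why 10 characters per double is enough: 94^10 is about 5.4e19, which exceeds 2^64 (about 1.8e19), while 94^9 does not. The sketch below is a fixed-width base-94 encoding written under assumptions and is not Flotsam's actual format: the alphabet (ASCII 33 to 126) and the digit order are made up for illustration, the names `encodeDouble` and `decodeDouble` are hypothetical, and it uses `BigInt` for clarity rather than speed. Note that this alphabet includes `"` and `\`, which would need escaping inside JSON.

```javascript
// Sketch only: encode the 64 bits of a double as exactly 10 base-94 digits.
// Alphabet and digit order are assumptions, not Flotsam's real layout.
const f64 = new Float64Array(1);
const u64 = new BigUint64Array(f64.buffer); // same 8 bytes as f64

function encodeDouble(x) {
  f64[0] = x;
  let bits = u64[0];
  let out = '';
  for (let i = 0; i < 10; i++) {
    // Least significant digit is produced first, so prepend.
    out = String.fromCharCode(33 + Number(bits % 94n)) + out;
    bits /= 94n;
  }
  return out;
}

function decodeDouble(text) {
  let bits = 0n;
  for (const ch of text) {
    bits = bits * 94n + BigInt(ch.charCodeAt(0) - 33);
  }
  u64[0] = bits;
  return f64[0];
}

const encoded = encodeDouble(Math.PI); // always 10 characters
const decoded = decodeDouble(encoded); // bit-exact round trip
```

The loop always runs 10 times regardless of the value, which matches the point that the cost does not depend on how many decimal digits the number has.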
