Flatbuffers by Google – CapnProto alternative

[+] kentonv|11 years ago|reply

Detailed (though perhaps biased) comparison I wrote up a few months ago:

http://kentonv.github.io/capnproto/news/2014-06-17-capnproto...

[+] tbastos|11 years ago|reply

Even though capnproto would be our first choice, the lack of support for Windows/CMake is kind of a party killer. FlatBuffers doesn't offer everything we need either, but its codebase is simpler to grasp and hack, so it may end up being the safer choice... which is unfortunate

[+] halayli|11 years ago|reply

I found it interesting that they didn't benchmark it against capnproto (http://google.github.io/flatbuffers/md__benchmarks.html)

[+] kevinbowman|11 years ago|reply

Wow: "For applications on Google Play that integrate this tool, usage is tracked" without even an option to disable that. Sure, it's open source, so it can be changed by editing the source code, but does anyone else find that kinda creepy?

e.g. if I make an app using 10 FOSS libraries, then I wouldn't want my app reporting to 10 different places everything which the user is doing.

Also, on the actual homepage for it (http://google.github.io/flatbuffers), the only mention of this call-home feature is buried at the bottom of the "building" page.

[EDIT] This is incorrect; see the comments below. The tracking is not a call-home feature, just Google scanning apps submitted to the Play Store.

[+] eridius|11 years ago|reply

I'm a bit confused. Is it actually calling home? That seems kind of unlikely, given that a) that seems pretty egregious for a library like this, and b) they said this doesn't affect the application at all beyond consuming "a few bytes".

Could it instead be just that Google scans Google Play apps for a string in the binary that matches the Flatbuffers version string format? That seems more likely given what the README does say about this. And it also seems more useful in general; Google would benefit more from knowing how many applications use the library than knowing how popular these applications are.
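
For illustration, a scan like that could be as simple as a regex over the raw bytes of the binary. This is only a sketch of the idea: the pattern below (the literal "FlatBuffers " followed by a dotted version number) is an assumed format, and `find_version_strings` is a hypothetical helper, not anything Google has documented.

```python
import re

# Assumed format of the embedded version string; the real one may differ.
VERSION_PATTERN = re.compile(rb"FlatBuffers \d+\.\d+\.\d+")


def find_version_strings(binary):
    """Return every substring of the binary that looks like a version marker."""
    return [m.group(0).decode("ascii") for m in VERSION_PATTERN.finditer(binary)]


# Made-up bytes standing in for a compiled library inside an APK.
sample = b"\x7fELF\x00\x01junk\x00FlatBuffers 1.0.0\x00more\x00"
print(find_version_strings(sample))
```

Nothing here runs on the user's device; the scanner only needs the uploaded file.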

[+] on_and_off|11 years ago|reply

I don't see an issue with adding a string letting the Play Store scanner know that we use this lib. It is reporting who the users of this lib are (the apps that implement it), and not informing on the app users themselves.

It seems like a very good way to gauge interest in this lib on Android in order to decide how many resources to allocate to its development.

[+] sandGorgon|11 years ago|reply

Again, taking this from a previous conversation on this topic - https://news.ycombinator.com/item?id=7904443 - it seems CapnProto and Flatbuffers are much faster in C++, Go and Rust... the benchmarks may be very different in Javascript, Python, Ruby, etc.

It would be really interesting (and possibly more relevant for HN) to have benchmarks based on one dynamic language - say Python.

Oh and @kentonv - I'm not a native American English speaker (rest of the world really). I really, really have trouble pronouncing Capn'Proto. Even more difficult to pronounce it in a meeting and have people recall/Google it.

[+] kentonv|11 years ago|reply

To be clear, the thing that you'd think would be a problem in dynamic languages -- lack of pointer arithmetic -- actually isn't a problem. Every language has a way to extract values from a byte string, e.g. the `struct` module in Python, TypedArrays in Javascript, ByteBuffer in Java, etc.
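
As a rough sketch of what that looks like in Python, `struct.unpack_from` reads a value at a byte offset without copying or parsing the rest of the buffer. The 12-byte record layout and the `read_id` / `read_score` helpers are made up for illustration; this is not the Cap'n Proto or FlatBuffers wire format.

```python
import struct

# Hypothetical record: little-endian uint32 id at offset 0, float64 score at offset 4.
buf = struct.pack("<Id", 7, 2.5)


def read_id(data, base=0):
    """Read the uint32 field directly from the buffer."""
    return struct.unpack_from("<I", data, base)[0]


def read_score(data, base=0):
    """Read the float64 field, which starts 4 bytes into the record."""
    return struct.unpack_from("<d", data, base + 4)[0]


print(read_id(buf), read_score(buf))
```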

The real problem in dynamic languages is that they tend to be worse at inlining accessor functions. This is not really because inlining is impossible -- v8 can do it -- but because most dynamic languages don't prioritize performance in the first place and so haven't implemented such optimizations. This is actually a problem in Go as well, weirdly. Because of this, if you actually intend to consume most of the content of a message, it may make sense to parse it into a language-native data structure up front so that access doesn't need to go through accessor functions. Most Cap'n Proto implementations support this. Doing this will still be much faster than using Protobufs because the Cap'n Proto format is naturally faster to decode.
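
A minimal sketch of the two access styles, assuming a made-up fixed layout (uint32 id, uint16 x, uint16 y); `LazyPoint` and `parse_eagerly` are hypothetical names, not part of any Cap'n Proto implementation:

```python
import struct

# Hypothetical layout: little-endian uint32 id, uint16 x, uint16 y (8 bytes).
RECORD = struct.Struct("<IHH")


class LazyPoint:
    """Accessor style: every attribute read goes back to the buffer."""

    def __init__(self, data, offset=0):
        self._data = data
        self._offset = offset

    @property
    def id(self):
        return struct.unpack_from("<I", self._data, self._offset)[0]

    @property
    def x(self):
        return struct.unpack_from("<H", self._data, self._offset + 4)[0]

    @property
    def y(self):
        return struct.unpack_from("<H", self._data, self._offset + 6)[0]


def parse_eagerly(data, offset=0):
    """Up-front style: decode all fields once into a native dict."""
    ident, x, y = RECORD.unpack_from(data, offset)
    return {"id": ident, "x": x, "y": y}


buf = RECORD.pack(1, 10, 20)
lazy = LazyPoint(buf)
eager = parse_eagerly(buf)
```

The lazy version pays a function call per field read, which is cheap if you touch one field and expensive if you loop over all of them; the eager version pays once and then uses plain dict lookups.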

As David says, "Cap'n" should be pronounced like "happen", though pronouncing it as "captain" is OK as well (and will still get people to the right place if they Google it).

[+] dwrensha|11 years ago|reply

For what it's worth, I pronounce "Cap'n" to rhyme with "happen", and sometimes I fall back to saying "Captain Proto".

[+] userbinator|11 years ago|reply

I looked at their implementation at http://google.github.io/flatbuffers/md__internals.html and found this rather confusing paragraph:

Strings are simply a vector of bytes, and are always null-terminated. Vectors are stored as contiguous aligned scalar elements prefixed by a 32bit element count (not including any null termination).

So... does the count include the null terminator byte or not?

[+] ultimape|11 years ago|reply

I think the first use of the term 'vector' is more conceptual - but is actually defining a string type that is implemented using a C-style string strategy. The second mention, "Vectors", is a more direct reference to the C++/Java Vector class and its implementation.

Technically speaking, you can implement a C-style string using a std::vector by ignoring the length preamble and ensuring room is made for the null character. I got away with this in my intro to C++ class after showing the teacher that I already knew how to implement strings in C from a previous class.

C-style strings: http://www.learncpp.com/cpp-tutorial/66-c-style-strings/

C++ Vector class: http://en.cppreference.com/w/cpp/container/vector

Java Vector Class: http://docs.oracle.com/javase/7/docs/api/java/util/Vector.ht...

[+] zeroxfe|11 years ago|reply

That sounds like an unambiguous no to me (null-terminator not included in count.)
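
Under that reading, a string would be laid out roughly like this: a 32-bit little-endian count of the payload bytes, the bytes themselves, then one null byte that the count does not cover. This sketch only follows the quoted paragraph; it leaves out the alignment padding a real FlatBuffers buffer would have, and `encode_string` / `decode_string` are hypothetical helpers, not the library's API.

```python
import struct


def encode_string(text):
    """Length prefix (payload bytes only) + payload + null terminator."""
    payload = text.encode("utf-8")
    return struct.pack("<I", len(payload)) + payload + b"\x00"


def decode_string(data, offset=0):
    """Read the count, slice out exactly that many bytes, check the terminator."""
    (count,) = struct.unpack_from("<I", data, offset)
    start = offset + 4
    if data[start + count] != 0:
        raise ValueError("missing null terminator")
    return data[start:start + count].decode("utf-8")


encoded = encode_string("hi")
print(encoded, decode_string(encoded))
```

So "hi" takes 7 bytes in this sketch: 4 for the count (which says 2), 2 for the characters, and 1 for the terminator.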

[+] jhallenworld|11 years ago|reply

I used the C preprocessor as the schema compiler in my serialization library: https://github.com/jhallen/joes-sandbox/tree/master/lib/sdu

[+] ultimape|11 years ago|reply

Cool. I'd love to know more about what makes your system awesome - it is a very creative idea! Have you thought about creating a DTD out of the schema, or vice versa? Having a DTD to validate the file against would allow for some serious robustness in hot-loading stuff from the web.

I think the big draw for the flatbuffer system is that it can stream data in with a low memory footprint.

[+] desdiv|11 years ago|reply

Related discussion 130 days ago: https://news.ycombinator.com/item?id=7901991

[+] WhitneyLand|11 years ago|reply

So when did it become acceptable to leave tracking code turned on by default (opt out) in open source repos?

[+] maxerickson|11 years ago|reply

Are there rules attached to the plurality of (so called) open source licenses, or even OSI approved licenses? Not really.

Are you finding this particular tracking code acceptable? Apparently not.

So there was never any coherent whole that could have found something unacceptable to begin with and in the end there are still disparate parts that continue to find it unacceptable.

I guess this is an obvious and tiresome answer, but I'm not sure what else you would expect anyone to say.

[+] kevb|11 years ago|reply

It's a version string, which as far as I know, has been acceptable since the beginning of open source. They just let us know that Google Play scans APKs for that string. I imagine Google Play also scans for other libraries, open source or otherwise.

https://github.com/google/flatbuffers/blob/master/include/fl...

31 comments