cranx|1 month ago
I find the title a bit misleading. I think it should be titled "It's Faster to Copy Memory Directly than Send a Protobuf." Put that way, it seems rather obvious that removing a serialization and deserialization step reduces runtime.
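A minimal sketch of what "copying memory directly" means for a plain-old-data struct, assuming both sides agree exactly on layout (same compiler, same ABI, same endianness). The `Order` type and function names here are made up for illustration:

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t id;
    double   price;
} Order;

/* "Serialize": just copy the struct's bytes into a wire buffer.
 * There is no field-by-field encoding pass. */
size_t order_write(const Order *o, unsigned char *buf) {
    memcpy(buf, o, sizeof *o);
    return sizeof *o;
}

/* "Deserialize": copy the bytes straight back out. */
void order_read(Order *o, const unsigned char *buf) {
    memcpy(o, buf, sizeof *o);
}
```

The catch, of course, is that this only works when sender and receiver share the exact struct layout; the rest of the thread is about when that assumption holds.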
bluGill|1 month ago
Protobuf also handles a bunch of languages for you. If the other team wants to write in a "stupid language", you don't have to have a political fight to prove your preferred language is best for everything. You just let that team do what they want, and they can learn the hard way that it was a bad language. Either it isn't really that bad, so the fight was pointless, or it is, and management can find other metrics to prove it, at which point it becomes their problem to decide whether it is bad enough to be worth fixing.
dietr1ch|1 month ago
Protobuf is likely really close to optimally fast for what it is designed to be, and the flaws and performance losses left are most likely all in the design space, which is why alternatives are a dime a dozen.
infogulch|1 month ago
> Protobuf performs up to 6 times faster than JSON. - https://auth0.com/blog/beating-json-performance-with-protobu... (2017)
That's 30x faster than JSON just by switching to a zero-copy data format that's suitable for both in-memory use and the network. JSON services spend 20-90% of their compute on serde; a zero-copy data format would essentially eliminate it.
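The gist of the zero-copy idea (greatly simplified from what formats like FlatBuffers or Cap'n Proto actually do): fields live at known offsets in the received buffer and are read in place, so there is no decode pass. The layout below (a u32 id at offset 0, a u32 count at offset 4, little-endian) is a made-up example, not any real wire format:

```c
#include <stdint.h>
#include <string.h>

static uint32_t read_u32_le(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* assumes a little-endian host for brevity */
    return v;
}

/* Accessors "parse" by pointer arithmetic only -- the buffer the
 * network handed us IS the in-memory representation. */
static uint32_t msg_id(const unsigned char *buf)    { return read_u32_le(buf + 0); }
static uint32_t msg_count(const unsigned char *buf) { return read_u32_le(buf + 4); }
```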
cmrdporcupine|1 month ago
Going around doing this kind of thing because "it's only 5x slower" is a bad assumption to make.
miroljub|1 month ago
Just doing memcpy or mmap would be even faster. But the same Rust advocates bragging about Rust speed frown upon such insecure practices in C/C++.
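A hedged sketch of the mmap approach this comment alludes to: write a plain-old-data record to a file once, then map it and read the fields in place with no deserialization. POSIX-only, and error handling is trimmed; the `Record` type and `record_map` name are invented for the example:

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct { uint32_t id; uint32_t len; } Record;

/* Map the file read-only and return a pointer to the record inside it.
 * No bytes are copied or decoded; the page cache is the "parse". */
const Record *record_map(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;
    void *p = mmap(NULL, sizeof(Record), PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                     /* the mapping stays valid after close */
    return p == MAP_FAILED ? NULL : (const Record *)p;
}
```

This is exactly the practice the comment calls insecure: the bytes on disk are trusted to be a valid `Record` with no validation step.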
infogulch|1 month ago
mrlongroots|1 month ago
And the reason is ABI compatibility. Reasoning about ABI compatibility across different C++ versions and optimization levels and architectures can be a nightmare, let alone different programming languages.
The reason it works at all for Arrow is that the leaf levels of the data model are large contiguous columnar arrays, so reconstructing the higher layers still gets you a lot of value. The other domains where it works are tensors/DLPack and scientific arrays (Zarr etc). For arbitrary struct layouts across languages/architectures/versions, serdes is way more reliable than a universal ABI.
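A toy illustration of why columnar data shares well across a raw-memory boundary: the "column" is one contiguous array, so the receiving side only needs a pointer and a length, with no per-element structure to re-parse. This is a sketch of the principle, not Arrow's actual C data interface, and it assumes matching endianness between the two sides:

```c
#include <stddef.h>
#include <stdint.h>

/* Consume a column that arrived as raw bytes: reinterpret and iterate.
 * The whole "deserialization" is one pointer cast. */
int64_t column_sum(const unsigned char *bytes, size_t n_values) {
    const int32_t *col = (const int32_t *)bytes;
    int64_t sum = 0;
    for (size_t i = 0; i < n_values; i++) sum += col[i];
    return sum;
}
```

Contrast this with an arbitrary struct graph, where pointers, padding, and per-language layouts mean the bytes can't simply be reinterpreted on the other side.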