chaokunyang | 4 months ago

The Rust benchmarks in Fory are intended more as end‑to‑end benchmarks for typical OOP‑style application scenarios, not just raw buffer write speed.

Protobuf is very much a DOP (data‑oriented programming) approach — which is great for some systems. But in many complex applications, especially those using polymorphism, teams don’t want to couple Protobuf‑generated message structs directly into their domain models. Generated types are harder to extend, and if you embed them everywhere (fields, parameters, return types), switching to another serialization framework later becomes almost impossible without touching huge parts of the codebase.

In large systems, it’s common to define independent domain model structs used throughout the codebase, and only convert to/from the Protobuf messages at the serialization boundary. That conversion step is exactly what’s represented in our benchmarks — because it’s what happens in many real deployments.
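That boundary conversion can be sketched in Rust. The `proto` module below is a hand-written stand-in for code that would normally be generated from a `.proto` file (e.g. by prost); the `User`/`UserMsg` names are illustrative, not from any real schema:

```rust
// Stand-in for protobuf-generated code (normally produced from a .proto file).
mod proto {
    #[derive(Debug, Default, PartialEq)]
    pub struct UserMsg {
        pub id: u64,
        pub name: String,
    }
}

// Independent domain model used throughout the application;
// no generated types leak into fields, parameters, or return types.
#[derive(Debug, PartialEq, Clone)]
pub struct User {
    pub id: u64,
    pub name: String,
}

// Conversions live only at the serialization boundary.
impl From<&User> for proto::UserMsg {
    fn from(u: &User) -> Self {
        proto::UserMsg { id: u.id, name: u.name.clone() }
    }
}

impl From<proto::UserMsg> for User {
    fn from(m: proto::UserMsg) -> Self {
        User { id: m.id, name: m.name }
    }
}

fn main() {
    let user = User { id: 1, name: "alice".into() };
    let msg = proto::UserMsg::from(&user); // domain -> wire type
    let back = User::from(msg);            // wire type -> domain
    assert_eq!(user, back);
}
```

Swapping serialization frameworks later then only touches these `From` impls, not the rest of the codebase.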

There’s also the type‑system gap: for example, if your Rust struct has a Box<dyn Trait> field, representing that cleanly in Protobuf is tricky. You might fall back to a oneof, but that essentially generates an enum variant, which often isn’t what users actually want for polymorphic behavior.
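A minimal sketch of that gap, with illustrative `Shape`/`ShapeMsg` types (not from any real schema): the domain side is open to new implementors, while the `oneof` side maps to a closed enum that must enumerate every case.

```rust
// Domain side: open polymorphism via a trait object.
trait Shape {
    fn area(&self) -> f64;
}

struct Circle { r: f64 }
struct Square { side: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.r * self.r }
}
impl Shape for Square {
    fn area(&self) -> f64 { self.side * self.side }
}

struct Drawing {
    shape: Box<dyn Shape>, // any implementor, including ones added later
}

// Roughly what a protobuf `oneof` generates: a closed enum of known variants.
enum ShapeMsg {
    Circle { r: f64 },
    Square { side: f64 },
}

// Decoding the closed enum into the open trait object is straightforward,
// but every new Shape impl forces a new variant and a .proto change;
// the reverse direction needs downcasting or an explicit tag.
fn from_msg(m: ShapeMsg) -> Box<dyn Shape> {
    match m {
        ShapeMsg::Circle { r } => Box::new(Circle { r }),
        ShapeMsg::Square { side } => Box::new(Square { side }),
    }
}

fn main() {
    let d = Drawing { shape: from_msg(ShapeMsg::Circle { r: 1.0 }) };
    assert!((d.shape.area() - std::f64::consts::PI).abs() < 1e-9);
}
```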

So yes: we include the conversion in our measurements intentionally, to reflect real‑world practice in large systems.

no_circuit | 4 months ago

Yes, I agree that protos should usually be used only at the serialization boundary, and also with the slightly off‑topic idea that the generated code should be kept private to the package and/or binary.

So, to reflect real‑world practice, shouldn't the benchmark code allocate and hand the protobuf serializer an 8K Vec up front, as tonic does, rather than an empty one that may require multiple reallocations?
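The difference can be sketched as follows; `encode_into` is a stand-in for a real protobuf encoder (e.g. prost's `Message::encode`, which appends into a provided buffer), and the 8 KiB figure mirrors the pre-sized buffer mentioned above:

```rust
// Stand-in for a protobuf encoder that appends into a caller-provided buffer.
fn encode_into(buf: &mut Vec<u8>, payload: &[u8]) {
    buf.extend_from_slice(payload);
}

fn main() {
    let payload = vec![0u8; 4096];

    // Empty Vec: capacity grows geometrically, so encoding may trigger
    // several reallocations and copies before the payload fits.
    let mut cold = Vec::new();
    encode_into(&mut cold, &payload);

    // Pre-sized Vec, as tonic does: one up-front allocation, and no
    // reallocation during encode as long as the message fits.
    let mut warm: Vec<u8> = Vec::with_capacity(8 * 1024);
    let cap_before = warm.capacity();
    encode_into(&mut warm, &payload);
    assert_eq!(warm.capacity(), cap_before); // buffer never grew
    assert_eq!(warm.len(), payload.len());
}
```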

chaokunyang | 4 months ago

Yes, you are right. I missed this when I wrote the benchmark code. I will update it and share the results here.