dpeckett|1 year ago

To be honest I kind of find myself drifting away from gRPC/protobuf in my recent projects. I love the idea of an IDL for describing APIs and a great compiler/codegen (protoc), but there are just so many idiosyncrasies baked into gRPC at this point that it often doesn't feel worth it IMO.

I've been increasingly using LSP-style JSON-RPC 2.0. Sure, it's got its quirks and is far from the most wire/marshaling-efficient approach, but JSON codecs are ubiquitous and JSON-RPC is trivial to implement. In fact, I recently even wrote a stack-allocated server implementation for microcontrollers in Rust: https://github.com/OpenPSG/embedded-jsonrpc
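
To give a feel for how trivial it is: a complete request/response exchange is just two small JSON objects correlated by id (the method and fields below are made up):

    // JSON-RPC 2.0 round trip; TypeScript, but it's just two object literals.
    const request = {
      jsonrpc: "2.0",
      id: 1, // used to match the response to this request
      method: "temperature.read",
      params: { sensor: "cpu" },
    };

    const response = {
      jsonrpc: "2.0",
      id: 1, // same id as the request
      result: { celsius: 42.5 },
    };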

Varlink (https://varlink.org/) is another interesting approach; there are reasons why they didn't implement the full JSON-RPC spec, but their IDL is well worth a look.

elcritch|1 year ago

My favorite serde format is MsgPack since it can be dropped in as an almost one-to-one replacement for JSON. There's also CBOR, which is based on MsgPack but has diverged a bit and added a data definition language too (CDDL).

Take JSON-RPC and replace JSON with MsgPack for better handling of integer and float types. MsgPack/CBOR are easy to parse in place directly into stack objects too. It's super fast even on embedded. I've been shipping it for years in embedded projects using a Nim implementation for ESP32s (1) and later made a non-allocating version (2). It's also generally easy to convert MsgPack/CBOR to JSON for debugging, etc.
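
A rough sketch of that swap in TypeScript, assuming the @msgpack/msgpack package (the method and payload here are invented):

    import { encode, decode } from "@msgpack/msgpack";

    // Same JSON-RPC envelope, MessagePack on the wire: integer and
    // float types survive the round trip intact.
    const call = { jsonrpc: "2.0", id: 1, method: "sample.read", params: { channel: 3 } };
    const wire = encode(call);            // compact binary Uint8Array
    const decoded = decode(wire);         // back to a plain object
    console.log(JSON.stringify(decoded)); // and trivially dumped as JSON for debugging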

There's also an IoT-focused RPC based on CBOR that's an IETF standard, as well as a time series format (3). The RPC is used a fair bit in some projects.

1: https://github.com/elcritch/nesper/blob/devel/src/nesper/ser...
2: https://github.com/EmbeddedNim/fastrpc
3: https://hal.science/hal-03800577v1/file/Towards_a_Standard_T...

bccdee|1 year ago

What I really like about protobuf is the DDL. Really clear schema evolution rules. Ironclad types. Protobuf moves its complexity into things like default zero values, which are irritating but readily apparent. With JSON, it's superficially fine, but later on you discover you need to worry about implementation-specific stuff like big ints getting mangled, or the special parsing logic needed to give string enums default values so that adding new values doesn't break backwards compatibility. JSON Schema exists but really isn't built for these sorts of constraints, and if you try to use JSON Schema like protobuf, it can get pretty hairy.
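
The string-enum problem looks something like this in hand-rolled TypeScript (a sketch; all names are invented):

    // v1 consumers know these statuses; a v2 service starts sending "paused".
    type Status = "active" | "deleted" | "unknown";
    const KNOWN = ["active", "deleted", "unknown"];

    // Without this explicit fallback, a new enum value from a newer service
    // breaks older consumers; protobuf bakes this default rule into the format.
    function parseStatus(raw: string): Status {
      return KNOWN.includes(raw) ? (raw as Status) : "unknown";
    }

    parseStatus("paused"); // "unknown" rather than a crash or a bad state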

Honestly, if protobuf just serialized to a strictly-specified subset of JSON, I'd be happy with that. I'm not in it for the fast ser/de, and something human-readable could be good. But when multiple services maintained by different teams are passing messages around, a robust schema language is a MASSIVE help. I haven't used Avro, but I assume it's similarly useful.

perezd|1 year ago

The better stack rn is Buf + Connect RPC (https://connectrpc.com/): all the compatibility on one platform, with both JSON+HTTP and gRPC.
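
One concrete upside: a Connect unary endpoint answers plain JSON over HTTP POST, so no gRPC tooling is needed on the client (the service and method names below are hypothetical):

    // The Connect protocol accepts a plain HTTP POST with a JSON body
    // for unary RPCs; any HTTP client works, fetch included.
    const res = await fetch("https://api.example.com/greet.v1.GreetService/Greet", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ name: "world" }),
    });
    const reply = await res.json();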

jeffrallen|1 year ago

Software lives forever. You have to take the long view, not the "rn" view. In the long view, NFS's XDR or ASN.1 are just fine and could have been enough if we hadn't kept reinventing things.

bbkane|1 year ago

Buf seems really nice, but I'm not completely sure what's free and what's not with the Buf platform, so I'm hesitant to make it a dependency for my little open source side project ideas. I should read the docs a bit more.

crabmusket|1 year ago

> I love the idea of an IDL for describing APIs and a great compiler/codegen (protoc)

Me too. My context is that I end up using RPC-ish patterns when doing slightly out-of-the-ordinary web stuff, like websockets, iframe communications, and web workers.

In each of those situations you start with a bidirectional communication channel, but you have to build your own request-response layer if you need that. JSON-RPC is a good place to start, because the spec is basically just "agree to use `id` to match up requests and responses" and very little else of note.
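
A minimal sketch of that layer over a WebSocket (the helper names are mine; the only spec-given part is the id field):

    // Match JSON-RPC responses to pending requests by id.
    const pending = new Map<number, (result: unknown) => void>();
    let nextId = 0;

    const ws = new WebSocket("wss://example.com/rpc");
    ws.onmessage = (ev) => {
      const msg = JSON.parse(ev.data);
      pending.get(msg.id)?.(msg.result); // resolve the waiting caller, if any
      pending.delete(msg.id);
    };

    function call(method: string, params: unknown): Promise<unknown> {
      const id = nextId++;
      ws.send(JSON.stringify({ jsonrpc: "2.0", id, method, params }));
      return new Promise((resolve) => pending.set(id, resolve));
    }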

I've been looking around for a "minimum viable IDL" to add to that, and I think my conclusion so far is "just write out a TypeScript file". This works when all my software is web/TypeScript anyway.

dpeckett|1 year ago

Now that's an interesting thought; I wonder if you could use a modified subset of TypeScript to create an IDL/DDL for JSON-RPC, then compile that schema into implementations for various target languages.
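
A minimal sketch of what such a "TypeScript as IDL" file could look like (everything here is hypothetical):

    // rpc-schema.ts: plain declarations double as the contract. A codegen
    // tool could walk these types and emit stubs for other languages;
    // meanwhile they already type-check the TypeScript side for free.
    export interface SensorService {
      getReading(params: { sensorId: string }): Promise<{ celsius: number }>;
      listSensors(params: Record<string, never>): Promise<{ ids: string[] }>;
    }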

girvo|1 year ago

Same; at my previous job I ended up landing on MessagePack as the serialisation format for our embedded devices over 2G/4G/LoRaWAN/satellite, but that was partially because the "schema"/typed deserialisation was all in the same language for both the firmware and the server (Nim, in this case) and directly shared source-to-source. That won't work for a lot of cases of course, but it was quite nice for ours!

hansvm|1 year ago

> efficiency

State of the art for both gzipped JSON and protobufs is a few GB/s. Details matter (big strings, arrays, and binary data will push protos to 2x-10x faster in typical cases), but it's not the kind of landslide victory you'd get from a proper binary protocol. There isn't much need to feel like you're missing out.

jlouis|1 year ago

The big problem with gzipped JSON is that once unzipped, it's gigantic. And you have to parse everything, even if you just need a few values. Just the memory bottleneck of having to munch through a string in JSON is going to slow down your parser by a ton. In contrast, a string in Protobuf is length-encoded.

5-10x is not uncommon, and that's kissing an order of magnitude difference.
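
To make the contrast concrete, a sketch of why length prefixes matter: a protobuf parser can hop over a string field in O(1), while a JSON parser has to scan every byte looking for the closing quote (the helpers below are hand-rolled for illustration):

    // protobuf wire format: [field tag varint][length varint][payload bytes]
    function readVarint(buf: Uint8Array, pos: number): { value: number; next: number } {
      let value = 0, shift = 0;
      for (;;) {
        const b = buf[pos++];
        value |= (b & 0x7f) << shift;
        if ((b & 0x80) === 0) return { value, next: pos };
        shift += 7;
      }
    }

    // Skip a length-delimited field (wire type 2) without touching its bytes.
    function skipField(buf: Uint8Array, pos: number): number {
      const tag = readVarint(buf, pos);      // field number + wire type
      const len = readVarint(buf, tag.next); // payload length in bytes
      return len.next + len.value;           // jump straight past the payload
    }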

imtringued|1 year ago

Yeah, this is something people don't seem to want to get into their heads. If all you care about is minimizing transferred bytes, then gzip+JSON is actually surprisingly competitive, to the point where you probably shouldn't even bother with anything else.

Meanwhile, if you care about parsing speed, there are MessagePack and CBOR.

If any form of parsing is too expensive for you, you're better off with FlatBuffers and capnproto.

Finally, there is the holy grail: use JIT compilation to generate "serialization" and "deserialization" code at runtime through schema negotiation whenever you create a long-lived connection. Since your protocol is unique for every (origin, destination) architecture+schema tuple, you can in theory write out the data in a way that the target machine can directly interpret as memory after sanity-checking the pointers. This could beat JSON, MessagePack, CBOR, FlatBuffers and capnproto in a single "protocol".
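
In JS-land you can approximate this today by compiling a serializer from the negotiated schema with new Function, in the same spirit as fast-json-stringify (a toy sketch; the schema shape is invented):

    type Field = { name: string; type: "int" | "string" };

    // Generate a specialized serializer once per connection/schema;
    // the engine then JIT-compiles this hot, branch-free function.
    function compileSerializer(schema: Field[]): (obj: Record<string, unknown>) => string {
      const parts = schema.map((f) =>
        f.type === "string" ? `JSON.stringify(obj.${f.name})` : `String(obj.${f.name})`
      );
      const body = `return '[' + ${parts.join(" + ',' + ")} + ']';`;
      return new Function("obj", body) as (obj: Record<string, unknown>) => string;
    }

    const ser = compileSerializer([
      { name: "id", type: "int" },
      { name: "user", type: "string" },
    ]);
    ser({ id: 7, user: "ada" }); // '[7,"ada"]'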

And then there is protobuf/gRPC, which seems to be in this weird place where it is not particularly good at anything.

lowbloodsugar|1 year ago

Except gzip is tragically slow, so crippling protobuf by running it through gzip could indeed slow it down to JSON speeds.

ajross|1 year ago

That's sort of where I've landed too. Protobufs would seem to fit the problem area well, but in practice the space between "big-system non-performance-sensitive data transfer metaformat"[1] and "super-performance-sensitive custom binary parser"[2] is... actually really small.

There are just very few spots that actually "need" protobuf at a level of urgency that would justify walking away from self-describing text formats (losing self-description is a big, big disadvantage of binary formats!).

[1] Something very well served by JSON

[2] Network routing, stateful packet inspection, on-the-fly transcoding. Stuff that you'd never think to use a "standard format" for.

bboygravity|1 year ago

Add "everything that communicates with a microcontroller" to 2.

That means potentially: the majority of devices in the world.

malkia|1 year ago

Apart from being a text format, I'm not sure how well JSON-RPC handles doubles vs long integers and other types, where protobuf can be directed to handle them appropriately. That is a problem in JSON itself, so you may need to encode some numbers using... "string"

dpeckett|1 year ago

I'd say the success of REST kind of proves that's something that can, for the most part, be worked around. It often comes down to the JSON codec itself; many codecs will allow unmarshalling/marshalling fields straight into long int types.

Also, JS now has BigInt, and the JSON decoder can be told to use it. So I'd argue it's kind of a moot point at this stage.
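
For instance (the reviver variant needs an engine with the newer TC39 "JSON.parse source access" proposal; the string-encoding fallback works everywhere):

    // Plain JSON.parse silently rounds integers beyond 2^53:
    JSON.parse('{"id": 9007199254740993}').id; // 9007199254740992

    // Portable workaround: put big numbers on the wire as strings.
    BigInt(JSON.parse('{"id": "9007199254740993"}').id); // 9007199254740993n

    // Newer engines hand the reviver the raw source text, so nothing is lost:
    JSON.parse('{"id": 9007199254740993}', (key, val, ctx: any) =>
      key === "id" ? BigInt(ctx.source) : val
    ); // { id: 9007199254740993n }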

mirekrusin|1 year ago

Also, JSON parsers are crazy fast nowadays; most people don't realize how fast they are.

Cthulhu_|1 year ago

While true, it's still a text-based and usually HTTP/TCP-based format; data -> JSON representation -> compression? -> HTTP -> TCP -> decompression -> parsing -> data. Translating to/from text just feels inefficient.