Ask HN: How do you inspect Avro/Protobuf records during debugging?

4 points | conqueso | 3 months ago

Not a data engineer, but I’ve worked with Avro a tiny bit, and it quickly became obvious that manually inspecting payloads would not be quick or easy without some custom tooling. I’m curious how DEs actually do this in the real world.

For instance, say you’ve got an Avro or Protobuf payload and you’re not sure which schema version it used, how do you inspect the actual record data? Do you just write a quick script? Use avro-tools/protoc? Does your team have internal tools for this?

I'm trying to determine if it'd be worth building a visual inspector where you could drop in the data + schema (or have it detected) and just browse the decoded fields. But maybe that’s not something people need often? Genuinely curious what the usual debugging workflow is. Any input from experienced DEs would be greatly appreciated!

4 comments

tliltocatl|3 months ago

Not a DE either, and I haven't used Avro, but I have worked with the PB wire format quite a lot. Here's what I've used:

https://gchq.github.io/CyberChef/ - data transcoder pipeline, can do protobuf, among others. Very useful when you are debugging a custom encoder.

https://github.com/nccgroup/blackboxprotobuf - decodes the wire format without a schema.
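
For anyone wondering what schema-less decoding can look like, here is a minimal Python sketch that splits a Protobuf message into raw (field number, wire type, value) triples using only the documented wire format. It is an illustration under stated assumptions, not how blackboxprotobuf or CyberChef are implemented; the function names `read_varint` and `decode_fields` are hypothetical, and the deprecated group wire types (3 and 4) are simply rejected.

```python
def read_varint(buf, pos):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        # The low 7 bits carry data, least significant group first.
        result |= (byte & 0x7F) << shift
        # A clear high bit marks the final byte of the varint.
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift >= 70:
            raise ValueError("varint longer than 10 bytes")


def decode_fields(buf):
    """Split a message into (field_number, wire_type, raw_value) triples.

    Without a schema the values stay raw: ints for varints, bytes for
    fixed-width and length-delimited fields.
    """
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number = key >> 3
        wire_type = key & 0x07
        if wire_type == 0:        # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:      # 64-bit fixed width
            value = buf[pos:pos + 8]
            pos += 8
        elif wire_type == 2:      # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        elif wire_type == 5:      # 32-bit fixed width
            value = buf[pos:pos + 4]
            pos += 4
        else:                     # groups (3, 4) and unknown types
            raise ValueError("unsupported wire type %d" % wire_type)
        if pos > len(buf):
            raise ValueError("truncated field %d" % field_number)
        fields.append((field_number, wire_type, value))
    return fields


if __name__ == "__main__":
    # Field 1 as a varint, then field 2 as a length-delimited value.
    payload = bytes.fromhex("089601120774657374696e67")
    for field in decode_fields(payload):
        print(field)
```

A length-delimited value could be a string, raw bytes, a packed repeated field, or a nested message, and the wire format alone does not say which. A schema-less tool has to guess, for example by trying to decode the bytes recursively as a message and falling back to text or hex if that fails.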

conqueso|3 months ago

Oh nice, thanks for these! Since you’ve worked with PB a lot - do you usually jump b/w different tools depending on the format (PB, Avro, JSON, etc.)? Or would having a single place to inspect multiple formats actually make debugging easier?