Nathanba | 1 month ago

cool, do you think it's possible to add a schema mode to lite3 to remove the message size tradeoff? I think most people will still want to use lite3 with hard schemas during both serialization and deserialization. It's nice that it also works in a schemaless mode though.

eliasdejong|1 month ago

Being schemaless is a deliberate design decision, as it eliminates the need for managing and building schema files. By not requiring a schema, messages are always readable to arbitrary consumers.

If you want a schema, it must be enforced by the application through runtime type checking. All messages contain type information. Though I do see the value of adding pydantic-like schema checking in the future.
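A minimal sketch of what application-side runtime type checking could look like, assuming the message has already been decoded into a plain Python dict. It does not use any Lite³ API; the `SCHEMA` table, the field names, and `check_message` are hypothetical and only illustrate the idea.

```python
from typing import Any

# Hypothetical informal schema: field name -> (expected type, required?)
SCHEMA = {
    "id": (int, True),
    "name": (str, True),
    "email": (str, False),
}

def check_message(msg: dict, schema: dict = SCHEMA) -> list:
    """Return a list of problems; an empty list means the message fits the schema."""
    problems = []
    for field, (expected, required) in schema.items():
        if field not in msg:
            if required:
                problems.append(f"missing required field: {field}")
            continue
        value: Any = msg[field]
        # bool is a subclass of int in Python, so reject it unless bool is expected
        wrong_bool = isinstance(value, bool) and expected is not bool
        if wrong_bool or not isinstance(value, expected):
            problems.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems
```

A consumer would call `check_message(decoded)` right after decoding and reject or log the message if the returned list is non-empty. Unknown extra fields are deliberately ignored here, which keeps the check compatible with arbitrary producers.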

EDIT: Regarding message size, Lite³ does incur a message size penalty for being schemaless and fully indexed at the same time. Though if you are using it in an RPC / streaming setting, this can be negated through brotli/zstd/dictionary compression.
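To illustrate the dictionary compression idea, here is a rough sketch using only the Python standard library. It swaps in zlib's preset-dictionary support instead of brotli or zstd, and uses JSON bytes as a stand-in for a serialized message, so it says nothing about actual Lite³ sizes. The field names, `SHARED_DICT`, and the helper functions are assumptions made up for the example.

```python
import json
import zlib

def encode(msg: dict) -> bytes:
    # Stand-in for a schemaless wire format: keys are repeated in every message
    return json.dumps(msg).encode("utf-8")

# Dictionary agreed on by both peers ahead of time, built from a sample message
SHARED_DICT = encode({"user_id": 0, "display_name": "", "email_address": ""})

def compress(payload: bytes, zdict: bytes = b"") -> bytes:
    # With a preset dictionary, keys already in the dictionary become short back-references
    comp = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return comp.compress(payload) + comp.flush()

def decompress(blob: bytes, zdict: bytes = b"") -> bytes:
    # The receiver must supply the same dictionary the sender used
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()

payload = encode(
    {"user_id": 123, "display_name": "alice", "email_address": "alice@example.com"}
)
plain = compress(payload)
with_dict = compress(payload, SHARED_DICT)
print(len(payload), len(plain), len(with_dict))
```

The repeated key names are what a schema would otherwise remove from the wire; a shared dictionary recovers much of that saving at the transport layer, at the cost of both sides having to agree on the dictionary.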

Nathanba|1 month ago

The thing about schemaless is that it's great for usability, and I like it with JSON. But as with JSON, when we develop real applications, at the end of the day you always have some kind of schema, whether it's written down or not. As you alluded to with pydantic, the application is going to rely on the data being in some sort of shape. Even if it's written very defensively and practically everything is optional, you still end up relying on something. That is the informal schema. So in my mind, if I have a schema anyway no matter what, then maybe this should be supported in the serialization format/library to give me the benefit of size reductions.

digdugdirk|1 month ago

Pydantic was the first thing I thought of when I saw this. The possibilities are very intriguing.

Do you have any thoughts/recommendations for someone if they were to try making a pydantic interface layer for lite3?

eru|1 month ago

> By not requiring schema, messages are always readable to arbitrary consumers.

That sounds a bit silly..

> All messages contain type information.

That would (partially) enable what you were suggesting in the sentence I quoted first. But that's orthogonal to being schema-full or schema-less.

tignaj|1 month ago

If you are looking for a fast format with schema support, see STEF: https://www.stefdata.net/

Disclosure: I am the author.