
Tosijs-schema is a super lightweight, schema-first, LLM-native JSON Schema library

46 points | podperson | 3 months ago | npmjs.com

28 comments


podperson|3 months ago

I wrote this library this weekend after realizing that Zod was really not designed for the two things I want JSON schemas for: 1) defining response formats for LLMs and 2) serving as a single source of truth for data structures.

7thpower|3 months ago

What led you to that conclusion?

kevmo314|3 months ago

> For large arrays (>97 items) and large dictionaries

How did we end up in a world where 97 items is considered large?

vages|3 months ago

Mind your off-by-1s: 97 items is not large, 98 is.

yunohn|3 months ago

> It checks a fixed sample of items (roughly 1%) regardless of size

> This provides O(1) performance

Wouldn’t 1% of N still imply O(N) performance?

podperson|3 months ago

N is increasing, but the number of checks is not. O(1) here means constant, or more precisely capped: we never check more than 100 items.
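To make the capped-sampling argument concrete, here is a minimal TypeScript sketch. It is not the tosijs-schema implementation: the function name `validateSampled`, the evenly spaced stride, and the default cap of 100 are assumptions for illustration, with the cap taken from the comment above. The point is only that the loop runs at most `maxChecks` times however large the array gets.

```typescript
// Hypothetical helper for illustration; not part of the tosijs-schema API.
// Checks every item of a small array, and at most `maxChecks` evenly
// spaced items of a large one, so the work is bounded by a constant.
function validateSampled<T>(
  items: readonly T[],
  isValid: (item: T) => boolean,
  maxChecks = 100,
): { ok: boolean; checked: number } {
  const n = items.length;

  // Small input: check everything.
  if (n <= maxChecks) {
    for (let i = 0; i < n; i++) {
      if (!isValid(items[i])) return { ok: false, checked: i + 1 };
    }
    return { ok: true, checked: n };
  }

  // Large input: visit exactly maxChecks evenly spaced indices.
  for (let k = 0; k < maxChecks; k++) {
    const i = Math.floor((k * n) / maxChecks);
    if (!isValid(items[i])) return { ok: false, checked: k + 1 };
  }
  return { ok: true, checked: maxChecks };
}

// One million items, one of them invalid at an index the stride skips.
const big: unknown[] = Array.from({ length: 1_000_000 }, (_, i) => i);
big[1] = "not a number";
const sampled = validateSampled(big, (x) => typeof x === "number");
console.log(sampled); // { ok: true, checked: 100 }
```

The example also shows the trade-off: the invalid item at index 1 is never visited, so a sampled check can pass data that a full O(N) scan would reject.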

bbminner|3 months ago

While LLMs accept JSON schemas for constrained decoding, they might not respect all of the constraints.
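One way to guard against this is to re-check the parsed response on the client side instead of trusting the decoder. The sketch below is an assumption-laden illustration, not tosijs-schema code: the `Rule` types and the `recheck` function are hypothetical and cover only a tiny subset of JSON Schema keywords (`type`, `minimum`, `maximum`, `enum`).

```typescript
// Hypothetical rule shapes covering a small subset of JSON Schema keywords.
type NumberRule = { type: "number"; minimum?: number; maximum?: number };
type StringRule = { type: "string"; enum?: readonly string[] };
type Rule = NumberRule | StringRule;

// Returns a list of violations; an empty list means the value passed.
function recheck(value: unknown, rule: Rule): string[] {
  const errors: string[] = [];
  if (rule.type === "number") {
    if (typeof value !== "number" || Number.isNaN(value)) {
      return ["expected number"];
    }
    if (rule.minimum !== undefined && value < rule.minimum) {
      errors.push(`below minimum ${rule.minimum}`);
    }
    if (rule.maximum !== undefined && value > rule.maximum) {
      errors.push(`above maximum ${rule.maximum}`);
    }
  } else {
    if (typeof value !== "string") {
      return ["expected string"];
    }
    if (rule.enum !== undefined && !rule.enum.includes(value)) {
      errors.push(`not one of ${rule.enum.join(", ")}`);
    }
  }
  return errors;
}

// A made-up model response whose shape is right but whose range is not.
const raw = '{"rating": 11, "mood": "happy"}';
const parsed = JSON.parse(raw) as Record<string, unknown>;

const ratingErrors = recheck(parsed.rating, { type: "number", minimum: 1, maximum: 5 });
const moodErrors = recheck(parsed.mood, { type: "string", enum: ["happy", "sad"] });
console.log(ratingErrors); // [ 'above maximum 5' ]
console.log(moodErrors);   // []
```

In practice a full validator would do this job; the sketch only shows where the extra check sits, between parsing the response and using it.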

_heimdall|3 months ago

Had you considered using something like XML as the transport format rather than JSON? If the UX is similar to Zod, it wouldn't matter what the underlying data format is, and XML, unlike JSON, is meant to support schemas.

podperson|3 months ago

JSON Schema is a schema language built on JSON, and it’s already being used. Using XML would mean converting the XML into JSON Schema to define the response from the LLM.

That said, JSON is “language neutral” but also super convenient for JavaScript developers and typically more convenient for most people than XML.