top | item 47104427

throwa356262 | 8 days ago

"LLM backends: Anthropic, OpenAI, OpenRouter."

And here I was hoping that this was local inference :)

micw|8 days ago

Sure. Why purchase an H200 if you can go with an ESP32 ^^

sigmoid10|8 days ago

Blowing more than 800 kB on what is essentially an HTTP API wrapper is actually kinda bad. The original Doom binary was 700 kB and had vastly more complexity. This is in C after all, so by stripping out nonessential stuff and using the right compiler options, I'd expect something like this to come in under 100 kB.

__tnm|8 days ago

Haha, well, I've got something ridiculous coming soon for zclaw that will kinda work on-board. It will require the S3 variant though, since it needs a little more memory. Training it later today.

throwa356262|7 days ago

Sounds interesting, please keep us posted.

I don't think I have an S3, but I have plenty of C3s. I thought they had the same amount of memory.

peterisza|8 days ago

Right, 888 kB would be impossible for local inference.

However, it is really not that impressive for just a client.

Dylan16807|8 days ago

It's not completely impossible, depending on what your expectations are. That language model that was built out of redstone in Minecraft had... looks like 5 million parameters. And it could produce mostly coherent sentences.
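For a rough sense of scale, here is a back-of-the-envelope sketch of how many parameters fit in that budget at different weight precisions. It assumes "888 kB" means 888 × 1024 bytes and, unrealistically, that the entire budget goes to weights with nothing left for code or working memory. The function names are made up for illustration; the 5 million figure is the one quoted above.

```python
# Back-of-the-envelope: how many model parameters fit in a fixed byte budget?
# Assumptions (not from the thread): "888 kB" is read as 888 * 1024 bytes,
# and every byte is spent on weights (no code, no activations, no tokenizer).

BUDGET_BYTES = 888 * 1024
PARAMS_QUOTED = 5_000_000  # parameter count quoted for the redstone model


def params_that_fit(budget_bytes: int, bits_per_param: int) -> int:
    """Number of whole parameters that fit at a given precision."""
    return (budget_bytes * 8) // bits_per_param


def bits_available_per_param(budget_bytes: int, n_params: int) -> float:
    """Average bits each parameter could use if n_params must fit."""
    return budget_bytes * 8 / n_params


if __name__ == "__main__":
    for bits in (32, 16, 8, 4, 2, 1):
        print(f"{bits:>2} bits/param: {params_that_fit(BUDGET_BYTES, bits):>9,} params")
    avg = bits_available_per_param(BUDGET_BYTES, PARAMS_QUOTED)
    print(f"{PARAMS_QUOTED:,} params would need <= {avg:.2f} bits each")
```

Under these assumptions, 8-bit weights give roughly 0.9 million parameters, and fitting 5 million would mean averaging under about 1.45 bits per parameter, so it is a stretch rather than a flat impossibility.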

js8|8 days ago

I disagree; in the future it might be possible. Perhaps not in English, but in some more formal (yet fuzzy) language with some basic epistemology.

I mean, there is a lambda calculus self-interpreter in 29 bytes. How many additional logical rules are required for GAI inference? Maybe not as many as people think. Understanding about 1000 concepts of basic English (or say, Lojban) might well be sufficient. It is possible this can be encoded in 800 kB, we just don't know how.