thomasfedb | 11 months ago
What makes a file non-static (dynamic?) other than +x?
Both are instructions about how to perform a computation. Both require other software/hardware/microcode to run. In general, the stack is tall!
Even so, I do agree that “a bunch of matrices” feels different to “a bunch of instructions” - although arguably the former may be closer in architecture to the greatest computing machine we know (the brain) than the latter.
</armchair>
wongarsu|11 months ago
There is a lot happening between a model file sitting on a disk and serving it behind an API with an attached playground, billing, abuse handling, etc., while handling the load of thousands or millions of users calling these incredibly demanding programs. That takes a lot of clever software and good hardware, even down to acquiring buildings and dealing with the order backlog for backup diesel generators.
Improvements in that layer were a large part of what allowed OpenAI to go from the relative obscurity of GPT3.5 to generating massive hype with a ChatGPT anyone could try on a whim. As a more recent example, x.ai seems to be struggling with that layer a lot right now. Grok3 is pretty good, but has almost daily partial outages. The 1M context model is promised but never rolls out; instead, on some days the served context size is even less than the usual 64k. And they haven't even started making it available on the API.
All of this will be easy when we reach the point where everyone can run powerful LLMs on their own device, but for now just having a 400B-parameter model sitting on your hard drive doesn't get your business very far.
bambax|11 months ago