I was born in 1973. My grandson was born in 2022. He won't know a world without 'AI', much like my kids didn't know a world without the Internet and I didn't know a world without refrigerators.
One thing that, I regret to say, I learned very late in my children's development was the value of boredom and difficult challenges. However, I think I've successfully passed those lessons on to my kids as they raise their own. I have no idea what to say about 'AI' and the rapid reconfiguration of our relationship with the world that's going to happen as a result. All I can tell them is that we're in this together and we'll try to figure it out as we go. Good luck everybody!
There's a gag in Star Trek 4 where Scotty goes back in time and tries to talk to a computer.
The gag is funny because he is from the future, where talking to computers is normal. When the computer doesn't respond, someone hands him the mouse, and he tries to use it as a microphone.
I watched that scene with my kids recently (9 and 6).
They didn't get the gag. They thought it was completely reasonable of Scotty to try talking to the computer. It took a while to explain.
I would think your parents thought about television more than refrigerators. That's one technology that really set the world on a new trajectory. Imagine if Nixon had won the presidency in 1960, if we hadn't had real-time video of the Apollo landings, or if America had stayed in Vietnam for another ten years.
Misses a few interesting early models: GPT-J (by EleutherAI, using the GPT-2 architecture) was arguably the first such model runnable on consumer hardware. I actually had a thing running in prod with real users on it for a while. And GPT-NeoX was their attempt to scale to GPT-3 levels. It was 20B, and was maybe the first glimpse that local models might someday be usable (although "local" at the time was questionable, quantisation wasn't as widely used, etc.).
GPT-J was the one that made me really interested in LLMs, as I could run it on a 3090.
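A back-of-envelope sketch of why this worked: weight memory scales linearly with bytes per parameter, so GPT-J's ~6B parameters fit comfortably in a 3090's 24 GB at fp16, and quantisation pushes the bar lower still. (Weights only; activations and KV cache add overhead on top.)

```python
# Rough VRAM estimate for holding a model's weights at various precisions.
# Assumes GPT-J's published ~6.05B parameter count; ignores runtime overhead.

def weight_vram_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate GiB needed just to hold the weights."""
    return n_params * bytes_per_param / 1024**3

GPTJ_PARAMS = 6.05e9  # GPT-J-6B

fp32 = weight_vram_gb(GPTJ_PARAMS, 4)  # barely fits a 24 GB card, no headroom
fp16 = weight_vram_gb(GPTJ_PARAMS, 2)  # comfortable fit on a 3090
int8 = weight_vram_gb(GPTJ_PARAMS, 1)  # why quantisation opened up local use

print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB, int8: {int8:.1f} GiB")
# prints: fp32: 22.5 GiB, fp16: 11.3 GiB, int8: 5.6 GiB
```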
Some details on the timeline are not quite accurate, and would benefit from linking to a source so that everyone can verify them. For example, HyperCLOVA is listed as 204B parameters, but it seems it used 560B parameters (https://aclanthology.org/2021.emnlp-main.274/).
This would be interesting if each entry had a high-level picture of the network, drawn to scale, perhaps with the components colour-coded somehow. On mouse scroll you'd move through the models and watch the networks become deeper and wider and the colours change, almost like an animation. That'd be cool.
Interesting site, though it does seem to miss some of Mistral's stuff - specifically Mistral Small 3, which was released under Apache 2.0 (AFAIK the first in the Mistral Small series to use a fully open license; previous Mistral Small releases were under their own non-commercial research license), and its derivatives (e.g. Devstral, aka Devstral Small 1, which is derived from Mistral Small 3.1). It is also missing Devstral 2 (which is not really open source but more of a "MIT unless you have a lot of money" license) and Devstral Small 2 (which is under Apache 2.0 and is the successor to Devstral [Small], and interestingly also derived from Mistral Small 3.1 instead of 3.2).
Good catches — just added Devstral Small 1 (May 2025, Apache 2.0), Devstral 2 (Dec 2025, modified MIT), and Devstral Small 2 (Dec 2025, Apache 2.0). Thanks for the feedback!
Allow me to contribute:
> Magistral: Magist(rate) + stral? Mag(nificent) + stral? Nobody knows.
That's just French for "masterful", or a way to describe lectures. There's a sense of greatness in that word that contrasts with the Mini in Ministral, which in turn might be a pun on "ménestrel" (minstrel), "ministre" (minister), or made to sound like Minitel (or all of the above).
Fair point — updated the tagline to 'The complete history of LLMs'. AI as a field goes back decades; this is specifically tracking the transformer/LLM era from 2017 onward.
Visual presentation has been a weak point of AI generation for me. There isn't a lot of support for models to see how a potential presentation might appear to a human.
Models that take visual input seem more focused on identifying what is in an image than on how a human might perceive it, and most interfaces lack any form of automated feedback mechanism that lets the model look at what it has made.
In short, I have made some fun things with AI but I still end up doing CSS by hand.
Interesting to see the evolution mapped out like this. For those building on top of these models (RAG systems, agent frameworks), the real inflection point wasn't just model count but the shift from completion-only to reasoning and structured output capabilities. Are you planning to add annotations for capability changes alongside release dates?
> https://lifearchitect.ai/models-table/
Great resource — Dr. Thompson's table is exhaustive. llm-timeline.com takes a different angle: visual timeline format, focused on base/foundation models only, filterable by open/closed source. Different tools for different needs.
Nice overview. Some of the descriptions are quite thin on detail, like "new model by x" or "latest model by y". Well, of course it was new at the time, but that doesn't really add information.
Could have some more of the Sarvam models, including the ones recently announced. But happy to see their names mentioned. I had tried joining them, but they ghosted after one round :(
Fair point on T5 — just marked it as a milestone. On Llama 3.1: it's there as a milestone because it was the first open model to match GPT-4 at 405B, which felt like a genuine inflection point. Happy to debate the milestone criteria though — what would you add?
> T5 was much bigger milestone than almost everything in the list.
It's in the timeline though? Or are you saying that one should somehow be highlighted, even though none of the other ones are? Seems it's just chronological order, with no one being more or less visible than others, as far as I can see.
Are the models used for apps like Codex designed to mimic human behaviour? As in, do they deliberately create errors in code that you then have to spend time debugging and fixing, or is it a natural flaw, with the fact that humans also do it being a coincidence?
This keeps bothering me: why do they need several iterations to arrive at a correct solution instead of getting it right the first time? Prompts like "repeat solving it until it is correct" don't help.
> as in they deliberately create errors in code that then you have to spend time debugging and fixing
No, all the models are designed to be "helpful", but different companies see that as different things.
If you're seeing the model deliberately creating errors so you have something to fix, then that sounds like something is fundamentally wrong in your prompt.
Besides that, I'm guessing "repeat solving it until it is correct" is a concise version of your actual prompt, or is that verbatim what you prompt the model? If so, you need to give it more details to actually be able to execute something like that.
Great site! I noticed a minor visual glitch where the tooltips seem to be rendering below their container on the z-axis, possibly getting clipped or hidden.