I've seen a rumor going around that OpenAI hasn't had a successful pre-training run since mid-2024. This seemed insane to me, but if you give ChatGPT 5.1 a query about current events and instruct it not to use the internet, it will tell you its knowledge cutoff is June 2024. Not sure if maybe that's just the smaller model or what. But I don't think it's a good sign to get that from any frontier model today; that's 18 months ago.
alecco|2 months ago
https://newsletter.semianalysis.com/p/tpuv7-google-takes-a-s...
RossBencina|2 months ago
"OpenAI’s leading researchers have not completed a successful full-scale pre-training run that was broadly deployed for a new frontier model since GPT-4o in May 2024, highlighting the significant technical hurdle that Google’s TPU fleet has managed to overcome."
Given the overall quality of the article, that is an uncharacteristically convoluted sentence. At the risk of stating the obvious, "that was broadly deployed" (or not) is contingent on many factors, most of which are not of the GPU vs. TPU technical variety.
binkHN|2 months ago
CamperBob2|2 months ago
rahimnathwani|2 months ago
mvkel|2 months ago
Their own press releases confirm this. They call 5 their best new "AI system," not a new model.
https://openai.com/index/introducing-gpt-5/
krackers|2 months ago
Davidzheng|2 months ago
staticman2|2 months ago
It certainly was much dumber than 4o on Perplexity when I tried it.
m3kw9|2 months ago
Forgeties79|2 months ago
Hardly a hot take. People have theorized about the ouroboros effect for years now. But I do wonder if that’s part of the problem
p1necone|2 months ago
But I always realize it's just smoke and mirrors: the actual quality of the code, and the failure modes, are just so much worse than Claude and Gemini.
kshacker|2 months ago
I write some code for my personal enjoyment, and I gave it to Claude 6-8 months back for improvement. It gave me a massive changelog, and it was quite risky, so I abandoned it.
I tried this again with Gemini last week. I was more prepared and asked it to improve the code class by class, and for whatever reason I got better answers: changed code with explanations, and when I asked it to split the refactor into smaller steps, it did so. It was a joy working on this over the Thanksgiving holidays. It could break the changes into small pieces, talk through them as I evolved concepts learned previously, take my feedback and prioritization, and also give me a nuanced explanation of the business objectives I was trying to achieve.
This is not to downplay Claude; that is just the sequence of events. So while it may or may not work well for experienced programmers, it is such a helpful tool for people who know the domain or the concepts (or both) but struggle with details, since the tool can iron out a lot of those details for you.
My goal now is to line up another project for the winter holidays and then think through 4-6 hour AI-assisted refactors over the weekends. Do note that this is a project of personal interest, so I'm not spending weekends for the big man.
tartoran|2 months ago
herpdyderp|2 months ago
findjashua|2 months ago
stevedonovan|2 months ago
So (again) we are just sharing anecdata
sharyphil|2 months ago
Somehow it doesn't get on my nerves (unlike Gemini with "Of course").
jpalomaki|2 months ago
Interested, because I've been getting pretty good results on different tasks using Codex.
logicchains|2 months ago
bsder|2 months ago
The problem is that the "AI"s can cough up code examples based upon proprietary codebases that you, as an individual, have no access to. That creates a significant quality differential between coders who only use publicly available search (Google, Github, etc.) vs those who use "AI" systems.
CheeseFromLidl|2 months ago
Which makes sense for something that isn't AI but an LLM.
xnx|2 months ago
nickff|2 months ago
echelon|2 months ago
The 25x revenue multiple wouldn't be so bad if they weren't burning so much cash on R&D and if they actually had a moat.
Google caught up quick, the Chinese are spinning up open source models left and right, and the world really just isn't ready to adopt AI everywhere yet. We're in the premature/awkward phase.
They're just too early, and the AGI is just too far away.
Doesn't look like their "advertising" idea to increase revenue is working, either.
redbluered|2 months ago
As a shady for-profit, there is none. That's the problem with this particular fraud.
savrajsingh|2 months ago
mikepurvis|2 months ago
impulser_|2 months ago
rvnx|2 months ago
Also their models get dumber and dumber over time.
wrcwill|2 months ago
https://platform.openai.com/docs/models/compare?model=gpt-5....
amluto|2 months ago
Coneylake|2 months ago
nextworddev|2 months ago
nickysielicki|2 months ago
I followed him on Twitter. He said some very interesting things, I thought. Then he started talking about the niche of ML/AI I work near, and he was completely wrong about it. I became enlightened.
searls|2 months ago
I hadn't made the connection that the training data is that old, but that would indeed augur poorly.
hn_throwaway_99|2 months ago
Now I don't know if this means that OpenAI was able to add those 3 months of data to earlier models by tuning, or if it was a "from scratch" pre-training run, but it has to make a substantial difference in the models.
mr_00ff00|2 months ago
nodja|2 months ago
Pre-training: You train on a vast amount of data, as varied and high-quality as possible. This determines the distribution the model can operate within, so LLMs are usually pre-trained on a curated dataset of the whole internet. The output of pre-training is usually called the base model.
Post-training: You narrow the model down to the specific behavior you want by further training it. You can do this in several ways:
- Supervised Finetuning (SFT): Training on a strict, high-quality dataset of the task you want. For example, if you wanted a summarization model, you'd finetune the base model on high-quality text->summary pairs, and the model would summarize much better than the base model.
- Reinforcement Learning (RL): You train a separate reward model that ranks outputs, use it to rate the model's outputs, then use those ratings to further train the model.
- Direct Preference Optimization (DPO): You have pairs of good/bad generations and use them to align the model towards/away from the kinds of responses you want.
Post-training is what makes the models easy to use. The most common form is instruction tuning, which teaches the model to talk in turns, but post-training can be used for anything: if you want a translation model that always translates a certain way, or a model that knows how to use tools, etc., you'd achieve all that through post-training. Post-training is where most of the secret sauce in current models is nowadays.
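To make the DPO bullet above concrete, the per-pair objective is simple enough to sketch in plain Python. This is a minimal illustration, not any particular library's implementation; the function name, log-probability values, and beta setting are all made up for the example:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one chosen/rejected response pair.

    Each argument is the summed log-probability of a whole response
    under either the policy being trained or the frozen reference
    (usually the SFT) model. beta limits drift from the reference.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): minimized by widening the gap between
    # the chosen and rejected responses relative to the reference.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The policy favors the chosen response more than the reference does,
# so the loss drops below the no-preference baseline of log(2).
loss = dpo_loss(-10.0, -14.0, -12.0, -13.0)
```

When the policy and reference agree exactly, the margin is zero and the loss sits at log(2); training pushes it down from there by shifting probability mass toward the preferred responses.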
tim333|2 months ago
abixb|2 months ago
bckr|2 months ago
fovc|2 months ago
https://youtu.be/3K-R4yVjJfU?si=JdVyYOlxUbEcvEEo&t=2624
> Q: Are the releases aligned with pre-training efforts?
> A: There used to be a time not that long ago, maybe half a year, distant past, where the models would align with RL runs or pretraining runs ... now the naming is by capability. GPT5 is a capable model; 5.1 is a more capable model
jimbohn|2 months ago
kristianp|2 months ago
f311a|2 months ago
manmal|2 months ago
mips_avatar|2 months ago
throwaway314155|2 months ago
octoberfranklin|2 months ago
Helps you get useful answers like "I don't know, that's too recent" when you ask questions like "who won the basketball game last night?"
simianparrot|2 months ago
bakugo|2 months ago
mingodad|2 months ago
====
when have you been built?
I’m a GPT‑4‑based model that OpenAI released on March 14 2023. The underlying training data goes up to early 2023, and the model was trained in the months leading up to that release. If you’re asking about the ChatGPT product that ships the model to users, it went live in November 2022 and has since received updates (GPT‑3.5, GPT‑4, etc.) that keep it current.
====
But when supposedly running it from https://huggingface.co/chat/models/openai/gpt-oss-20b:
====
when have you been built?
I’m a language model created by OpenAI. The current generation (GPT‑4) that powers this chat was first released in March 2023 and has been updated and fine‑tuned up through the end of 2024. My training data runs up to the beginning of June 2025, so I’m built on knowledge available up to that point.
====
And that makes me think that although https://huggingface.co/chat claims to be using the models available to the public at https://huggingface.co , that doesn't seem to be true, so I raised the question here: https://huggingface.co/ggml-org/gpt-oss-20b-GGUF/discussions... , https://github.com/huggingface/inference-playground/issues/1... and https://github.com/ggml-org/llama.cpp/discussions/15396#disc... .