This argument has always felt to me like saying "Google has no moat in search, they just happen to currently have the best PageRank. Nothing is stopping Yahoo from creating a better one."
jdminhbg|2 years ago
Google has a flywheel where its dominant position in search results in more users, whose data refines the search algorithm over time. The question is whether OpenAI has something similar going, or whether they have just done the best job so far of training a model against a static dataset. If they're able to incorporate customer usage to improve their models, that's a moat against competitors. If not, it's just a battle between groups of researchers and server farms to see who is best this week or next.
shihab|2 years ago
My understanding is that Google search is a lot more than just PageRank (MapReduce, for example). They had heuristics, data, and machine learning before anyone else, etc.
Whereas the underlying algorithms behind all these GPTs so far are broadly the same. Yes, OpenAI probably does have better data, model finetuning, and other engineering techniques right now, but I don't feel it's anything special that will allow them to differentiate themselves from competitors in the long run.
(If the data collected from current LLM users proves very valuable for improving the models, that's different. I personally think that's not the case now, but who knows.)
ra7|2 years ago
Google's moat in search has always been systems and data center infrastructure. You can create your own search ranking algorithm, but you can't crawl the web and serve search results to billions of worldwide users in a few milliseconds.
jjeaff|2 years ago
I think it's also more than just systems and data centers. It is also difficult to scrape the web the way Google does without using Google IP addresses. A lot of the web now will block you or severely throttle you if you aren't one of the well-known engines that sites want indexing them.
colinsane|2 years ago
> You can create your own search ranking algorithm, but you can't crawl the web and serve search results to billions of worldwide users in a few milliseconds.
Rephrasing this for LLMs instead of search: "you can create your own model architecture/training method, but you can't crawl the web and serve language query results to billions of worldwide users in a few milliseconds."
That checks out, right? Google/search == """Open"""AI/LLMs still seems like a decent metaphor to me.
mbb70|2 years ago
I don't know how they could _not_ incorporate customer usage to improve their models.
zarzavat|2 years ago
There is no such thing as an open source Google because Google’s value is in its vast data centers. Search is hard to train and hard to run.
GPT-4 is not that big. It's about 220B parameters, if you believe geohot, or perhaps more if you don't.
One hard drive.
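A quick back-of-the-envelope check on the "one hard drive" claim, assuming 2 bytes per parameter (fp16/bf16 weights; the precision is my assumption, not stated above):

```python
# Rough on-disk size of a 220B-parameter model at fp16 (2 bytes per parameter)
params = 220e9
bytes_per_param = 2
size_tb = params * bytes_per_param / 1e12
print(f"~{size_tb:.2f} TB")  # ~0.44 TB, comfortably within a single multi-TB drive
```

Even at fp32 (4 bytes per parameter) it would be under 1 TB, so the claim holds for ordinary consumer drives.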