> Meta released Llama-3 only three days ago, and it already feels like the inflection point when open source models finally closed the gap with proprietary models. The benchmarks show that Llama-3 70B matches GPT-4 and Claude Opus in most tasks, and the even more powerful Llama-3 400B+ model is still training.
I'm all for open models, but where do the benchmarks show that?
Looking at https://arena.lmsys.org/, Llama-3-70b-Instruct is ranked #5 while current GPT-4 models and Claude Opus are still tied at #1. Meanwhile, Llama-3-8b-Instruct is ranked #14.
Would love to be corrected, but either way an article should include sources for these types of claims.
It's in second place, above Opus, in the "English"-only category. It probably suffers in the overall score due to poor multilingual ability (afaik ~95% of its training data was English).
Though the usual caveat about small sample sizes applies: as of now the confidence interval is fairly wide. It's also not at the level of those two in the "Code" category; I hope Meta gives the CodeLlama variant an update again.
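The wide-CI point is easy to see with a toy bootstrap: holding the head-to-head win rate fixed at 55%, a few hundred Arena-style battles give a much fuzzier estimate than tens of thousands (all numbers below are made up for illustration):

```python
import random

def bootstrap_winrate_ci(wins, games, n_boot=10_000, alpha=0.05, seed=0):
    """95% bootstrap confidence interval for a win rate from `games` battles."""
    rng = random.Random(seed)
    outcomes = [1] * wins + [0] * (games - wins)
    estimates = sorted(
        sum(rng.choices(outcomes, k=games)) / games for _ in range(n_boot)
    )
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

lo_small, hi_small = bootstrap_winrate_ci(55, 100)        # 55% over 100 battles
lo_large, hi_large = bootstrap_winrate_ci(5500, 10_000)   # 55% over 10,000 battles
# Same point estimate, but ~100x more battles shrinks the interval dramatically.
assert hi_small - lo_small > hi_large - lo_large
```

Arena's actual ranking uses Elo-style ratings rather than raw win rates, but the sample-size effect on the interval width is the same.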
Whether they did or didn't is up for debate, but I'm pretty sure scorched earth was their goal from the start (specifically targeting the giants, Google and OpenAI). They don't want to be #3 in this race.
Every big new tech category in recent years has quickly matured into a duopoly of a closed, proprietary, premium option and a mass market, public, "open" one. OpenAI, Anthropic, Google, Amazon, Microsoft and a ton of other players are battling for the #1 option, but Meta has cleverly theorized that #2 is going to be easier and a better fit for them.
There was a clear leader 18 months ago. Then there was a lot of catchup by everyone.
It’s no longer “hard” to build a product like ChatGPT + GPT-3.5, and that includes creating the model. There are a few trade secrets, but it seems like not much beyond that is protecting an ~8 month moat.
The leader always wants to be a monopoly. The distant runners up can catch up the most ground by playing nice, being open source, and working with other companies to erode the market share that the leader wanted to lock away.
I'm so glad to see this happening. I'm terrified of a single company winning all of AI. It's starting to look like this won't happen and that OpenAI is simply stretched too thin.
If this pattern holds, OpenAI may wind up as a footnote. Their inability to open up and work with others means that competitors and would-be collaborators will choose the open alternatives.
Meta can win that mindshare and be the friendly facilitator and rails that an entire ecosystem of business is built upon. OpenAI will never be that. They're not "open" enough.
Open models weren’t Meta’s original plan. The LLaMA 1 model was only available to Meta-approved researchers until someone leaked the model weights on 4chan in 2023. Meta issued DMCA takedown requests to HuggingFace and GitHub.
I was hoping for more detail in the post, but it really just seems like a quick and easy way to attract eyeballs? I'd never heard of the platform, and I can't help but think that the logo is a rip-off of the old Okta brand.
Perhaps for now, but I wouldn't count on there always being a Meta spending $$$ to train enormous models and then giving them away for free. What's the long-term game plan for open-source models when the corporate charity inevitably dries up?
If your product is an AI model (OpenAI, Anthropic, etc) you can't give it away for free.
If your product is a social graph w/ ads (Meta), you can.
It's hardly corporate charity:
* Meta releasing these models creates an improvement and tuning ecosystem around it, giving them access to tons of free developer time.
* It's also a strong recruiting tool, for engineers and researchers frustrated by, e.g., Google and OpenAI becoming increasingly closed. They know they can publish at Meta.
* The cost is insignificant. Meta had over $30B in revenue in Q2 2023 alone.
Building open models is a very strong approach to cornering the market on top tier AI researchers. And as other commenters have mentioned, the raw models are not the product - the vast majority of the value is in how they are integrated into useful products.
The corporate charity will not dry up. AI makes it easier to generate content, and Meta's in the business of facilitating the sharing of that content. Content is surface area for ads. AI will also make the virtual realities of the "metaverse", as defined by Mark, easier to reify. It's also a giant marketing and recruiting strategy.
> What’s the long-term game plan for open-source models when the corporate charity inevitably dries up?
Once open models reach and stay at near parity for a while, it’ll make sense for commercial downstream users to support open source community efforts rather than building their own, same as has happened in many other categories of key infrastructure software.
Unless Meta’s bet is that, going forward, models themselves won’t be the competitive differentiator; it will be about integration. They can give away Llama 3, 4, and 5 for free, because no one else can put them in WhatsApp or whatever.
I guess many here can dream up a lot of 5D-chess business strategies. For me it is just Zuck trying to reach greatness; when people think about who brought LLMs/AI/AGI(?) to the masses, his name will probably pop up in history.
An 8k context window is tiny compared to what’s out there. They promise much larger contexts, but until then it can’t even reliably summarize every web page out there.
It hasn't. My guess is that 90% of the folks using proprietary models through a chat interface have no need or desire to run their own models locally, nor do they have the hardware. For those of us who do wish to run our own models and have the hardware, I would reckon maybe 20% of those using proprietary models will drop them. The latest models released are very good, but I'm not convinced they are GPT/Claude quality yet.
Llama-3 is semi-proprietary though. It's definitely not open source!!
Has anything changed in the last 9 months: https://news.ycombinator.com/item?id=36815255 ? Is there better access to anything more than weights? Can we now train new models using llama? Are we no longer as restricted in use?
Haven't seen any comments questioning the premise here. It seems pretty constrained what we can do and how build-atoppable Llama is. I like the idea of it as a safeguard against control by some giant, but I'm not sure if it's a big enough grant of rights to be something we can build atop.
Llama 3 is just as restricted as Llama 2. The licenses and acceptable use policies are almost identical; the only real change consists of added attribution requirements. Products that use Llama 3 have to "prominently display 'Built with Meta Llama 3' on a related website, user interface, blogpost, about page, or product documentation", and models based on Llama 3 have to "include 'Llama 3' at the beginning of any such AI model name."
Commoditization of the AGI models is inevitable. OpenAI is building the next generation of compound AI systems: an LLM OS, if you will. It does more than just predict the next token; it figures out the right function and service to call when needed. That’s where the arms race has moved. Meta isn’t going to build a free LLM OS. The comparison is apples to oranges.
GPT-4 was released a year ago, and trained two years ago. I wouldn’t sleep on OpenAI, especially given how confident Sam Altman sounded on the Lex Fridman podcast.
The real race is to AGI anyways, as whoever gets there first will immediately capture 100% of the market.
Llama 3 is not beating GPT4 in benchmarks I've seen, and it's not beating it on LLM Arena. That's all that really matters. It needs to beat GPT4 in the benchmarks and leader boards, or it's a nothingburger, as far as OpenAI's dominance goes.
This article will age poorly when GPT-5 is released /s There will always be a market for the best proprietary model, though it’s unclear whether that will be OpenAI’s.
The brilliance of Meta's strategy here is: if they offer a (F?)OSS model with near-leader performance, they commoditize their would-be competitors' product. Meta doesn't have to make money on API calls, but it could face an existential risk if someone else built the everyday AI companion of the future (e.g. users on the ChatGPT UI, Microsoft's much-advertised "Copilot for everyday").
So -- a defensive play with some positive externalities (e.g. developer ecosystem mindshare + roadmap control, ability to use within their own products at cost, without giving up margin to suppliers).
I tested Llama-3-70b-8192 on Groq against ChatGPT 4, and while Groq ran it super fast, it hallucinated one answer, and didn’t get the logic correct on another question.
So, ChatGPT 4 is still more reliable for my use case. But if I were to want an LLM to process data, summarize, and so forth, Llama-3 on Groq is very fast.
Questions:
Do you know anything about Intel Hala Point?
Groq: bullshit, but admitted it when I called it out.
ChatGPT: did a Bing search (it knew what it didn’t know).
Question 2a (separate chat):
If you’re in Canada, what’s the best way to use a TFSA?
2b: Okay, if your portfolio has some tech stocks, some cash cows, and some government bonds, which should be allocated to the TFSA?
The reason I chose Question 2 is that most banks are happy to recommend bad products if it benefits them. Llama-3’s answer reflects the bank bullshit. ChatGPT 4 gives the advice your trustworthy and financially savvy friend would give you.
Follow-on questions for Llama-3:
2c: You have it backwards.
2d: Why did you get it backwards? Were you influenced by the glut of “advice” proffered by banks?
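For anyone wanting to reproduce this kind of side-by-side, Groq serves Llama 3 through an OpenAI-compatible chat completions endpoint; a minimal sketch follows (the model id and URL match Groq's docs as of this writing, but treat both as assumptions that may change):

```python
import json
import os
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_request(question, model="llama3-70b-8192"):
    """Build the JSON body for an OpenAI-style chat completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "temperature": 0,  # deterministic-ish answers make comparisons fairer
    }

def ask(question):
    """Send the question to Groq; requires GROQ_API_KEY in the environment."""
    body = json.dumps(build_request(question)).encode()
    req = urllib.request.Request(
        GROQ_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__" and "GROQ_API_KEY" in os.environ:
    print(ask("If you're in Canada, what's the best way to use a TFSA?"))
```

Setting temperature to 0 makes repeated runs of the same question more comparable, though it won't eliminate hallucinations like the Hala Point one above.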
The official page https://llama.meta.com/llama3/ does not show any comparisons with GPT-4 or Claude Opus
Direct image link: https://scontent-atl3-2.xx.fbcdn.net/v/t39.2365-6/439015366_...
Additionally, how much better is Claude for coding?
400B does look set to meet GPT-4, which will be exciting, but it's not finished yet.
The authors claim this method was used to extend Llama 2 to 128k: https://github.com/jquesnelle/yarn
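YaRN and similar schemes work by rescaling RoPE positions; a stripped-down sketch of plain linear position interpolation (a simplification of what YaRN actually does, with made-up dimensions) shows why a model trained at 8k can address longer contexts without ever seeing rotation angles outside its training range:

```python
def rope_angles(pos, dim=8, base=10000.0, scale=1.0):
    """RoPE rotation angles for one position; `scale` > 1 compresses positions
    (linear position interpolation) so longer contexts map into the trained range."""
    return [
        (pos / scale) * base ** (-2 * i / dim)
        for i in range(dim // 2)
    ]

trained_max = 8192   # Llama 3's released context window
target_max = 32768   # hypothetical extension target
factor = target_max / trained_max  # = 4.0

# With scale=factor, position 32768 produces exactly the angles position 8192
# produced during training, so no rotation exceeds what the model has seen.
assert rope_angles(target_max, scale=factor) == rope_angles(trained_max)
```

YaRN refines this by scaling each frequency band differently (high-frequency dimensions are left mostly untouched), which is why it tends to need less fine-tuning than plain interpolation.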
So even if they are losing a bit on infra and training investment, Meta is relevant again. They are cool again.
Meta made a huge comeback.
How confident Sam Altman sounds doesn’t figure much into my assessment of reality, other than the reality of Sam Altman’s promotional skills.
> The real race is to AGI anyways, as whoever gets there first will immediately capture 100% of the market.
AGI has no actual objective definition, and nothing supports this beyond naked conjecture and quasi-religious dogma.
Good 7B/8B models are still really useful but let’s not be hyperbolic.
The 8B model seems particularly good at summarization tasks.
> Any headline that ends in a question mark can be answered by the word no.
— Betteridge's law of headlines (https://w.wiki/3b$V)
As an online discussion about a headline that ends in a question mark grows longer, the probability of a citation of Betteridge's law approaches 1.