> Meta released Llama-3 only three days ago, and it already feels like the inflection point when open source models finally closed the gap with proprietary models. The benchmarks show that Llama-3 70B matches GPT-4 and Claude Opus in most tasks, and the even more powerful Llama-3 400B+ model is still training.
I'm all for open models, but where do the benchmarks show that?
Looking at https://arena.lmsys.org/, Llama-3-70b-Instruct is ranked #5 while current GPT-4 models and Claude Opus are still tied at #1. Meanwhile, Llama-3-8b-Instruct is ranked #14.
Would love to be corrected, but either way an article should include sources for these types of claims.
It's in second place, above Opus, in the "English"-only category. It probably suffers in the overall score due to poor multilingual ability (afaik ~95% of its training data was English).
Though the usual caveat about small sample sizes applies: as of now the confidence interval is fairly wide. It's also not at the level of those two in the "Code" category; I hope Meta gives the CodeLlama variant an update again.
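The wide-CI point is easy to see with a toy bootstrap: holding the head-to-head win rate fixed at 55%, a few hundred Arena-style battles give a much fuzzier estimate than tens of thousands (all numbers below are made up for illustration):

```python
import random

def bootstrap_winrate_ci(wins, games, n_boot=10_000, alpha=0.05, seed=0):
    """95% bootstrap confidence interval for a win rate from `games` battles."""
    rng = random.Random(seed)
    outcomes = [1] * wins + [0] * (games - wins)
    estimates = sorted(
        sum(rng.choices(outcomes, k=games)) / games for _ in range(n_boot)
    )
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

lo_small, hi_small = bootstrap_winrate_ci(55, 100)        # 55% over 100 battles
lo_large, hi_large = bootstrap_winrate_ci(5500, 10_000)   # 55% over 10,000 battles
# Same point estimate, but ~100x more battles shrinks the interval dramatically.
assert hi_small - lo_small > hi_large - lo_large
```

Arena's actual ranking uses Elo-style ratings rather than raw win rates, but the sample-size effect on the interval width is the same.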
Whether they did or didn't is up for debate, but I'm pretty sure scorched earth was their goal from the start (specifically targeting the giants, Google and OpenAI). They don't want to be #3 in this race.
Every big new tech category in recent years has quickly matured into a duopoly of a closed, proprietary, premium option and a mass market, public, "open" one. OpenAI, Anthropic, Google, Amazon, Microsoft and a ton of other players are battling for the #1 option, but Meta has cleverly theorized that #2 is going to be easier and a better fit for them.
There was a clear leader 18 months ago. Then there was a lot of catchup by everyone.
It’s no longer “hard” to build a product like ChatGPT + GPT-3.5, and that includes creating the model. There are a few trade secrets, but it seems like not much beyond that is protecting an ~8 month moat.
The leader always wants to be a monopoly. The distant runners up can catch up the most ground by playing nice, being open source, and working with other companies to erode the market share that the leader wanted to lock away.
I'm so glad to see this happening. I'm terrified of a single company winning all of AI. It's starting to look like this won't happen and that OpenAI is simply stretched too thin.
If this pattern holds, OpenAI may wind up as a footnote. Their inability to open up and work with others means that competitors and would-be collaborators will choose the open alternatives.
Meta can win that mindshare and be the friendly facilitator and rails that an entire ecosystem of business is built upon. OpenAI will never be that. They're not "open" enough.
Open models weren’t Meta’s original plan. The LLaMA 1 model was only available to Meta-approved researchers until someone leaked the model weights on 4chan in 2023. Meta issued DMCA takedown requests to HuggingFace and GitHub.
I was hoping for more detail in the post, but it really just seems like a quick and easy way to attract eyeballs? I'd never heard of the platform, and I can't help but think that the logo is a rip-off of the old Okta brand.
Perhaps for now, but I wouldn't count on there always being a Meta spending $$$ to train enormous models and then giving them away for free. What's the long-term game plan for open-source models when the corporate charity inevitably dries up?
If your product is an AI model (OpenAI, Anthropic, etc) you can't give it away for free.
If your product is a social graph w/ ads (Meta), you can.
It's hardly corporate charity:
* Meta releasing these models creates an improvement and tuning ecosystem around it, giving them access to tons of free developer time.
* It's also a strong recruiting tool, for engineers and researchers frustrated by, e.g., Google and OpenAI becoming increasingly closed. They know they can publish at Meta.
* The cost is insignificant. Meta had over $30B in revenue in Q2 2023 alone.
Building open models is a very strong approach to cornering the market on top tier AI researchers. And as other commenters have mentioned, the raw models are not the product - the vast majority of the value is in how they are integrated into useful products.
The corporate charity will not dry up. AI makes it easier to generate content, and Meta's in the business of facilitating the sharing of that content. Content is surface area for ads. AI will also make the virtual realities of the "metaverse", as defined by Mark, easier to reify. It's also a giant marketing and recruiting strategy.
> What’s the long-term game plan for open-source models when the corporate charity inevitably dries up?
Once open models reach and stay at near parity for a while, it’ll make sense for commercial downstream users to support open source community efforts rather than building their own, same as has happened in many other categories of key infrastructure software.
Unless Meta’s bet is that, going forward, models themselves won’t be the competitive differentiator; it will be about integration. They can give away Llama 3, 4, and 5 for free, because no one else can put them in WhatsApp or whatever.
I guess many here can dream up a lot of 5D-chess business strategies. For me it is just Zuck trying to reach greatness; when people think about who brought LLMs/AI/AGI(?) to the masses, his name will probably pop up in history.
An 8k context window is tiny compared to what’s out there. They promise much larger contexts, but until then it can’t even reliably summarize every web page out there.
It hasn't. My guess is that 90% of the folks using proprietary models through a chat interface have no need or desire to run their own models locally, nor do they have the hardware. For those of us who do wish to run our own models and have the hardware, I would reckon maybe 20% of those using proprietary models will drop them. The latest models released are very good, but I'm not convinced they are GPT/Claude quality yet.
Llama-3 is semi-proprietary though. It's definitely not open source!!
Has anything changed in the last 9 months: https://news.ycombinator.com/item?id=36815255 ? Is there better access to anything more than weights? Can we now train new models using llama? Are we no longer as restricted in use?
Haven't seen any comments questioning the premise here. It seems pretty constrained what we can do and how build-atoppable Llama is. I like the idea of it as a safeguard against control by some giant, but I'm not sure if it's a big enough grant of rights to be something we can build atop.
Llama 3 is just as restricted as Llama 2. The licenses and acceptable use policies are almost identical; the only real change consists of added attribution requirements. Products that use Llama 3 have to "prominently display 'Built with Meta Llama 3' on a related website, user interface, blogpost, about page, or product documentation", and models based on Llama 3 have to "include 'Llama 3' at the beginning of any such AI model name."
Commoditization of the AGI models is inevitable. OpenAI is building the next generation of compound AI systems: an LLM OS, if you will. It does more than just predict the next token; it figures out the right function and service to call when needed. That’s where the arms race has moved. Meta isn’t going to build a free LLM OS. The comparison is apples to oranges.
GPT-4 was released a year ago, and trained two years ago. I wouldn’t sleep on OpenAI, especially given how confident Sam Altman sounded on the Lex Fridman podcast.
The real race is to AGI anyways, as whoever gets there first will immediately capture 100% of the market.
Llama 3 is not beating GPT4 in benchmarks I've seen, and it's not beating it on LLM Arena. That's all that really matters. It needs to beat GPT4 in the benchmarks and leader boards, or it's a nothingburger, as far as OpenAI's dominance goes.
This article will age poorly when GPT-5 is released /s There will always be a market for the best proprietary model, though it’s unclear whether that will be OpenAI’s.
The brilliance of Meta's strategy here is: if they offer a (F?)OSS model with near-leader performance, they commoditize their would-be competitors' product. Meta doesn't have to make money on API calls, but it could face an existential risk if someone else built the everyday AI companion of the future (e.g. users on the ChatGPT UI, Microsoft's much-advertised "Copilot for everyday").
So -- a defensive play with some positive externalities (e.g. developer ecosystem mindshare + roadmap control, ability to use within their own products at cost, without giving up margin to suppliers).
I tested Llama-3-70b-8192 on Groq against ChatGPT 4, and while Groq ran it super fast, it hallucinated one answer, and didn’t get the logic correct on another question.
So, ChatGPT 4 is still more reliable for my use case. But if I were to want an LLM to process data, summarize, and so forth, Llama-3 on Groq is very fast.
Questions:
Do you know anything about Intel Hala Point?
Groq: bullshit, but admitted it when I called it out.
ChatGPT: did a Bing search (it knew what it didn’t know).
Question 2a (separate chat):
If you’re in Canada, what’s the best way to use a TFSA?
2b: Okay, if your portfolio has some tech stocks, some cash cows, and some government bonds, which should be allocated to the TFSA?
The reason I chose Question 2 is that most banks are happy to recommend bad products if it benefits them. Llama-3’s answer reflects the bank bullshit. ChatGPT 4 gives the advice your trustworthy and financially savvy friend would give you.
Follow-on questions for Llama-3:
2c: You have it backwards.
2d: Why did you get it backwards? Were you influenced by the glut of “advice” proffered by banks?
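For anyone wanting to reproduce this kind of side-by-side, Groq serves Llama 3 through an OpenAI-compatible chat completions endpoint; a minimal sketch follows (the model id and URL match Groq's docs as of this writing, but treat both as assumptions that may change):

```python
import json
import os
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_request(question, model="llama3-70b-8192"):
    """Build the JSON body for an OpenAI-style chat completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "temperature": 0,  # deterministic-ish answers make comparisons fairer
    }

def ask(question):
    """Send the question to Groq; requires GROQ_API_KEY in the environment."""
    body = json.dumps(build_request(question)).encode()
    req = urllib.request.Request(
        GROQ_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__" and "GROQ_API_KEY" in os.environ:
    print(ask("If you're in Canada, what's the best way to use a TFSA?"))
```

Setting temperature to 0 makes repeated runs of the same question more comparable, though it won't eliminate hallucinations like the Hala Point one above.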
The official page https://llama.meta.com/llama3/ does not show any comparisons with GPT-4 or Claude Opus
Direct image link: https://scontent-atl3-2.xx.fbcdn.net/v/t39.2365-6/439015366_...
Additionally, how much better is Claude for coding?
400B does look set to meet GPT-4, which will be exciting, but it's not finished yet.
The authors claim this method was used to extend Llama 2 to 128k: https://github.com/jquesnelle/yarn
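YaRN and similar schemes work by rescaling RoPE positions; a stripped-down sketch of plain linear position interpolation (a simplification of what YaRN actually does, with made-up dimensions) shows why a model trained at 8k can address longer contexts without ever seeing rotation angles outside its training range:

```python
def rope_angles(pos, dim=8, base=10000.0, scale=1.0):
    """RoPE rotation angles for one position; `scale` > 1 compresses positions
    (linear position interpolation) so longer contexts map into the trained range."""
    return [
        (pos / scale) * base ** (-2 * i / dim)
        for i in range(dim // 2)
    ]

trained_max = 8192   # Llama 3's released context window
target_max = 32768   # hypothetical extension target
factor = target_max / trained_max  # = 4.0

# With scale=factor, position 32768 produces exactly the angles position 8192
# produced during training, so no rotation exceeds what the model has seen.
assert rope_angles(target_max, scale=factor) == rope_angles(trained_max)
```

YaRN refines this by scaling each frequency band differently (high-frequency dimensions are left mostly untouched), which is why it tends to need less fine-tuning than plain interpolation.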
So even if they are losing a bit on infra and training investment, Meta is relevant again. They are cool again.
Meta made a huge comeback.
How confident Sam Altman sounds doesn’t figure much into my assessment of reality, other than the reality of Sam Altman’s promotional skills.
> The real race is to AGI anyways, as whoever gets there first will immediately capture 100% of the market.
AGI has no actual objective definition, and nothing supports this beyond naked conjecture and quasi-religious dogma.
Good 7B/8B models are still really useful but let’s not be hyperbolic.
The 8B model seems particularly good at summarization tasks.
> Any headline that ends in a question mark can be answered by the word no.
— Betteridge's law of headlines (https://w.wiki/3b$V)
As an online discussion about a headline that ends in a question mark grows longer, the probability of a citation of Betteridge's law approaches 1.