Yes but: if the commitment is driven by internal researchers and coders standing firm about making their work open source (a rumour I’ve heard a couple times), most of the credit goes to them.
Redistributable, free-to-use weights do not make a model open source (even if the release is really nice, given that very few people have access to that kind of training power).
I'm not very plugged into how to use these models, but I do love and pay for both ChatGPT and GitHub Copilot. How does one take a model like this (or a smaller version) and leverage it in VS Code? There's a dizzying array of GPT-wrapper extensions for VS Code, many of which either seem like junk (10 downloads, no updates in a year) or just lead to another paid plan, at which point I might as well keep my GH Copilot. Curious what others are doing here for Copilot-esque code completion without Copilot.
Curious what's the current SOTA local copilot model? Are there any extensions in vscode that give you a similar experience? I'd love something more powerful than copilot for local use (I have a 4090, so I should be able to run a decent number of models).
This is a completely fair but open question. Not to be a typical HN user, but when you say SOTA local, the real question is which benchmarks you care about for evaluation: size, operability, complexity, explainability, etc.
Working out which copilot models perform best has been a deep exercise for me; it has really made me examine my own coding style, what I find important, and what I look for when investigating models (and when evaluating interview candidates).
I think the three benchmarks & leaderboards most people go to are:
https://huggingface.co/spaces/bigcode/bigcode-models-leaderb... - the most widely understood, broad-language-capability leaderboard, relying on well-understood evaluations and benchmarks.
https://huggingface.co/spaces/mike-ravkine/can-ai-code-resul... - also comprehensive, but primarily assesses Python and JavaScript.
https://evalplus.github.io/leaderboard.html - which I think is a better take for comparing models you intend to run locally, as you can evaluate performance, operability and size in one visualisation.
Best of luck and I would love to know which models & benchmarks you choose and why.
When this 70B model gets quantized you should be able to run it fine on your 4090. Check out 'TheBloke' on Hugging Face for quantized weights, and use llama.cpp to run the GGUF files.
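A rough back-of-envelope for what actually fits in a 4090's 24 GB of VRAM (the bits-per-weight figures are assumptions for common GGUF quant types, and KV cache and runtime overhead are ignored):

```python
def quantized_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in GB (ignores KV cache and overhead)."""
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Approximate bits/weight for common GGUF quant types (assumed figures)
print(f"70B @ ~4.8 bpw (Q4_K_M-ish): ~{quantized_size_gb(70, 4.8):.0f} GB")
print(f"70B @ ~2.6 bpw (Q2_K-ish):   ~{quantized_size_gb(70, 2.6):.0f} GB")
print(f"34B @ ~4.8 bpw:              ~{quantized_size_gb(34, 4.8):.0f} GB")
```

So at ~4-5 bits/weight a 70B still overflows a single 24 GB card; llama.cpp's `--n-gpu-layers` option lets you keep the remaining layers on the CPU, and the 34B variants fit more comfortably.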
This looks potentially interesting if it can be run locally on, say, an M2 Max or similar, and if there's an IDE plugin to do the Copilot thing.
Anything that saves me time writing “boilerplate” or figuring out the boring problems on projects is welcome - so I can expend the organic compute cycles on solving the more difficult software engineering tasks :)
It's aimed at OpenAI's moat. Making sure they don't accumulate too much of one. No one actually has to use this, it just needs to be clear that LLM as a service won't be super high margin because competition can simply start building on Meta's open source releases.
This is targeted at GPU rental services like RunPod, as well as API providers such as Together AI. Together.ai charges $0.90/1M tokens for 70B models. https://www.together.ai/pricing
There are companies like Phind that offer copilot-like services using finetuned versions of CodeLlama-34B, which imo are actually good. But I don't know whether a model this large will be used in that context.
Meta doesn't have an AI "product" competing with OpenAI, Google's Bard, etc. But they use AI extensively internally. This is roughly a byproduct of their internal AI work that they're already doing, and fostering open source AI development puts incredible pressure on the AI products and their owners.
If Meta can help prevent there from being an AI monopoly company, but rather an ecosystem of comparable products, then they avoid having another threatening tech giant competitor, as well as preventing their own AI work and products from being devalued.
They're commoditizing the ability to generate viral content, which is the carrot that keeps people's eyeballs on the hedonic treadmill. More eyeball-time = more ad placements = more money.
On the advertiser side, they're commoditizing the ability for companies to write more persuasively-targeted ads. Higher click-through rates = more money.
[edit]: For models that generate code instead of content (TFA), it's obviously a different story. I don't have a good grip on that story, beyond "they're using their otherwise-idle GPU farms to buy goodwill and innovate on training methods".
AI seems like the Next Big Thing. Meta have put themselves at the center of the most exciting growth area in technology by releasing models they have trained.
They've gained an incredible amount of influence and mindshare.
If they hadn't opened the models, the Llama series would just be a few sub-GPT-4 models. Opening the models has created a wealth of development that has built upon them.
Alone, it was unlikely they would become a major player in a field that might be massively important. With a large community building upon their base they have a chance to influence the direction of development and possibly prevent a proprietary monopoly in the hands of another company.
My opinion is that Meta is taking the model out of the secret-sauce formula. That leaves hardware and data for training as the barriers to entry. If you don't need to develop your own model, then all you need is data and hardware, which lowers the barrier to entry. The lower the barrier, the more GenAI startups, and the more potential data customers for Meta, since they certainly have large, curated datasets for sale.
I think a big part of it is just that they have a big AI lab. I don't know the genesis of it, but it has for years been a big contributor: see PyTorch, models like SEER, and their place as one of the dominant publishers at big conferences.
Maybe now their leadership wants to push for practicality so they don't end up like Google (also a research powerhouse but failing to convert to popular advances) so they are publicly pushing strong LLMs.
Meta's end goal is to have better AI than everyone else; in the medium term that means they want to have the best foundational models. How does this help?
1. They become an attractive place for AI researchers to work, and can bring in better staff.
2. They make it less appealing for startups to enter the space and build large foundation models (Meta would prefer 1,000 startups pop up and play around with other people's models than 1,000 startups popping up and trying to build better foundational models).
3. They put cost pressure on AI-as-a-service providers. When Llama exists, it's harder for companies to make a profit just selling access to models. Along with 2, this further limits the possibility of startups entering the foundational model space, because the path to monetization/breakeven is more difficult.
Essentially this puts Meta, Google, and OpenAI/Microsoft (Anthropic/Amazon as a number four maybe) as the only real players in the cutting edge foundational model space. Worst case scenario they maintain their place in the current tech hegemony as newcomers are blocked from competing.
Aside from the "positive" explanations offered in the sibling comments, there's also a "negative" one: other AI companies that try to enter the fray will not be able to compete with Meta's open offerings. After all, why would you pay a company to undertake R&D on building their own models when you can just finetune a Llama?
Facebook went all in on the metaverse and turned into Meta; quite rightly, the market looked at what they produced for tens of billions and decided their company was worthless.
Then AI sprang to the front pages, and any CEO who stood up and said "AI" was rewarded with a 10x stock price. The unloved stepchild that was the ML team became the A team, and the metaverse team has been sent to the naughty step. Facebook/Meta have no actual customer-facing use for AI, unlike Microsoft/Google/GitHub, but they like a good stonk price rise, and so what we see is their strategy to stay in the AI game and stay relevant.
It turns out it is pretty good for the rest of us (possibly the first time Facebook has given something positive to humanity), as we get shiny toys to play with.
Part of it is that they already had this developed for years (see alt text on uploaded images for example), and they want to ensure that new regulations don't hamper any of their future plans.
It costs them nothing to open it up, so why not. Kinda like all the rest of their GitHub repos.
Meta still sit on all the juicy user data that they want to use AI on but they don’t know how. They are crowdsourcing development of applications and tooling.
Meta releases a model. Joe builds a cool app with it, earns some internet points and, if lucky, a few hundred bucks. Meta copies the app, multiplies Joe's success story across 1 billion users, and earns a few million bucks.
Meta sees this as the way to improve their AI offerings faster than others and, eventually, better than others.
Instead of a small group of engineers working on this inside Meta, the Open Source community helps improve it.
They have a history of this with React, PyTorch, HHVM, etc. All of these have gotten better as open source projects, faster than Meta alone would have managed.
Any good resources or suggestions for a system/pre-prompt for general coding, or when targeting a specific language? I.e., when using CodeLlama and working in TypeScript, Ruby, Rust, Elixir, etc., is there a universal prompt that gives good results, or would you want to adjust the prompt depending on the language you're targeting?
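Not aware of a universal prompt, but for CodeLlama's instruct variants the wrapper format itself is fixed (the Llama 2 chat template), so one option is a shared template with a per-language system message swapped in. A minimal sketch, where the template tokens follow the published format but the prompt wording is just an illustrative guess, not a tested recommendation:

```python
# Hypothetical per-language system prompts; tune these to taste
SYSTEM_BY_LANG = {
    "typescript": "You are an expert TypeScript engineer. Prefer strict typing and idiomatic code.",
    "rust": "You are an expert Rust engineer. Prefer safe, idiomatic Rust and avoid unwrap() in examples.",
}

def instruct_prompt(lang: str, user_msg: str) -> str:
    """Wrap a request in the Llama-2-style chat template that CodeLlama instruct models expect."""
    system = SYSTEM_BY_LANG.get(lang, "You are an expert programmer.")
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user_msg} [/INST]"

print(instruct_prompt("rust", "Write a function that parses a semver string."))
```

The base (non-instruct) completion models don't use this template at all; for those, the prompt is just code context, so the per-language tuning happens in comments and surrounding code instead.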
Can anyone tell me what kind of hardware setup would be needed to fine-tune something like this? Would you need a cluster of GPUs? What cluster size and GPU spec do you think is reasonable (e.g. in terms of VRAM per GPU)?
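As a rough sizing sketch using standard rules of thumb (not measured numbers): full fine-tuning with mixed-precision AdamW costs on the order of 16 bytes per parameter, while QLoRA keeps the base model frozen in 4-bit and only trains small adapters:

```python
def full_finetune_vram_gb(n_params_billion: float) -> float:
    """~16 bytes/param: fp16 weights (2) + grads (2) + fp32 master weights and Adam moments (12)."""
    return n_params_billion * 16

def qlora_base_vram_gb(n_params_billion: float) -> float:
    """Frozen 4-bit base weights; LoRA adapters and their optimizer state are comparatively tiny."""
    return n_params_billion * 4 / 8

print(f"70B full fine-tune: ~{full_finetune_vram_gb(70):.0f} GB (plus activations)")
print(f"70B QLoRA:          ~{qlora_base_vram_gb(70):.0f} GB base (plus adapters/activations)")
```

So a full fine-tune of a 70B means a multi-node cluster with FSDP/DeepSpeed-style sharding across many 80 GB GPUs, while a QLoRA fine-tune is plausible on one or two 80 GB cards (or a 24 GB card for the 7B/13B variants).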
Can anyone explain why big tech companies are racing to release open source models?
If a model is free and open source, how will they earn money, and how will they compete with others?
mvkel | 2 years ago
Not to diminish the value of the contribution, but "commitment" is an interesting word choice.
martingoodson | 2 years ago
I highly recommend watching it.
pandominium | 2 years ago
I think Copilot is already heavily subsidized by Microsoft.
Let's say you use Copilot around 30% of your daily work hours. How many kWh does an open source 7B or 13B model then use in a month on one 4090?
EDIT: I think for a 13B at 30% use per day it comes to around $30/mo on the energy bill. So an even smaller but still capable model could probably beat the Copilot monthly subscription.
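For what it's worth, here is one set of assumptions that lands near that figure (all three numbers are guesses: full ~450 W draw, "30% per day" read against a 24-hour day, $0.30/kWh):

```python
power_kw = 0.45                   # RTX 4090 near full load (assumption)
hours_per_month = 24 * 0.30 * 30  # 30% duty cycle over a 30-day month (assumption)
usd_per_kwh = 0.30                # electricity price (assumption; varies widely by region)

cost = power_kw * hours_per_month * usd_per_kwh
print(f"~${cost:.0f}/month")  # → ~$29/month
```

With a lighter duty cycle or a power-limited card the number drops fast, which is how a small local model can undercut a paid subscription.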
Havoc | 2 years ago
Cool nonetheless
bk146 | 2 years ago
(Please don't say "commoditize your complement" without explaining what exactly they're commoditizing...)
pchristensen | 2 years ago
Think of it like Google releasing a web browser.
Too | 2 years ago
Joe is happy, Meta is happy. Everybody is happy.
theGnuMe | 2 years ago
Essentially, you mitigate IP claims and reduce vendor dependency.
https://eightify.app/summary/technology-and-software/the-imp...