Mistral Large | WingNews

[+] speedgoose|2 years ago|reply

I appreciate the honesty in the marketing materials. Showing the product scoring below the market leader in a big benchmark is better than the Google way of cherry picking benchmarks.

[+] onlyrealcuzzo|2 years ago|reply

They compare to Gemini Pro 1.0...

Seems intentionally misleading.

[+] autokad|2 years ago|reply

it sounds like they are trying to be clear they aren't stepping on chatgpt's (openai) toes

edit: not sure why I am being downvoted. I am 100% sure the way they structured it was meant to say "we are doing great, but not as great as openAI's work, which we are not trying to compete against". I guarantee there were discussions on how to make it look as to not appear that way.

[+] IncreasePosts|2 years ago|reply

[deleted]

[+] d-z-m|2 years ago|reply

Very nice! I know they've already done a lot, but I would've liked some language in there re-affirming a commitment to contributing to the open source community. I had thought that was a major part of their brand.

I've been staying tuned[0] since the miqu[1] debacle thinking that more open weights were on the horizon. I guess we'll just have to wait and see.

[0]: https://twitter.com/arthurmensch/status/1752737462663684344 [1]: https://huggingface.co/miqudev/miqu-1-70b/discussions/10

[+] ambigious7777|2 years ago|reply

I feel like that Mistral is removing the commitment to OSS from their branding, and their company culture in general.

[+] WiSaGaN|2 years ago|reply

Changelog is also updated: [1]

Feb. 26, 2024

API endpoints: We renamed 3 API endpoints and added 2 model endpoints.

open-mistral-7b (aka mistral-tiny-2312): renamed from mistral-tiny. The endpoint mistral-tiny will be deprecated in three months.

open-mixtral-8x7B (aka mistral-small-2312): renamed from mistral-small. The endpoint mistral-small will be deprecated in three months.

mistral-small-latest (aka mistral-small-2402): new model.

mistral-medium-latest (aka mistral-medium-2312): old model. The previous mistral-medium has been dated and tagged as mistral-medium-2312. The endpoint mistral-medium will be deprecated in three months.

mistral-large-latest (aka mistral-large-2402): our new flagship model with leading performance.

New API capabilities:

Function calling: available for Mistral Small and Mistral Large. JSON mode: available for Mistral Small and Mistral Large

La Plateforme:

We added multiple currency support to the payment system, including the option to pay in US dollars. We introduced enterprise platform features including admin management, which allows users to manage individuals from your organization.

Le Chat:

We introduced the brand new chat interface Le Chat to easily interact with Mistral models.

You can currently interact with three models: Mistral Large, Mistral Next, and Mistral Small.

[1]: https://docs.mistral.ai/platform/changelog/

[+] simonw|2 years ago|reply

I just added support for the new models to my https://github.com/simonw/llm-mistral plugin for my LLM CLI tool. You can now do this:

    pipx install llm
    llm install llm-mistral
    llm keys set mistral
    < paste your API key here >
    llm -m mistral-large 'prompt goes here'

[+] ComputerGuru|2 years ago|reply

does `llm install llm-mistral` also upgrade if already installed?

[+] city17|2 years ago|reply

Just tried Le Chat for some coding issues I had today that ChatGPT (with GPT-4) wasn't able to solve, and Le Chat actually gave way better answers. Not sure if ChatGPT quality has gone down to save costs as some people suggest, but for these few problems the quality of the answers was significantly better for Mistral.

[+] qwertox|2 years ago|reply

I just did a 1:1 copy of some of my ChatGPT chats with Mistral Large (always posting the same questions), and while it is really, really good, it's still not as good as GPT4.

I feel like ChatGPT has a better way of figuring out what I want to know and provides better examples.

I also preferred GPT4's code.

Then Le Chat has some usability issues, like a too thin font and a too high contrast in dark mode.

But overall, I could live with it should ChatGPT go offline.

[+] lobocinza|2 years ago|reply

I might as well be hallucinating but my personal experience is that GPT-4 got sucessively worse than what it was at launch date at least for general things. Nowadays it just refuse to answer a lot of things and lost the ability to do holistic "reasoning" (bridging knowledge from different areas).

[+] syntaxing|2 years ago|reply

Interesting, I didn’t know they had le chat. I’ve been wanting a chatgpt competitor with mistral. Also love the fact they put “le” in front of their products

[+] loudmax|2 years ago|reply

Cute, but "le chat" literally means "the cat".

I presume most young Francophones who are likely to actually use Mistral will pronounce it in Franglais as "le tchatte".

[+] moffkalast|2 years ago|reply

Reminds me of the rage comics of old.

[+] ot|2 years ago|reply

"Big Mac's a Big Mac, but they call it Le Big Mac"

[+] breakingcups|2 years ago|reply

So, all this hubbub about open weights is already over? It will remain closed?

[+] binarymax|2 years ago|reply

On Azure, it's slightly cheaper than GPT-4.

Per 1000 tokens:

    GPT-4   input:  $0.01
    Mistral input:  $0.008
    GPT-4   output: $0.03
    Mistral output: $0.024

[+] whazor|2 years ago|reply

But there is also GPT-4 turbo

[+] bicepjai|2 years ago|reply

Why is there no information about size of parameters anywhere ? Was that intentional or am I missing something

[+] Larok00|2 years ago|reply

There is not a lot of advantage to releasing this on Azure where you are directly competing with GPT-4, which will beat you on most tasks.

[+] hobofan|2 years ago|reply

I would assume that the advantage (for Mistal) here is Microsoft paying them money to be the exclusive model hosting partner, so that everyone has to go to Azure to get top-tier hosted models.

[+] btbuildem|2 years ago|reply

Au contraire, I think in the eyes of beige khaki corpo bureaucrats this gives Mixtral legitimacy and puts it on par with OpenAI offerings. MS putting their Azure stamp on this means it's Safe and Secure (tm).

It makes even more sense from MS perspective -- now they can offer two competing models on their own infra, becoming the defacto shop for large corporate LLM clients.

[+] chadash|2 years ago|reply

Say that you are building a b2b product that uses LLMs for whatever. A common question that users will ask is if their data is safe and who else has access. Everyone is afraid or AI training on their data. Saying that Microsoft is the only one that touches your customer’s data is an important part of your sales pitch. No one outside of tech knows who mistral is.

[+] CuriouslyC|2 years ago|reply

Once a LLM is "good enough" the metric people care about is cost/token, which is never going to be in GPT4's favor.

[+] btbuildem|2 years ago|reply

I've been using Mistral over GPT lately, because it refuses my requests far less often.

[+] unknown|2 years ago|reply

[deleted]

[+] tomkaos|2 years ago|reply

You probably get a better control on what a happen with the data. I have the impression that GTP-4 is more of a black box and privacy problem.

[+] neutralino1|2 years ago|reply

Price is the advantage.

[+] raverbashing|2 years ago|reply

Depends on pricing

[+] randall|2 years ago|reply

Wow this is like if multiple interchangeable cpu architectures existed or something. Every time a new llm gets released I’m so excited about how much better things will be with so many fewer monopolies.

Even without an open source model I think open AI has already achieved its mission.

[+] ionwake|2 years ago|reply

Im not sure if anyone cares about my opinion, but I think its worth mentioning that of all the models, Mixtral is IMO the best, and I do not know what Id do without it.

Fantastic news, thank you.

[+] Agentlien|2 years ago|reply

I've tried a bunch of models both online and offline and mixtral is the first one which avtively has me reaching for it instead of Google when I'm wondering about something. I also love how well it works locally with ollama.

I still sometimes need to double-check its answers and be critical of its responses. But when I want to confirm the answer I suspect, or know the gist of it but want more details, I find it invaluable.

It seems especially really strong in areas of science and computing. However, it consistently gives plausible but incorrect information when asked about Swedish art and culture. Though it does speak really good Swedish!

[+] manishsharan|2 years ago|reply

Would you feel comfortable sharing your use case ? Also what make Mistral a better fit for your use ? Is it finetuning cost, operational cost, response times etc. ?

I do not have an opportunity to explore these models in my job; hence my curiosity.

[+] martinesko36|2 years ago|reply

Doesn't look like it's open source/weights?

[+] pama|2 years ago|reply

It is very nice to see the possibility of self deployment. Does anyone have experience with self deployment of such a large model in a company setting?

[+] YetAnotherNick|2 years ago|reply

It's a really tough sell. They are charging 80% of GPT 4, and are below in the benchmark. I will only use overall best model or the best open weights model or the cheapest which could do the task. And it's none of the three in almost any scenario.

[+] Havoc|2 years ago|reply

That’s a sure way to end up with a global monopoly and no competitive open models. Things like mixtral on open side rely on companies like mistral existing.

[+] RohMin|2 years ago|reply

I haven't been able to get a great answer regarding why OpenAI is consistently leading the pack. What could they possibly be doing different? I can't imagine they've invented a technique that nobody else can reach at this point

[+] autokad|2 years ago|reply

my guess is openai spent the most human hours fine tuning the model, and other companies are running into problems and trying to deal with them whereas openai already learned those lessons a long time ago

[+] BryanLegend|2 years ago|reply

There's a network effect in that they are used more so they've generated more feedback from users, which is then used to improve GPT.

[+] simonw|2 years ago|reply

Feature request for Mistral API maintainers: the https://api.mistral.ai/v1/models API endpoint returns all of the language models and mistral-embed as well, but there's currently nothing in the JSON to help distinguish that embedding models from the others: https://github.com/simonw/llm-mistral/issues/5#issuecomment-...

It would be useful if there was an indication of which models are embedding models.

[+] thorum|2 years ago|reply

Full benchmarks vs other Mistral models:

https://docs.mistral.ai/platform/endpoints/#benchmarks-resul...

[+] convexstrictly|2 years ago|reply

Pricing

input: $8/1M tokens

output: $24/1M tokens

https://docs.mistral.ai/platform/pricing/

[+] o_____________o|2 years ago|reply

Compared to GPT4, which is $10/$30 for turbo and $30/$60 for the flagship

https://openai.com/pricing

[+] unknown|2 years ago|reply

[deleted]

[+] unknown|2 years ago|reply

[deleted]

[+] rpozarickij|2 years ago|reply

I can't stop finding such intense competition between the world's top experts in a single area truly fascinating.

I wonder whether witnessing the space race felt similar. It's just that now we have more players and the effort is much more decentralized.

And maybe the amount of resources used is comparable too..

[+] Nevermark|2 years ago|reply

some startups are going to achieve trillion dollar market caps this decade I expect.

The resources used are going to be incomparable to anything before.

And ten trillion next decade I predict. General intelligence is the “last” technology we will ever need, in the sense that it will subsume all other technological progress.

267 comments