I appreciate the honesty in the marketing materials. Showing the product scoring below the market leader in a big benchmark is better than the Google way of cherry-picking benchmarks.
It sounds like they're trying to make clear they aren't stepping on ChatGPT's (OpenAI's) toes.
edit: not sure why I am being downvoted. I am 100% sure the way they structured it was meant to say "we are doing great, but not as great as OpenAI's work, which we are not trying to compete against". I guarantee there were discussions on how to phrase it so as not to appear that way.
Very nice! I know they've already done a lot, but I would've liked some language in there reaffirming a commitment to contributing to the open source community. I had thought that was a major part of their brand.
I've been staying tuned[0] since the miqu[1] debacle thinking that more open weights were on the horizon. I guess we'll just have to wait and see.
API endpoints: We renamed 3 API endpoints and added 2 model endpoints.
open-mistral-7b (aka mistral-tiny-2312): renamed from mistral-tiny. The endpoint mistral-tiny will be deprecated in three months.
open-mixtral-8x7b (aka mistral-small-2312): renamed from mistral-small. The endpoint mistral-small will be deprecated in three months.
mistral-small-latest (aka mistral-small-2402): new model.
mistral-medium-latest (aka mistral-medium-2312): old model. The previous mistral-medium has been dated and tagged as mistral-medium-2312. The endpoint mistral-medium will be deprecated in three months.
mistral-large-latest (aka mistral-large-2402): our new flagship model with leading performance.
New API capabilities:
Function calling: available for Mistral Small and Mistral Large.
JSON mode: available for Mistral Small and Mistral Large.
La Plateforme:
We added multiple currency support to the payment system, including the option to pay in US dollars.
We introduced enterprise platform features including admin management, which allows users to manage individuals from your organization.
Le Chat:
We introduced the brand new chat interface Le Chat to easily interact with Mistral models.
You can currently interact with three models: Mistral Large, Mistral Next, and Mistral Small.
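To illustrate the new JSON mode, here's a minimal sketch of a request against the chat completions endpoint. The `response_format: {"type": "json_object"}` field follows the OpenAI-compatible convention; treat the exact model name and prompt as illustrative, and check the official docs for the authoritative parameter names.

```python
import json
import os
import urllib.request

API_URL = "https://api.mistral.ai/v1/chat/completions"


def build_json_mode_request(model: str, prompt: str) -> dict:
    """Build a chat-completion payload that asks the model for strict JSON output."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # JSON mode: constrains the reply to be valid JSON.
        "response_format": {"type": "json_object"},
    }


def send(payload: dict, api_key: str) -> dict:
    """POST the payload with a bearer token and return the decoded response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_json_mode_request(
        "mistral-small-latest",
        "List three EU capitals as a JSON array under the key 'capitals'.",
    )
    key = os.environ.get("MISTRAL_API_KEY")
    if key:  # only hit the network when a key is configured
        print(send(payload, key))
    else:
        print(json.dumps(payload, indent=2))
```

Function calling works over the same endpoint by passing a `tools` list alongside the messages.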
Just tried Le Chat for some coding issues I had today that ChatGPT (with GPT-4) wasn't able to solve, and Le Chat actually gave way better answers. Not sure if ChatGPT quality has gone down to save costs as some people suggest, but for these few problems the quality of the answers was significantly better for Mistral.
I just did a 1:1 copy of some of my ChatGPT chats with Mistral Large (always posting the same questions), and while it is really, really good, it's still not as good as GPT4.
I feel like ChatGPT has a better way of figuring out what I want to know and provides better examples.
I also preferred GPT4's code.
Le Chat also has some usability issues, like a font that's too thin and contrast that's too high in dark mode.
But overall, I could live with it should ChatGPT go offline.
I might as well be hallucinating, but my personal experience is that GPT-4 got successively worse than it was at launch, at least for general things. Nowadays it just refuses to answer a lot of things and has lost the ability to do holistic "reasoning" (bridging knowledge from different areas).
Interesting, I didn’t know they had Le Chat. I’ve been wanting a ChatGPT competitor from Mistral. Also love the fact they put “le” in front of their products.
I would assume that the advantage (for Mistral) here is Microsoft paying them money to be the exclusive model hosting partner, so that everyone has to go to Azure to get top-tier hosted models.
Au contraire, I think in the eyes of beige khaki corpo bureaucrats this gives Mixtral legitimacy and puts it on par with OpenAI offerings. MS putting their Azure stamp on this means it's Safe and Secure (tm).
It makes even more sense from MS's perspective -- now they can offer two competing models on their own infra, becoming the de facto shop for large corporate LLM clients.
Say that you are building a b2b product that uses LLMs for whatever. A common question that users will ask is whether their data is safe and who else has access. Everyone is afraid of AI training on their data. Saying that Microsoft is the only one that touches your customer’s data is an important part of your sales pitch. No one outside of tech knows who Mistral is.
Wow, this is like if multiple interchangeable CPU architectures existed or something. Every time a new LLM gets released I’m so excited about how much better things will be with so many fewer monopolies.
Even without an open source model, I think OpenAI has already achieved its mission.
I'm not sure if anyone cares about my opinion, but I think it's worth mentioning that of all the models, Mixtral is IMO the best, and I do not know what I'd do without it.
I've tried a bunch of models both online and offline, and Mixtral is the first one which actively has me reaching for it instead of Google when I'm wondering about something. I also love how well it works locally with ollama.
I still sometimes need to double-check its answers and be critical of its responses. But when I want to confirm the answer I suspect, or know the gist of it but want more details, I find it invaluable.
It seems especially strong in areas of science and computing. However, it consistently gives plausible but incorrect information when asked about Swedish art and culture. Though it does speak really good Swedish!
Would you feel comfortable sharing your use case? Also, what makes Mistral a better fit for your use? Is it finetuning cost, operational cost, response times, etc.?
I do not have an opportunity to explore these models in my job; hence my curiosity.
It is very nice to see the possibility of self-deployment. Does anyone have experience with self-deployment of such a large model in a company setting?
It's a really tough sell. They are charging 80% of GPT-4's price while scoring below it in the benchmark. I will only use the overall best model, the best open-weights model, or the cheapest one that can do the task, and this is none of the three in almost any scenario.
That’s a sure way to end up with a global monopoly and no competitive open models. Things like Mixtral on the open side rely on companies like Mistral existing.
I haven't been able to get a great answer as to why OpenAI is consistently leading the pack. What could they possibly be doing differently? I can't imagine they've invented a technique that nobody else can reach at this point.
My guess is OpenAI spent the most human hours fine-tuning the model; other companies are running into problems and trying to deal with them, whereas OpenAI learned those lessons a long time ago.
Some startups are going to achieve trillion-dollar market caps this decade, I expect.
The resources used are going to be incomparable to anything before.
And ten trillion next decade I predict. General intelligence is the “last” technology we will ever need, in the sense that it will subsume all other technological progress.
Seems intentionally misleading.
[0]: https://twitter.com/arthurmensch/status/1752737462663684344 [1]: https://huggingface.co/miqudev/miqu-1-70b/discussions/10
Feb. 26, 2024
[1]: https://docs.mistral.ai/platform/changelog/
I presume most young Francophones who are likely to actually use Mistral will pronounce it in Franglais as "le tchatte".
Fantastic news, thank you.
It would be useful if there was an indication of which models are embedding models.
https://docs.mistral.ai/platform/endpoints/#benchmarks-resul...
input: $8/1M tokens
output: $24/1M tokens
https://docs.mistral.ai/platform/pricing/
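At those rates, per-request cost is simple arithmetic. A quick sketch using the prices quoted above (token counts are illustrative):

```python
# Mistral Large pricing quoted above: $8 per 1M input tokens, $24 per 1M output tokens.
INPUT_RATE = 8.0 / 1_000_000    # dollars per input token
OUTPUT_RATE = 24.0 / 1_000_000  # dollars per output token


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single API call at the quoted rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE


# e.g. a 2,000-token prompt with a 500-token reply:
print(f"${request_cost(2_000, 500):.4f}")  # → $0.0280
```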
[+] [-] o_____________o|2 years ago|reply
https://openai.com/pricing
I wonder whether witnessing the space race felt similar. It's just that now we have more players and the effort is much more decentralized.
And maybe the amount of resources used is comparable too.