top | item 46777339

bertili | 1 month ago

The "DeepSeek moment" was just one year ago today!

Coincidence or not, let's just marvel for a second at the amount of magic/technology that's being given away for free... and at how liberating this is, and how different from OpenAI and others that stayed closed to "protect us all".

segmondy|1 month ago

There have been so many moments that folks not heavily into LLMs have missed. DeepSeek R1 was great, but so were all the "incremental" improvements: v3-0324, v3.1, v3.1-terminus, and now v3.2-speciale. On top of that, this is the 3rd great Kimi model, and GLM has been awesome since 4.5, with 4.5, 4.5-air, 4.6, 4.7 and now 4.7 flash. Minimax-M2 has also been making waves lately... and I'm just talking about the Chinese models, without adding the 10+ Qwen models. Outside of Chinese models, mistral-small/devstral, gemma-27b-it, gpt-oss-120b, and seed-os have been great, and I'm still talking about just LLMs, not image, audio or special-domain models like deepseek-prover and deepseek-math. It's really a marvel what we have at home. I cancelled my OpenAI and Anthropic subscriptions 2 years ago, once they started calling for regulation of open models, and I haven't missed them one bit.

igravious|1 month ago

What's your hardware/software setup?

jimmydoe|1 month ago

It’s not a coincidence. Chinese companies tend to do big releases before Chinese New Year, so expect more to come before Feb 17.

motoboi|1 month ago

What amazes me is why would someone spend millions to train this model and give it away for free. What is the business here?

whizzter|1 month ago

Perhaps the Chinese state sees open collaboration as the way to nullify any US lead in the field. At the same time, if the next "search winner" is built upon their model, it will carry the Chinese worldview that Taiwan belongs to China and that the Tiananmen Square massacre never happened.

Also, their license says that if you have a big product you need to promote them. Remember how Google "gave away" site search widgets? That was perhaps one of the major ways they gained recognition as the search leader.

OpenAI/Nvidia are the Pets.com/Sun of our generation: insane valuations, stupid spend, expensive options, expensive hardware and so on.

Sun hardware bought for 50k USD to run websites in 2000 is perhaps less capable than a 5 dollar/month VPS today.

"Scaling to AGI/ASI" was always a fool's errand. In the best case, OpenAI should've squirreled away money for a solid engineering department that could focus on algorithmic innovations, but considering that Anthropic, Google and Chinese firms have caught up with or surpassed them, it seems they didn't.

Once things blow up, what will be left are those closed options that had somewhat sane/solid model research that handles things better, plus a ton of new competitors running modern/cheaper hardware and just using models as building blocks.

Balinares|1 month ago

Speculating: there are two connected businesses here: creating the models, and serving the models. Outside of a few moneyed outliers, no one is going to run this at home. So at worst, opening this model allows mid-sized competitors to serve it to customers from their own infra -- which helps Kimi gain mindshare, particularly against the large incumbents, who are definitely not going to be serving Kimi and so don't benefit from its openness.

Given the shallowness of moats in the LLM market, optimizing for mindshare would not be the worst move.

tokioyoyo|1 month ago

Moonshot’s (Kimi’s owner) investors are Alibaba/Tencent et al. The Chinese market is stupidly competitive, and there’s a general attitude of “the household name will take it all”. However, getting there requires having a WeChat-esque user base, one way or another. If it’s paid, there’ll be friction and it won’t work. Plus, it undermines a lot of other companies, which is a win for a lot of people.

ggdG|1 month ago

I think this fits into some "Commoditize The Complement" strategy.

https://gwern.net/complement

deskamess|1 month ago

I think there is a book (Chip War) about how the USSR failed to stay at the edge of the semiconductor revolution, and has suffered for it.

China has decided they are going to participate in the LLM/AGI/etc revolution at any cost. So it is a sunk cost, and the models are just an end product; any revenue is validation and great, but not essential. The cheaper price points keep their models used and relevant. That challenges the other (US, EU) models to innovate and stay ahead to justify their higher valuations (both monthly plan prices and investor valuations). Once those advances are made, they can be brought back to their own models. In effect, the currently leading models are running from a second-place competitor who never gets tired and eventually does what they do at a lower price point.

culi|1 month ago

All economically transformative technologies have done something similar. If it's privatized, it's not gonna be transformative across the industry. GPS, the internet, touchscreens, AI voice assistants, microchips, LCDs, etc. were all publicly funded (or made by Bell Labs, which had a state-mandated monopoly that forced it to open up its patents).

The economist Mariana Mazzucato wrote a great book about this called The Entrepreneurial State: Debunking Public vs. Private Sector Myths.

overfeed|1 month ago

> What amazes me is why would someone spend millions to train this model and give it away for free. What is the business here?

How many millions did Google spend on Android (acquisition and salaries), only to give it away for free?

Usually, companies do this to break into a monopolized market (or one that's at risk of becoming one), with openness as a sweetener. IBM with Linux to break UNIX-on-big-iron domination, Google with Android vs. iPhone, Sun with OpenSolaris vs. Linux-on-x86.

YetAnotherNick|1 month ago

Hosting the model gets cheaper per token the more tokens you can batch together, so they have a big advantage here.
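
To put rough numbers on the batching point, here is a minimal sketch. It assumes a simplified cost model that is not from this thread: each decode step has a fixed cost (mostly reading the weights from memory) plus a small per-sequence cost, and every sequence in the batch gets one token per step. The function name and the millisecond figures are made up for illustration, not measurements of any real deployment.

```python
def cost_per_token(batch_size: int,
                   fixed_step_ms: float = 30.0,
                   per_seq_ms: float = 0.2) -> float:
    """Hypothetical GPU milliseconds spent per generated token.

    fixed_step_ms and per_seq_ms are illustrative assumptions,
    not measured values.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # One decode step serves every sequence in the batch at once.
    step_ms = fixed_step_ms + per_seq_ms * batch_size
    # Each sequence receives one token per step, so split the step cost.
    return step_ms / batch_size


if __name__ == "__main__":
    for b in (1, 8, 64, 256):
        print(f"batch={b:>3}  ms/token={cost_per_token(b):.3f}")
```

Under these assumptions the fixed cost is spread over more tokens as the batch grows, so whoever has the most traffic to batch pays the least per token, until the per-sequence term starts to dominate.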

testfrequency|1 month ago

Curious to hear what “OpenAI” thinks the answer to this is

WarmWash|1 month ago

It's another state project funded at the discretion of the party.

If you look at past state projects, profitability wasn't really considered much. They are notorious for a "money hose until a diamond is found in the mountains of waste" approach.

PlatoIsADisease|1 month ago

I am convinced that was mostly just marketing. No one uses DeepSeek as far as I can tell. People are not running it locally. People choose GPT/Gemini/Claude/Grok if they are giving their data away anyway.

The biggest source for my conspiracy theory is that I made a Reddit thread asking a question, "Why all the DeepSeek hype?" or something like that. To this day, I get odd 'pro-DeepSeek' comments from accounts only used every few months. It's not like this was some highly upvoted topic that is in the 'Top'.

I'd put that DeepSeek marketing on par with an Apple marketing campaign.

logicprog|1 month ago

I don't use DeepSeek, but I prefer Kimi and GLM to closed models for most of my work.

mekpro|1 month ago

Except that on OpenRouter, DeepSeek consistently stays in the top 10 ranking. Although I have not used it personally, I believe its main advantage over other models is price/performance.

catigula|1 month ago

I mean, there are credible safety issues here. A Kimi fine-tune will absolutely be able to help people carry out cybersecurity-related attacks, and very good ones.

In a few years, or less, biological attacks and other sorts of attacks will be plausible with the help of these agents.

Chinese companies aren't humanitarian endeavors.

cindyllm|1 month ago

[deleted]