I have been very impressed with the Qwen3 series. I'm still evaluating them, and I generally take LLM benchmarks with a huge grain of salt, but their MoE models in particular seem to offer a lot of bang for the compute. But what makes you so sure they will take the lead?
DeepSeek, Qwen, and GLM (quite good) are all open and available for local use, which definitely puts them ahead in that space. It means a lot of the tinkerers and younger people learning to do things like train and fine-tune models are getting good with Chinese models, and I do think getting in early like that is a great way to gain mindshare. Look at how Apple and Microsoft did everything they could early on to get their machines and software into schools.
Isn't this an indication that they are already in the lead? They currently have the best model, one that beats everyone on all quantitative metrics. Are you implying that the US has a better model somewhere?
They aren't in the lead. They are very close behind, but that's not hard given the quantity of freely published papers. They keep proving they can train models competitive with US models, but only months after the fact. And at least some of the Chinese models were trained via distillation from US models. Probably not at Alibaba, but it seems at least some were.
Commenters: davidsainez, greggh, ninetyninenine, mike_hearn, aeve890 (3–4 months ago)