So they didn't train a new model from scratch; they took an existing model and applied a bit of RL to it?
The scores are very close to QwQ-32B, and at the end:
"Overall, as QwQ-32B was already extensively trained with RL, it was difficult to obtain huge amounts of generalized improvement on benchmarks beyond our improvements on the training dataset. To see stronger improvements, it is likely that better base models such as the now available Qwen3, or higher quality datasets and RL environments are needed."
The interesting delta here is the proof that training can be distributed and still yield a functioning model. The pool of compute you can scale into is far bigger than any single datacenter.
Third-party fine-tuned open-weight LLMs tend to be good at a handful of benchmarks but at parity or worse on others compared to the original model. There are some exceptions, like Nvidia's Nemotron series, but the differences are generally so small as to be imperceptible. DeepSeek released fine-tunes of several Qwen and Llama models alongside R1, and while they were better in select domains (mostly math and coding), fine-tuning introduced enough problems that they never overtook the original models in actual usage.
I read an argument that proof of work needs to be useless and wasteful: if the work produced value in itself, it would make 51% attacks more economical and thus the currency less secure.
There's nothing provable here. Crypto proof of work is easily verified (does the hash of this value look the way I expect?). How do you prove in ~O(1) time that someone did some operation with their GPU? You don't. You don't even know what it is you're training: without a trained model, you have no way to know whether the model that was allegedly trained learned the thing you want it to learn.
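To make the asymmetry concrete: a hash-based proof of work is cheap to check precisely because verification is a single hash, while producing one takes exponentially many attempts. A minimal sketch (not Bitcoin's actual header format or target encoding):

```python
import hashlib

def verify_pow(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """O(1) check: hash once and compare against the target."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

def mine(header: bytes, difficulty_bits: int) -> int:
    """Expensive search: expect ~2**difficulty_bits hash attempts."""
    nonce = 0
    while not verify_pow(header, nonce, difficulty_bits):
        nonce += 1
    return nonce

nonce = mine(b"block header", 16)          # tens of thousands of hashes
assert verify_pow(b"block header", nonce, 16)  # a single hash
```

There is no analogous one-hash check that a GPU actually ran a given number of honest training steps, which is the gap the comment above is pointing at.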
The emphasis is indeed on "without trust" – as far as I can tell this project is unable to verify whether the decentralized training nodes are contributing productively.
Without the ability to validate that training compute is heading in the globally desired direction, it is unlikely you could use it as the foundation of a (sound) cryptocurrency.
There could be merit to this. Proofs are generally hard to produce but cheap to check, so it's conceivable a currency could be built around quantifying that verification.
> To stop wasting computing resources on cryptocurrencies and get something useful as a byproduct.
Bitcoin is the only major cryptocurrency that still uses proof of work today (the others have either moved to proof of stake or are "Layer 2" chains), and due to its (relative lack of) governance structure, it's very unlikely to ever change.
This is rather exciting! I can see a future of co-op models built by communities of experts in a specific field, which would let them stay competitive with the "AI monopolies". Maybe not all hope is lost!
I used to entertain a science-fiction idea that an artificial intelligence could aggregate computing power over the network to perform ultra-large-scale computation, and thereby achieve strong AI. It's very interesting that reality may actually develop in this direction.
The most interesting thing I see is the productization of the DiLoCo work done here [1]. If someone can make this scale, then we can say goodbye to expensive backend networking and mainframe-like AI training machinery.
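As I understand the DiLoCo recipe, each worker takes many cheap local optimizer steps and only an averaged parameter delta (the "pseudo-gradient") is communicated once per outer round, which is what makes fast interconnects unnecessary. A toy scalar sketch (the real recipe uses AdamW for the inner steps and Nesterov momentum for the outer step; plain SGD here for brevity):

```python
def diloco_train(shared=5.0, rounds=10, workers=4,
                 inner_steps=20, inner_lr=0.05, outer_lr=0.7):
    """Minimize f(p) = 0.5 * p**2 with workers that sync once per round.

    Real DiLoCo uses AdamW inner steps and a Nesterov-momentum outer
    step on real model parameters; this scalar toy keeps the shape only.
    """
    for _ in range(rounds):
        deltas = []
        for w in range(workers):
            local = shared
            for _ in range(inner_steps):     # cheap local steps, no network
                local -= inner_lr * local    # gradient of 0.5*p**2 is p
            deltas.append(shared - local)    # worker's "pseudo-gradient"
        outer_grad = sum(deltas) / workers   # the only communication per round
        shared -= outer_lr * outer_grad      # outer optimizer step
    return shared

assert abs(diloco_train()) < 0.05            # converged despite rare syncs
```

The point of the structure: communication happens once per `rounds` iteration rather than once per gradient step, so bandwidth requirements drop by roughly a factor of `inner_steps`.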
I wonder why they noted, seemingly in passing, a torch.compile vs. non-torch.compile figure in which torch.compile degraded model performance. What made it degrade? It appears in only one figure and nowhere else.
Personal story time: I met a couple of their engineers at an event a few months back. They mentioned they were building a distributed training system for LLMs.
I asked them how they were building it and they mentioned Python. I said something along the lines of “not to be the typical internet commenter guy, but why aren’t you using something like Rust for the distributed system parts?”
They mumbled something about Python as the base for all current LLMs, and then kinda just walked away…
From their article:
> "Rust-based orchestrator and discovery service coordinate permissionless workers"
Glad to see that I wasn't entirely off-base :)
The technical underpinning has nothing to do with the language; it's a different way of optimizing parameters, called DiLoCo. I agree, though, that Python is an abomination for systems-services componentry when languages like Rust exist.
Maybe this could be used as proof of work? To stop wasting computing resources on cryptocurrencies and get something useful as a byproduct.
./llama.cpp/llama-cli -hf unsloth/INTELLECT-2-GGUF:Q4_K_XL -ngl 99
Also, it's best to read https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-e... on sampling issues for QwQ-based models.
Or, TL;DR, use the settings below:
./llama.cpp/llama-cli -hf unsloth/INTELLECT-2-GGUF:Q4_K_XL -ngl 99 --temp 0.6 --repeat-penalty 1.1 --dry-multiplier 0.5 --min-p 0.00 --top-k 40 --top-p 0.95 --samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc"
> based on top of novel components such as TOPLOC, which verifies rollouts from untrusted inference workers
https://github.com/PrimeIntellect-ai/toploc
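The general commit-then-verify shape (my own toy simplification over made-up activation values, not TOPLOC's actual scheme) is that the untrusted worker commits to a compact fingerprint of its activations, which the verifier can later recompute and check:

```python
import hashlib

def topk_commitment(hidden_state, k=4, decimals=2):
    """Commit to the k largest activations as (index, rounded value) pairs.

    Rounding gives tolerance to nondeterministic float arithmetic; this is
    a toy stand-in for TOPLOC's actual top-k commitment construction.
    """
    top = sorted(range(len(hidden_state)),
                 key=lambda i: hidden_state[i], reverse=True)[:k]
    payload = ",".join(f"{i}:{round(hidden_state[i], decimals)}"
                       for i in sorted(top))
    return hashlib.sha256(payload.encode()).hexdigest()

# The untrusted worker reports a commitment alongside its rollout...
honest = [0.11, 2.30, 0.05, 1.70, 0.99, 3.10]   # hypothetical activations
claim = topk_commitment(honest)

# ...and the verifier re-runs the forward pass and checks the commitment.
assert topk_commitment(honest) == claim          # honest worker passes
assert topk_commitment([0.0] * 6) != claim       # fabricated activations fail
```

Verification still costs a forward pass, so it's far from the O(1) hash check of classic proof of work, but it does make fabricated rollouts detectable.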
[1] https://arxiv.org/abs/2311.08105