user: desideratum
102 karma | created 6 years ago
recent submissions
Finetuning GPT-OSS with Axolotl
(github.com)
3 pts|6 months ago|discuss
Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training
(huggingface.co)
3 pts|6 months ago|discuss
Training LLMs with GRPO and Interpreter Feedback Using WebAssembly
(huggingface.co)
3 pts|11 months ago|discuss
DeepSeek-V3-0324
(huggingface.co)
5 pts|11 months ago|1 comment
Training Process Reward Models in Axolotl
(axolotlai.substack.com)
2 pts|1 year ago|discuss
(Deep Learning Based) Opportunistic Screening to Improve Statin Rates
(ahajournals.org)
1 pt|1 year ago|discuss
The theory of Proximal Policy Optimisation implementations
(salmanmohammadi.github.io)
1 pt|1 year ago|discuss