Bitwise Consistent On-Policy Reinforcement Learning with VLLM and TorchTitan

1 point | brrrrrm | 3 months ago | blog.vllm.ai

No comments yet.
