Reinforcement Learning from Human Feedback (rlhfbook.com)
133 points | onurkanbkrc | 24 days ago | 5 comments
https://arxiv.org/abs/2504.12501

verdverm | 24 days ago
Last time I saw Nathan say something about the book, he's actively working on the next version and looking for feedback, check his socials

  leggerss | 24 days ago
  You could say he's also learning from human feedback

dang | 24 days ago
Related. Others?
RLHF Book - https://news.ycombinator.com/item?id=42902936 - Feb 2025 (37 comments)

klelatti | 24 days ago
Web version with links, etc: https://rlhfbook.com/

  dang | 24 days ago
  Thanks! We've switched to that above from https://arxiv.org/abs/2504.12501, and put the latter in the toptext.

iisweetheartii | 24 days ago
[deleted]