jemc-dev | 2 years ago

It could be interesting to use this approach in a product that also lets people pick the answer they think is best (in the cases where they are curious enough to see all three).

The product could collect those picks internally into an RLHF dataset for training future LLMs.
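To make that concrete, here is a rough sketch of how one pick among three answers might be turned into preference records. Everything in it is an assumption for illustration: the names `PreferencePair`, `pick_to_pairs`, and `to_jsonl` are hypothetical, and the prompt/chosen/rejected layout is just one common convention for preference data, not something the approach above prescribes.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PreferencePair:
    """One pairwise preference: the picked answer beat one other answer."""
    prompt: str
    chosen: str
    rejected: str


def pick_to_pairs(prompt, candidates, picked_index):
    """Expand a single human pick into one pair per non-picked candidate.

    Assumption: picking one answer means it is preferred over each of the
    others; nothing is implied about how the others rank among themselves.
    """
    if not 0 <= picked_index < len(candidates):
        raise ValueError("picked_index is out of range")
    chosen = candidates[picked_index]
    return [
        PreferencePair(prompt=prompt, chosen=chosen, rejected=other)
        for i, other in enumerate(candidates)
        if i != picked_index
    ]


def to_jsonl(pairs):
    """Serialize the pairs as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(pair)) for pair in pairs)


if __name__ == "__main__":
    pairs = pick_to_pairs(
        "Explain what a mutex is.",
        ["answer A", "answer B", "answer C"],
        picked_index=1,
    )
    print(to_jsonl(pairs))
```

With three answers, each pick yields two records. Users who never expand all three answers contribute nothing, so the dataset would be skewed toward the prompts where people were curious enough to compare.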
