jumpCastle | 2 years ago

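Use the model's output as training data. For better performance, you can also get the top-k log probs for each token and minimize the KL divergence between them and your own model's predictions, rather than training on the sampled tokens alone.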
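A rough sketch of what that loss could look like for a single token position, in plain Python. Everything here is an assumption for illustration: the names `log_softmax` and `topk_kl` are made up, the teacher is assumed to return a dict of token id to log prob for its top-k tokens, and both distributions are renormalized over just those k tokens (one of several ways to handle the missing tail).

```python
import math


def log_softmax(values):
    """Numerically stable log-softmax over a list of floats."""
    m = max(values)
    log_z = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - log_z for v in values]


def topk_kl(teacher_top_logprobs, student_logits):
    """KL(teacher || student) restricted to the teacher's top-k tokens.

    teacher_top_logprobs: dict of token id -> log prob (top-k entries only).
    student_logits: list of raw student logits, indexed by token id.
    """
    ids = list(teacher_top_logprobs)
    # Assumption: renormalize both sides over the shared top-k support,
    # since the teacher's probability mass outside the top-k is unknown.
    teacher = log_softmax([teacher_top_logprobs[i] for i in ids])
    student = log_softmax([student_logits[i] for i in ids])
    return sum(math.exp(t) * (t - s) for t, s in zip(teacher, student))


# Hypothetical teacher output for one position (top-3 of a 4-token vocab).
teacher_top = {0: math.log(0.6), 1: math.log(0.3), 2: math.log(0.05)}
matching_student = [math.log(0.6), math.log(0.3), math.log(0.05), -5.0]
uniform_student = [0.0, 0.0, 0.0, 0.0]

loss_matching = topk_kl(teacher_top, matching_student)
loss_uniform = topk_kl(teacher_top, uniform_student)
```

In a real training loop you would average this over all positions and backprop through the student logits with an autodiff framework; the point is only that the target is a soft distribution rather than a single sampled token.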
