item 38664239 | jumpCastle | 2 years ago
Use the model's output as training data. For better performance, you can also get some of the top log probs for each token and minimize the KL divergence against them.
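A rough sketch of what the top-log-probs idea could look like for a single token position. Everything here is an assumption for illustration, not something the comment specifies: the helper names `log_softmax` and `topk_kl` are hypothetical, the teacher's top-k log probs are taken to arrive as a dict of token id to log prob, and both distributions are renormalized over just those k tokens before computing KL(teacher || student).

```python
import math


def log_softmax(values):
    """Numerically stable log-softmax over a list of floats."""
    m = max(values)
    log_z = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - log_z for v in values]


def topk_kl(teacher_top_logprobs, student_logits):
    """KL(teacher || student) restricted to the teacher's top-k tokens.

    teacher_top_logprobs: dict of token id -> log prob (top-k entries only).
    student_logits: list of raw student logits over the full vocabulary.
    Assumption: both sides are renormalized over the top-k tokens, so
    probability mass outside the top-k is ignored.
    """
    ids = list(teacher_top_logprobs)
    # Applying log-softmax to log probs renormalizes them to sum to 1.
    t_log = log_softmax([teacher_top_logprobs[i] for i in ids])
    # Restrict the student to the same tokens and renormalize.
    s_log = log_softmax([student_logits[i] for i in ids])
    return sum(math.exp(t) * (t - s) for t, s in zip(t_log, s_log))


# Toy example: a 4-token vocabulary, teacher exposes its top 2 tokens.
teacher = {0: math.log(0.7), 2: math.log(0.2)}
student = [2.0, 0.0, 1.0, -1.0]
loss = topk_kl(teacher, student)
```

In training, a value like `loss` would be averaged over positions and minimized with respect to the student's parameters, typically with an autodiff framework rather than plain Python. Renormalizing over the top-k is one design choice among several; another is to lump the remaining mass into a single "other" bucket.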