ijk | 4 months ago
My assumption, based on the research, is that training on different prompts with the same answer gives you more robust Q&A behavior; training on variations of how to express the same concept generalizes [1]. Training on the same prompt with different answers gives you creative diversity [2].
[1] https://arxiv.org/abs/2404.00213
[2] https://arxiv.org/abs/2503.17126
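To make the two data layouts concrete, here is a rough sketch of how the fine-tuning pairs might be built. The function names, the `prompt`/`completion` keys, and the example strings are all made up for illustration and are not taken from either paper.

```python
def many_prompts_one_answer(paraphrases, answer):
    """Pair every paraphrase of a question with the same canonical answer.

    This is the layout assumed to help robustness: the model sees the
    same concept expressed in several ways.
    """
    return [{"prompt": p, "completion": answer} for p in paraphrases]


def one_prompt_many_answers(prompt, answers):
    """Pair a single prompt with several distinct acceptable answers.

    This is the layout assumed to help creative diversity: the model
    sees that one prompt has many valid completions.
    """
    return [{"prompt": prompt, "completion": a} for a in answers]


# Hypothetical examples, invented for this sketch.
robust_qa = many_prompts_one_answer(
    [
        "What is the capital of France?",
        "Which city is France's capital?",
        "Name the capital city of France.",
    ],
    "Paris",
)

creative = one_prompt_many_answers(
    "Write a one-line poem about rain.",
    [
        "Rain taps the roof like a patient guest.",
        "Grey sky, silver threads, green morning.",
    ],
)
```

Both lists use the same record shape, so they could be concatenated into one training set or kept apart to compare the two effects separately.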