I always thought that the point of instruction tuning and ability to use prompts to get the model to do 0 shot tasks was that you don't have to collect tons of example data/samples. The method proposed here requires you to have tons of data. If you have that, why not just fine tune the underlying model?
No comments yet.