The outputs are formatted correctly, but the answers themselves can be inconsistent. I have experimented with varying the prompt, and the answers can change dramatically. I could experiment with lowering the temperature, but I just don't think generative models were a good fit for the problem. Their appeal is the speed of prototyping and no need for training data, but collecting training data honestly didn't take much for my problem: one afternoon and ~1000 labeled samples got me to a good baseline.