I think they are best at information extraction and classification, especially complex tasks with little or no training data, and at data synthesis. That said, always test whether a simpler model already performs the task reasonably well, since that saves money.
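To make the "test a simpler model first" advice concrete, here is a minimal sketch of that check. Everything in it is an assumption for illustration: the `keyword_baseline` classifier, the tiny `LABELED` sample, and the `THRESHOLD` value are all hypothetical, so swap in your own cheap model and real labeled data.

```python
from typing import Callable, Sequence, Tuple


def accuracy(predict: Callable[[str], str],
             examples: Sequence[Tuple[str, str]]) -> float:
    """Fraction of labeled examples the classifier gets right."""
    if not examples:
        raise ValueError("need at least one labeled example")
    correct = sum(1 for text, label in examples if predict(text) == label)
    return correct / len(examples)


def keyword_baseline(text: str) -> str:
    """Hypothetical cheap baseline: label a message 'refund' if it says so."""
    return "refund" if "refund" in text.lower() else "other"


# Hypothetical hand-labeled sample; use a much larger real sample in practice.
LABELED = [
    ("I want a refund for my order", "refund"),
    ("Please REFUND me", "refund"),
    ("Where is my package?", "other"),
    ("Can I get my money back?", "refund"),  # no keyword, so the baseline misses it
]

THRESHOLD = 0.9  # assumed acceptable accuracy; choose one that fits the task

score = accuracy(keyword_baseline, LABELED)
use_larger_model = score < THRESHOLD  # only pay for the big model if needed
print(f"baseline accuracy={score:.2f}, use larger model: {use_larger_model}")
```

If the baseline clears the threshold on a representative sample, there is no need to pay for the larger model; if it falls short, as it does here, that is the evidence for moving up.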
They underperform at anything that requires reasoning.