GPT-4 is a fine-tuned model (likely first fine-tuned for code, then for chat on top of that, as gpt-3.5-turbo was[0]), while PaLM 2, as reported, is a foundation model without any additional fine-tuning applied yet. I would expect its performance on this to improve if it were fine-tuned, though I don't have a great sense of where the ceiling would be.

[0] https://platform.openai.com/docs/model-index-for-researchers
macrolime | 2 years ago