The "vibe" check that it's even better than GPT-4 Turbo doesn't match its Elo rating on the Chatbot Arena, which is based on thousands of user votes rather than one person's impression.
GPT-4 (Turbo) is still in a league of its own.
That depends on what real-world use you're targeting, but unfortunately I'm not aware of anything better than that leaderboard in terms of sample size and model coverage.
npinsker|2 years ago
Reubend|2 years ago
ssabev|2 years ago
Racing0461|2 years ago