(no title)
shipilovya | 9 months ago
- GPT-4.1 is best when you need a straight answer
- o3 is best for complex cases
- Grok is best at clarifying important info (“truthseeking”)
I made this prototype mostly to understand HealthBench more deeply. I will probably use it in future products I build.