thomastay | 2 years ago
It lets you try the same prompt on two different, randomly selected chatbots and compare their answers. Best of all, your votes are used to rank LLMs on a public leaderboard, which helps AI researchers.
Here's the prompt I was playing with, which basically only Claude 2 and GPT-4 answer well:
How many legs do ten platypuses have, if eleven of them are legless? Platypuses have 3 legs. Walk it through step by step