Typically in these tests you have three options: "A is better", "B is better", or "they're equal/can't decide". So if 56% prefer O3 Mini, it's likely that way less than half prefer O1. Also, the way I understand it, they're comparing a mini model with a large one.
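To make the arithmetic concrete, here is a rough sketch. Only the 56% figure comes from the comment above; the tie shares are made-up values for illustration, and `remaining_share` is a hypothetical helper, not anything from the actual evaluation.

```python
def remaining_share(share_a: int, share_tie: int = 0) -> int:
    """Percent of raters preferring B, given A's share and the tie share.

    Assumes a three-way choice (A better / B better / tie) where the
    three shares sum to 100 percent.
    """
    if share_a < 0 or share_tie < 0 or share_a + share_tie > 100:
        raise ValueError("shares must be non-negative and sum to at most 100")
    return 100 - share_a - share_tie


if __name__ == "__main__":
    o3_mini = 56  # figure quoted in the comment
    for tie in (0, 10, 20):  # hypothetical tie shares, not reported data
        o1 = remaining_share(o3_mini, tie)
        print(f"ties={tie}%  ->  O1 preferred by {o1}%")
```

So even with zero ties O1 tops out at 44%, and every point of "can't decide" comes straight out of that number.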
directevolve|1 year ago
ignoramous|1 year ago
Does no one else hate it when this happens (especially when on a handheld device)?