item 38544939 (no title)
thatcherthorn | 2 years ago They've reported surpassing GPT4 on several benchmarks. Does anyone know if these are hand-picked examples, or is this the new SOTA?
xiphias2|2 years ago It will be SOTA maybe when Gemini Ultra is available. GPT-4 is still SOTA.
philomath_mn|2 years ago Usually SOTA status is established when the benchmark paper is released (probably after some review). But GPT4 is the current generally available SOTA.
silveraxe93|2 years ago They also compare to RLHFed GPT-4, which reduces capabilities, while their model seems to be pre-RLHF. So I'd expect those numbers to be a bit inflated compared to public release.
williamstein|2 years ago They certainly claim it is SOTA for multimodal tasks: “Gemini surpasses SOTA performance on all multimodal tasks.”