item 42217867 (no title)
guptadagger | 1 year ago
Speaking of ChatGPT getting worse over time, it would be interesting to see ChatGPT benchmarked continuously to track how it performs over time, with the results published somewhere publicly. Even local variations would be interesting.
arnaudsm | 1 year ago
https://livebench.ai/ does that. The latest gpt4o underperforms previous versions significantly.