alexellman | 7 months ago
They all tokenize a little differently, so they are not exactly 1-1. However, I plan on addressing this by having each model complete a test task and getting the actual price from each API + token count to make a real 1-1 comparison.

esafak | 7 months ago
And please timestamp the benchmarks, and rerun them periodically, so vendors can't quietly cost-optimize the model when no one's looking.

nisegami | 7 months ago
Ah, that's a great idea and would be a welcome addition to the site.
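For anyone wondering what the comparison discussed above might look like in practice, here is a minimal sketch, not the site's actual implementation. It assumes each API reports input and output token counts for the same test task and that prices are quoted in USD per million tokens. The names `RunResult`, `task_cost`, and `rank_by_cost`, and the model names and prices in the usage note, are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RunResult:
    """One model's result on the shared test task (hypothetical record type)."""
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float
    timestamp: str  # ISO 8601, UTC, so reruns can be compared over time


def task_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
    now: Optional[datetime] = None,
) -> RunResult:
    """Price one run from the token counts the API actually reported.

    Assumes prices are given in USD per one million tokens.
    """
    now = now or datetime.now(timezone.utc)
    cost = (
        input_tokens * usd_per_m_input + output_tokens * usd_per_m_output
    ) / 1_000_000
    return RunResult(
        model, input_tokens, output_tokens, round(cost, 6), now.isoformat()
    )


def rank_by_cost(results: list[RunResult]) -> list[RunResult]:
    """Order runs of the same task from cheapest to most expensive."""
    return sorted(results, key=lambda r: r.cost_usd)
```

Usage would be something like `rank_by_cost([task_cost("model-a", 1200, 800, 3.0, 15.0), task_cost("model-b", 1500, 1000, 1.0, 5.0)])`, with made-up numbers. Comparing cost per completed task sidesteps the tokenizer differences, and storing the timestamp with every run is what makes periodic reruns comparable.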