top | item 45363220

sdesol | 5 months ago

Honestly, Gemini Flash Lite and models on Cerebras are extremely fast. I know what you are saying. If the goal is to get a lot of results that may or may not be relevant, then yes, it is an order of magnitude slower.

If you take into consideration the post-analysis process, which is what inference is trying to solve, is it an order of magnitude slower?

adastra22|5 months ago

More like 6-8 orders of magnitude slower. That’s a very nontrivial difference in performance!

sdesol|5 months ago

How are you quantifying the speed at which results are reviewed?