nacs | 5 months ago

I don't know if it will stay this low, but the whole point of v3.2 is to be cheaper to run than v3.1 and earlier versions.

(Inference is now cheaper for them as context grows, because of the sparse attention mechanism.)
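To make the scaling argument concrete, here is a rough back-of-the-envelope sketch. It assumes a generic top-k sparse attention where each token attends to at most `top_k` earlier positions; the function names and the `top_k` parameter are hypothetical, not taken from the model's actual implementation, and the cost of selecting which positions to keep is ignored.

```python
def dense_attention_pairs(context_len: int) -> int:
    """Query-key pairs scored under causal dense attention.

    Token i (1-indexed) attends to i positions (itself and everything
    before it), so the total grows quadratically with context length.
    """
    return context_len * (context_len + 1) // 2


def sparse_attention_pairs(context_len: int, top_k: int) -> int:
    """Query-key pairs scored when each token keeps at most top_k positions.

    Assumption: token i attends to min(i, top_k) positions. Once the
    context is longer than top_k, every extra token adds a constant
    top_k pairs, so the total grows linearly from that point on.
    """
    if context_len <= top_k:
        # Short contexts: nothing gets pruned, same as dense attention.
        return dense_attention_pairs(context_len)
    warmup = top_k * (top_k + 1) // 2          # first top_k tokens
    steady = (context_len - top_k) * top_k     # remaining tokens
    return warmup + steady


def cost_ratio(context_len: int, top_k: int) -> float:
    """Dense pairs divided by sparse pairs; larger means bigger savings."""
    return dense_attention_pairs(context_len) / sparse_attention_pairs(context_len, top_k)
```

Under these assumptions the ratio is 1 for short contexts and keeps growing as the context gets longer, which is the "cheaper as context grows" effect. It says nothing about absolute prices.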
