nacs | 5 months ago
I don't know if it will stay this low, but the whole point of v3.2 is to be cheaper to run than v3.1 and earlier. (Inference is now cheaper for them as context grows, because of the sparse attention mechanism.)
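To put rough numbers on the "cheaper as context grows" point, here is a back-of-the-envelope sketch. It assumes a top-k style sparse attention where each token attends to at most `k` earlier tokens; the comment doesn't say how v3.2 actually picks tokens, so `k = 2048` and both function names are hypothetical, and the cost of selecting those `k` tokens is ignored.

```python
def dense_pairs(context_len: int) -> int:
    """Query-key pairs scored when each token attends to itself and all earlier tokens."""
    return context_len * (context_len + 1) // 2


def sparse_pairs(context_len: int, k: int) -> int:
    """Pairs scored when each token attends to at most k tokens (assumed top-k scheme)."""
    if context_len <= k:
        # Short contexts: nothing to prune, same as dense causal attention.
        return dense_pairs(context_len)
    # The first k tokens behave densely; every later token scores exactly k pairs.
    return dense_pairs(k) + (context_len - k) * k


if __name__ == "__main__":
    k = 2048  # illustrative value only, not taken from the comment
    for n in (1_000, 10_000, 100_000):
        ratio = dense_pairs(n) / sparse_pairs(n, k)
        print(f"context={n:>7}: dense/sparse pair ratio = {ratio:.1f}")
```

Under these assumptions the dense count grows quadratically with context length while the sparse count grows linearly once the context passes `k`, so the gap widens the longer the context gets.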