port3000 | 1 month ago

The 'flash' (no- or low-thinking) versions of those models are crazy fast. We often receive the full response (not just the first token) in less than 1 second via the API.
