Is it 0.003 per minute of audio uploaded, or "compute minute"?
For example fal.ai has a Whisper API endpoint priced at "$0.00125 per compute second" which (at 10-25x realtime) is EXTREMELY cheaper than all the competitors.
If Voxtral can process rapid speech as well as it claims to, an obvious cost optimization would be to speed up normal laconic speech to the maximum speed the model can handle accurately.
Oras|25 days ago
jamilton|25 days ago
85392_school|24 days ago
tgrowazay|25 days ago
Curiositry|23 days ago