The state of "super fast inference" is frustrating
4 points| 4k | 1 year ago
However, for many months now, each of those providers has had literally the same API page, where only the Free option with low rate limits is available. Everything else is "Coming Soon". No updates, no dates, no estimates, nothing.
Come to think of it, there is not a single good inference provider in the whole open-source model space that offers a paid, unthrottled API at a consistent 50+ tps (tokens per second). There's money to be made here, and surprisingly nobody is going after it aggressively.
imdoxxingme | 1 year ago