ozgune | 6 months ago
Just in case you have $3-4M lying around somewhere for some high-quality inference. :)
SGLang quotes a 2.5-3.4x speedup compared to H100s. They also note that more optimizations are coming, but they haven't yet published a part 2 of the blog post.
aurareturn | 6 months ago