

It's weird: I looked up whether AMD has any benchmarks on the 405B for the MI300X, and came across this one: https://dstack.ai/blog/amd-mi300x-inference-benchmark/#token...

From my understanding, it can get up to around 2,500 tokens/s? Both setups are 8x units (H200 and MI300X).
