"Based on these public on-demand quoted prices from AWS and IDC, we found that the Intel® Gaudi® 2 has the best training performance-per-dollar, with an average advantage of 4.8x vs. the NVIDIA A100-80GB, 4.2x vs. the NVIDIA A100-40GB, and 5.19x vs. the NVIDIA H100."
Seems there's some friction in porting software, as you have to use their build of PyTorch. They claim you just have to change the specified device in your `.to(device)` calls, but if someone could verify that, it would be appreciated. My experience porting software to Google's TPUs or AMD GPUs has not been great.
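For what it's worth, a minimal sketch of what that "just change the device string" claim would look like, assuming Habana's published PyTorch bridge (`habana_frameworks`) and their device name `"hpu"`; the fallback logic here is my own addition, not something Intel documents as the porting recipe:

```python
import torch

# Hypothetical porting sketch: prefer Gaudi ("hpu") when Habana's PyTorch
# bridge is importable, otherwise fall back to CUDA or CPU. Only the
# device string changes; the model code itself is untouched.
try:
    import habana_frameworks.torch.core  # noqa: F401  (Habana's bridge package)
    device = "hpu"
except ImportError:
    device = "cuda" if torch.cuda.is_available() else "cpu"

# Ordinary PyTorch code, parameterized only by `device`.
model = torch.nn.Linear(4, 2).to(device)
x = torch.randn(3, 4).to(device)
y = model(x)
print(device, tuple(y.shape))
```

If the claim holds, everything outside the `try`/`except` is unchanged from a CUDA workflow; the open question is how much model code in practice hits ops the Gaudi backend doesn't support.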
I looked in their Intel Developer Cloud and saw the 8x instance at $10.42/hr, but no individual 1x Gaudi 2 that I could see. $1.30/hr (an eighth of the 8x price) could be okay for some inference use cases if it were available, although for what I was thinking, llama.cpp is not going to work anyway.
Kinda funny that instead of NVLink they're just using (presumably standard) 100GbE as their connector/protocol; I wonder if this also lets you wire up larger and more complex topologies of these cards across servers using normal 100GbE switches.
sailplease|2 years ago
ShamelessC|2 years ago
ilaksh|2 years ago
remexre|2 years ago