rerx | 2 years ago

> "Basically nobody writes CUDA," wrote Keller in a follow-up post. "If you do write CUDA, it is probably not fast. […] There is a good reason there is Triton, Tensor RT, Neon, and Mojo."

> Even Nvidia itself has tools that do not exclusively rely on CUDA. For example, Triton Inference Server is an open-source tool by Nvidia that simplifies deploying AI models at scale, supporting frameworks like TensorFlow, PyTorch, and ONNX. Triton also provides features like model versioning, multi-model serving, and concurrent model execution to optimize the utilization of GPU and CPU resources.

> Nvidia's TensorRT is a high-performance deep learning inference optimizer and runtime library that accelerates deep learning inference on Nvidia GPUs. [...]

Keller was speaking of OpenAI's Triton (https://openai.com/research/triton), a Python-like language that is compiled to code for Nvidia GPUs, but Tom's Hardware mixed this up with Nvidia's Triton Inference Server, a higher-level tool that is not a replacement for CUDA and is not directly related to the Triton language. The two are easy to confuse if you are a writer in a hurry.

p1esk|2 years ago

Jim Keller works for Tenstorrent, a direct Nvidia competitor.

diggan|2 years ago

Wow, that's some omission in the article. It's mentioned at the very bottom, but with no disclaimer that it might influence his opinion, since they're a competitor:

> His statements also imply that even though he has worked stints at some of the largest chipmakers in the world, including the likes of Apple, Intel, AMD, Broadcom (and now Tenstorrent), we might not see his name on the Nvidia roster any time soon.

shash|2 years ago

"Works for" is one way to put it - he's CEO and (I think?) co-founder.

londons_explore|2 years ago

Indeed - Keller is a low level hardware guy, and isn't going to have much interest in model versioning...

chrisjc|2 years ago

Isn't low-level hardware really at the heart of a lot of this? Hasn't a lot of the criticism of CUDA been that it's incredibly difficult for others to implement on other hardware because of the low-level interactions and Nvidia's usage of dark-APIs (can't recall the term I've heard used)?

Wasn't this one of the reasons AMD abandoned/deprioritized their efforts on such a project?

95014_refugee|2 years ago

Keller is a manager. He was a gateware engineer.