item 36686298

cygn|2 years ago
You can just use Triton, which is basically TensorFlow Serving for TensorFlow, PyTorch, ONNX and more.

albertzeyer|2 years ago
Can you explain that? My understanding of Triton is that it's an alternative to CUDA, except you write it directly in Python, at a slightly higher level, and it does a lot of optimizations automatically. So basically: Python -> Triton-IR -> LLVM-IR -> PTX.
https://openai.com/research/triton

chillee|2 years ago
It's confusing: there's OpenAI Triton (what you're thinking of) and NVIDIA Triton Inference Server (a different thing).
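The Triton in cygn's comment, by contrast, is NVIDIA Triton Inference Server: you point it at a model repository and it serves the models over HTTP/gRPC. A sketch of a `config.pbtxt` for an ONNX model, with hypothetical model and tensor names:

```
name: "my_onnx_model"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "input__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

Swapping `platform` (e.g. to `tensorflow_savedmodel` or `pytorch_libtorch`) is how the same server fronts TensorFlow, PyTorch, and ONNX models.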
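The Triton that albertzeyer describes is OpenAI Triton, a Python DSL for writing GPU kernels that the compiler lowers through Triton-IR and LLVM-IR down to PTX. A minimal sketch of what that looks like, based on the standard vector-add example from the Triton tutorials (requires an NVIDIA GPU to actually launch):

```python
import triton
import triton.language as tl


@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against out-of-bounds lanes
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)
```

The point of the comparison to CUDA: this is ordinary-looking Python, but `@triton.jit` compiles it to a GPU kernel, with the compiler handling details like memory coalescing and shared-memory scheduling that you would write by hand in CUDA C.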