mmoskal | 10 months ago

Just for some calibration: approximately no one runs 32-bit for LLMs on any sort of iron, big or otherwise. Some models (e.g. DeepSeek V3, and derivatives like R1) are native FP8. FP8 was also common for Llama 3 405B serving.
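For a rough sense of why precision matters this much, here is a back-of-the-envelope sketch of weight storage at different widths. It assumes the nominal 405B parameter count and counts weights only (no KV cache, activations, or FP8 scaling factors); the `weight_gb` helper and the dtype table are made up for illustration, not taken from any serving stack.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: weights only; KV cache, activations, and quantization
# scale factors are ignored.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}


def weight_gb(n_params: float, dtype: str) -> float:
    """Return weight storage in GB (10**9 bytes) for n_params at dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    n = 405e9  # nominal parameter count of Llama 3 405B
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_gb(n, dtype):.0f} GB")
```

Under those assumptions FP32 weights alone come to about 1620 GB, versus 810 GB for BF16 and 405 GB for FP8.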