top | item 43597085

(no title)

mrajcok | 11 months ago

Yes - I'm able to run Llama 3.1 405B on 3x A6000 + 3x 4090.

Will have Llama 4 Maverick running in 4bit quantization (typically results in only minor quality degradation) once llama.cpp support is merged.

Total hardware cost well under $50,000.

The 2T Behemoth model is tougher, but enough Blackwell 6000 Pro cards (16) should be able to run it for under $200k.

discuss

order

No comments yet.