pajeets|1 year ago
need a 3090 at least for that
kkielhofner|1 year ago
llama.cpp and others can run purely on CPU[0]. Even production grade serving frameworks like vLLM[1]. There are a variety of other LLM inference implementations that can run on CPU as well.
[0] - https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#su...
[1] - https://docs.vllm.ai/en/v0.6.1/getting_started/cpu-installat...
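As a rough illustration of the CPU-only path described above, a minimal sketch assuming the llama-cpp-python bindings on top of llama.cpp; the model path, thread count, and context size are placeholders, not values from the thread.

    # Minimal CPU-only inference sketch via the llama-cpp-python bindings.
    # Assumes a quantized GGUF model is already on disk; the path is a placeholder.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/llama-3.1-70b-instruct-q4_k_m.gguf",  # placeholder path
        n_gpu_layers=0,   # keep every layer on the CPU
        n_threads=16,     # roughly match the physical core count
        n_ctx=4096,       # context window, held entirely in system RAM
    )

    out = llm("Explain what a KV cache is in one sentence.", max_tokens=64)
    print(out["choices"][0]["text"])

With n_gpu_layers=0 everything stays in system RAM, which is the configuration the linked llama.cpp and vLLM CPU docs describe.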
pajeets|1 year ago
Wait, this is crazy. What model can I run on 1TB, and how many tokens per second?
For instance, Nvidia Nemotron Llama 3.1 quantized, at what speed? I'll get a GPU too, but I'm not sure how much VRAM I need for the best value for the buck.
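For a rough sense of whether 1TB of RAM is enough, weight memory scales with parameter count times bits per weight. A back-of-envelope sketch, assuming the roughly 70B-parameter Llama-3.1-Nemotron-70B-Instruct variant and ignoring KV cache and runtime overhead:

    # Back-of-envelope weight-memory estimate for a quantized model.
    # Assumes ~70B parameters (e.g. Llama-3.1-Nemotron-70B-Instruct); KV cache
    # and activation overhead come on top of this.
    def weight_gb(params_billions: float, bits_per_weight: float) -> float:
        return params_billions * 1e9 * bits_per_weight / 8 / 1e9

    for bits in (16, 8, 4):
        print(f"70B at {bits}-bit: ~{weight_gb(70, bits):.0f} GB of weights")
    # 16-bit ~140 GB, 8-bit ~70 GB, 4-bit ~35 GB: all fit in 1TB of RAM,
    # but CPU tokens/s is limited by memory bandwidth, not capacity.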