item 39156549 (no title)
tacoman | 2 years ago I am using the exact same model. Ryzen 5600G w/ 32GB and an Nvidia P40 w/ 24GB VRAM. 20/33 layers offloaded to GPU, 4K context. Uses 25GB system RAM and all 24GB VRAM. 5-7 tokens per second.
zaat|2 years ago Context is set to 32768. I didn't change it, so I guess that's the model's default. Thanks for making me feel better about investing in that motherboard + CPU + RAM upgrade and deferring the GPU upgrade.
attentive|2 years ago And Groq does 485.08 T/s on Mixtral 8x7B-32k. I am not sure local models have any future other than POC/research. Depends on the cost, of course.
tome|2 years ago (Groqster here) For anyone who wants to try it, you can go to https://chat.groq.com/ and choose Mixtral from the drop-down menu. Also, feel free to ask me any questions about Groq hardware or service.
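For a rough sense of scale, the throughput figures quoted in this thread (5-7 tokens per second on the local setup, 485.08 T/s on Groq) can be compared with a few lines of Python. This is only a back-of-the-envelope sketch: the 500-token response length is an assumed illustrative value, not something stated in the thread, and the function names are made up for the example.

```python
# Throughput figures as quoted in the thread (tokens per second).
LOCAL_TPS_LOW = 5.0    # lower bound reported for the Ryzen 5600G + P40 setup
LOCAL_TPS_HIGH = 7.0   # upper bound reported for the same setup
GROQ_TPS = 485.08      # figure quoted for Mixtral 8x7B-32k on Groq

# Assumption for illustration only: length of a typical response in tokens.
RESPONSE_TOKENS = 500


def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Time in seconds to generate `tokens` at a steady rate."""
    return tokens / tokens_per_second


def speedup(fast_tps: float, slow_tps: float) -> float:
    """How many times faster `fast_tps` is than `slow_tps`."""
    return fast_tps / slow_tps


if __name__ == "__main__":
    for label, tps in [("local (low)", LOCAL_TPS_LOW),
                       ("local (high)", LOCAL_TPS_HIGH),
                       ("Groq", GROQ_TPS)]:
        print(f"{label:>12}: {seconds_for(RESPONSE_TOKENS, tps):7.2f} s "
              f"for {RESPONSE_TOKENS} tokens")
    print(f"Groq vs local: {speedup(GROQ_TPS, LOCAL_TPS_HIGH):.0f}x to "
          f"{speedup(GROQ_TPS, LOCAL_TPS_LOW):.0f}x faster")
```

This compares raw generation speed only. It says nothing about cost, which is the open question raised above.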