(no title)
calebkaiser | 6 months ago
A lot of people are excited about the Qwen3-Coder family of models: https://huggingface.co/collections/Qwen/qwen3-coder-687fc861...
For running locally, there are tools like Ollama and LM Studio. Your hardware needs will vary depending on what size/quantization of model you try to run, but $2k in hardware is reasonable for running a lot of models. Some people have good experiences with the M-series Macs, which are probably good bang for the buck if you're exclusively interested in inference.
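Once a model is pulled, querying it programmatically is just an HTTP call to the local server. Here's a minimal sketch against Ollama's chat endpoint, assuming the server is running on its default port (11434); the exact model tag (`qwen3-coder:30b`) is an assumption, so check `ollama list` for what's actually available on your machine.

```python
# Sketch: query a locally served model through Ollama's HTTP API.
# Assumes an Ollama server on the default port and a pulled model;
# the model tag below is an assumption, not a guaranteed name.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"

def build_request(model: str, prompt: str) -> dict:
    """Build the JSON payload for a single-turn, non-streaming chat."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def ask(model: str, prompt: str) -> str:
    """POST the prompt to the local Ollama server and return the reply text."""
    payload = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Example usage (requires a running Ollama server):
#   print(ask("qwen3-coder:30b", "Reverse a string in Python."))
```

Nothing here depends on which frontend you use; LM Studio exposes a similar local HTTP API, so the same pattern applies with a different URL and payload shape.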
I'd recommend checking out the LocalLLaMA subreddit for more: https://www.reddit.com/r/LocalLLaMA/
Getting results on par with the big labs isn't feasible, but if you prefer to run everything locally, it's a fun and doable project.
megaloblasto | 6 months ago
Is this just a fun project for now, or could I actually benefit from it in software production, the way I do with tools like Claude Code?
I am interested in carefully tailoring it to specific projects: integrating curated personal notes, external documentation, scientific papers, etc. via RAG (this part I've already written), and carefully choosing the tools available to the agent. If I hand-tailor the AI agents to each project, can I expect something perhaps similar to the performance boost of Claude Code for $2,000 (USD)?
If not $2,000, then how much would I need? I'm pretty sure for something like $75,000 I could do this with a large DeepSeek model locally, and certainly get something very close to Claude Code, right?
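The retrieval half of the RAG setup described above can be sketched without any model at all: score a small corpus of notes against a query and return the best matches. This toy version uses bag-of-words cosine similarity so it runs standalone; in a real setup you'd swap `embed` for embeddings from your local model.

```python
# Minimal sketch of RAG retrieval: rank documents by cosine similarity
# to a query. The bag-of-words "embedding" is a stand-in so the example
# is self-contained; replace it with real embeddings in practice.
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "notes on the project build system and CI pipeline",
    "paper summary: retrieval augmented generation",
    "grocery list: eggs, milk, bread",
]
print(retrieve("how does retrieval augmented generation work", docs, k=1))
```

The retrieved snippets would then be prepended to the agent's prompt, which is the part that's independent of how much hardware the model itself needs.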
megaloblasto | 6 months ago
https://www.youtube.com/watch?v=e-EG3B5Uj78&t=237s
"Run Deepseek R1 at Home on Hardware from $250 to $25,000: From Installation to Questions"
You can run the full 671B-parameter DeepSeek R1 at ~4 tokens/s for roughly $25,000 (USD), using an AMD Ryzen Threadripper PRO 7995WX (96 cores, 192 threads) and a PNY NVIDIA RTX PRO 6000.