keriati1 | 1 year ago
We also run a Mac Studio with a bigger model (70b), M2 Ultra and 192 GB RAM, as a chat server. It's pretty fast. Here we use Open WebUI as the interface.
Software-wise, Ollama is OK, as most IDE plugins can work with it now. I personally don't like the Go code they have. Also, some key features I would need are missing from it, and those just never get done, even though multiple people have submitted PRs for some of them.
LM Studio is better overall, both as a server and as a chat interface.
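To illustrate why IDE plugins can work with either server: Ollama exposes a simple HTTP JSON API on localhost. A minimal sketch of building a request for its `/api/generate` endpoint, assuming Ollama's default port 11434 (the model name `llama3` is a placeholder; sending the request requires a running Ollama instance):

```python
import json

# Ollama's default local endpoint for single-turn generation
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    # With stream=False, Ollama returns one JSON object instead of
    # a stream of chunks, which is simpler for scripted use.
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_request("llama3", "Why is the sky blue?")
print(json.dumps(payload))
```

Actually POSTing this payload (e.g. with `urllib.request` or `requests`) returns a JSON object whose `response` field holds the generated text.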
I can also recommend the CodeGPT plugin for JetBrains products and the Continue plugin for VS Code.
As a chat server UI, as I mentioned, Open WebUI works great; I use it with Together AI as a backend too.
isoprophlex | 1 year ago
Or maybe I'm just working in cash-poor environments...
Edit: also, can you do training/finetuning on an M2 like that?
keriati1 | 1 year ago
shostack | 1 year ago