scriptsmith | 2 years ago
I see the GitHub Copilot extension gets a new release every few days, so is it just that the way they're integrated is more complicated, making it not worth the effort?
thewataccount|2 years ago
This works well for me, except the 15B+ models don't run fast enough on a 4090. Hopefully exllama adds support for non-llama models, or maybe it already supports CodeLlama; I'm not sure.
For general chat testing/usage this works pretty well with lots of options - https://github.com/oobabooga/text-generation-webui/
msp26|2 years ago
I assume quantized models will run a lot better. TheBloke already seems to be on it.
https://huggingface.co/TheBloke/CodeLlama-13B-fp16
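Rough arithmetic shows why quantization matters on a 24 GiB RTX 4090: at fp16, the weights of a nominal 13B-parameter model alone roughly fill the card, while 4-bit weights leave plenty of headroom. A minimal back-of-envelope sketch (the 13B parameter count is nominal, and KV cache and activations are ignored, so real usage is higher):

```python
# Back-of-envelope VRAM estimate for the weights of a 13B-parameter model.
# Ignores KV cache and activation memory, so actual usage is higher.
PARAMS = 13e9  # nominal parameter count (assumption: exactly 13 billion)

def weight_gib(bits_per_weight: float) -> float:
    """Approximate weight memory in GiB at a given precision."""
    return PARAMS * bits_per_weight / 8 / 1024**3

fp16 = weight_gib(16)  # ~24.2 GiB: weights alone roughly fill a 24 GiB 4090
q4 = weight_gib(4)     # ~6.1 GiB: fits comfortably, with room for context
print(f"fp16: {fp16:.1f} GiB, 4-bit: {q4:.1f} GiB")
```

This is why a quantized 13B (or even 34B) build is the practical option on a single consumer GPU, while fp16 is not.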