top | item 40298182

dusanh | 1 year ago

I'm a complete newb when it comes to AI, and I am getting pretty ashamed of it too. How do I take a model like this and use it in my day to day? Can I somehow use it in, say, VSCode? How do I point it at my code base, and use it to help me write new code?

everforward|1 year ago

You run most of these models in something that wraps them in an HTTP API. I use Ollama, which I think is the most popular but I'm not in a great position to judge. My impression is that it handles running models on CPU better than the alternatives.

So you’d basically install Ollama, download one of the versions of this model off HuggingFace, create a Modelfile since this isn’t in the default Ollama repo, and then Ollama can answer prompts with the model. Modelfiles are very simple, based on Dockerfiles. It takes like 15 seconds to make one if you aren’t messing with the various parameters.
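For reference, a Modelfile for this kind of setup can be as short as a single FROM line pointing at the GGUF file you downloaded (the filename below is just an example, not the actual HuggingFace artifact name):

```
# Modelfile -- FROM points at the downloaded GGUF; PARAMETER lines are optional
FROM ./granite-3b-code-instruct.Q4_K_M.gguf
PARAMETER temperature 0.2
```

Then `ollama create granite-code -f Modelfile` registers it, and `ollama run granite-code` gives you an interactive prompt to sanity-check it.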

Once it’s in Ollama, just get one of the various GPT plugins for VSCode and give it the Ollama URL (http://localhost:11434 by default). I use continue.dev but there are many.

Continue takes over the tab autocomplete with the LLM, and has a chat window on the right where you can use keyboard shortcuts to copy code into the prompt and ask it to edit/generate code or ask questions about existing code.
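If you're curious what those plugins are doing under the hood, here's a minimal sketch of hitting Ollama's HTTP API directly with only the standard library. The model name `granite-code:3b` is a placeholder for whatever name you gave the model in Ollama:

```python
import json
import urllib.request
import urllib.error

# Default Ollama endpoint for one-shot completions
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False returns one JSON object instead of a stream of chunks
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a prompt to a locally running Ollama and return the completion."""
    body = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

try:
    print(generate("granite-code:3b", "Write a Python hello world."))
except (urllib.error.URLError, OSError):
    print("Ollama doesn't seem to be running on localhost:11434.")
```

The VSCode plugins are essentially doing this request for you on every completion or chat turn.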

dusanh|1 year ago

Thank you so much! That sounds surprisingly straightforward. I expected a lot more fiddling to get going.

Where would I start if I wanted to use a model programmatically? Like let's say I am building a chatbot. I have a large data set of replies I want the model to mimic, and I'd want to do this in Python. Of course, I'd probably use a different model than Granite.
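Not the original commenter, but one low-effort way to approximate "mimic my replies" without fine-tuning is few-shot prompting: paste a handful of example replies into the prompt and ask for one more in the same style. A rough sketch (the example data is made up):

```python
# Made-up example dataset of (question, reply) pairs to mimic
examples = [
    ("What time do you open?", "We open at 9am sharp -- see you then!"),
    ("Do you ship overseas?", "We sure do! Shipping info is on our FAQ page."),
]

def build_few_shot_prompt(examples, question: str) -> str:
    """Format example Q/A pairs plus a new question into a single prompt."""
    lines = ["Reply in the same style as these examples.", ""]
    for q, a in examples:
        lines.append(f"User: {q}")
        lines.append(f"Bot: {a}")
        lines.append("")
    lines.append(f"User: {question}")
    lines.append("Bot:")  # the model continues from here
    return "\n".join(lines)

prompt = build_few_shot_prompt(examples, "Are you open on Sundays?")
print(prompt)
```

You'd then send that prompt string to whatever backend you're using (Ollama's HTTP API, an OpenAI-style client, etc.). For a genuinely large reply dataset, fine-tuning the model on it is the heavier-weight route, but it's a lot more involved than this.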

huijzer|1 year ago

https://github.com/TabbyML/tabby can run self-hosted AI coding assistants. I tried it a while ago and it worked with Nvim pretty easily. There is a VS code extension too. The extension will just sort of "read" with you and provide suggestions from time to time. Anytime the suggestion is good you can press some key (<TAB> by default) to accept it. It's basically autocomplete on steroids.

mark_l_watson|1 year ago

If you like Emacs (I use both Emacs and VSCode, for slightly different coding use cases), then the Emacs ellama [1] package is very nice. It is set up out of the box to use Ollama and provides M-x commands for code completion, summarization, and dozens of other useful functions. I love it, your mileage may vary.

[1] https://github.com/s-kostyaev/ellama