top | item 44806962

(no title)

jnmandal | 6 months ago

Honestly, I think it just depends. A few hours ago I wrote I would never want it for a production setting but actually if I was standing something up myself and I could just download headless ollama and know it would work. Hey, that would also be fine most likely. Maybe later on I'd revisit it from a devops perspective, and refactor deployment methodology/stack, etc. Maybe I'd benchmark it and realize its fine actually. Sometimes you just need to make your whole system work.

We can obviously disagree with their priorities, their roadmap, the fact that the client isn't FOSS (I wish it was!), etc but no one can say that ollama doesn't work. It works. And like mchiang said above: its dead simple, on purpose.

discuss

dcreater|6 months ago

But its effectively equally easy to do the same with llama.cpp, vllm or modular..

(any differences are small enough that they either shouldn't cause the human much work or can very easily be delegated to AI)

evilduck|6 months ago

Llama.cpp is not really that easy unless you're supported by their prebuilt binaries. Go to the llama.cpp GitHub page and find a prebuilt CUDA enabled release for a Fedora based linux distro. Oh there isn't one you say? Welcome to losing an hour or more of your time.

Then you want to swap models on the fly. llama-swap you say? You now get to learn a new custom yaml based config file syntax that does basically nothing that the Ollama model file already does so that you can ultimately... have the same experience as Ollama but now you've lost hours just to get back to square one.

Then you need it to start and be ready with the system reboot? Great, now you get to write some systemd services, move stuff into system-level folders, create some groups and users and poof, there goes another hour of your time.

jnmandal|6 months ago

Sure but if my some of the development team is using ollama locally b/c it was super easy to install, maybe I don't want to worry about maintaining a separate build chain for my prod env. Many startups are just wrapping or enabling LLMs and just need a running server. Who are we to say what is right use of their time and effort?