Crucially, I want to understand the license that applies to the search results. Can I store them? Can I republish them? Different providers have different rules about this.
We work with search providers and ensure that we have zero-data-retention policies in place.
The search results are yours to own and use; you are free to do what you want with them. Of course, you remain bound by the laws of your local jurisdiction.
It is strange to launch this type of functionality without even a privacy policy in place.
It makes me wonder if they’ve partnered with another of their VC’s peers who’s recently had a cash injection, and they’re being used as a design partner/customer story.
Exa would be my bet. YC backed them early, and they’ve also just closed an $85M Series B. Bing would be too expensive to run freely without a Microsoft partnership.
Get on that privacy notice soon, Ollama. You’re HQ’d in CA, so you’re definitely subject to the CCPA. (You don’t need revenue to be subject to it; being a data controller for 50,000 California residents is enough.)
https://oag.ca.gov/privacy/ccpa
There are very few recently launched, purely open-source projects these days (most are at least running donation-ware models or are funded by corporate backers), and none in the AI space that I'm aware of.
I was hoping for more details about their implementation. I saw Ollama as the open-source, platform-agnostic tool, but I worry their recent posturing goes against that.
We did consider building functionality into Ollama that would fetch search results and website contents using a headless browser or similar. However, we had a lot of worries about result quality, and about IP blocking once Ollama started exhibiting crawler-like behavior. A hosted API felt like the fastest path to getting results into users' context windows, but we are still exploring the local option. Ideally you'd be able to stay fully local if you want to, even when using capabilities like search.
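For the "local option" described above, the fetch step itself can be fairly small. A minimal sketch using only the Python standard library; the User-Agent string and the tag-skipping rules are illustrative assumptions, not Ollama's implementation:

```python
import urllib.request
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def page_text(url, timeout=10):
    """Fetch a page and return its visible text, ready for a context window."""
    req = urllib.request.Request(url, headers={"User-Agent": "local-llm-fetcher/0.1"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    extractor = TextExtractor()
    extractor.feed(html)
    return " ".join(extractor.parts)
```

The IP-blocking worry in the comment is real: a loop over `page_text` with no rate limiting looks exactly like a crawler to the sites being fetched.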
Their GUI is closed-source. If someone wants an easy-to-use, easy-to-set-up app, they may as well use LM Studio, which doesn't try to pretend to be OSS. Or use RamaLama, which basically containerizes LLMs and the relevant bits and is pretty damn similar to Ollama. Or just go back to "basics" and use llama.cpp or vLLM.
I had no idea they had their own cloud offering, I thought the whole point of Ollama was local models? Why would I pay $20/month to use small inferior models instead of using one of the usual AI companies like OpenAI or even Mistral? I'm not going to make an account to use models on my own computer.
Fair question. Some of the supported models are large and wouldn't fit on most local devices. This is just the beginning; with the relationships we've built with model providers, Ollama doesn't need to exclude cloud-hosted frontier models either. We just have to be mindful that Ollama stands with developers and solves their needs.
Yeah, it's been a steady pivot to profitable features. Wonderful to see them build a reputation and codebase through FOSS and free labor, then cash in.
You make an account to use their hosted models AND to have them available via the Ollama API LOCALLY. I'm spending $100 on Claude and $200 on GPT-5, so 20 bucks is NOTHING and totally worth having access to:
- Qwen3 235B
- DeepSeek 3.1 671B (thinking and non-thinking)
- Llama 3.1 405B
- GPT-OSS 120B
Those are hardly "small inferior models".
What is really cool is that you can set Codex up to use Ollama's API and then have it run tools on different models.
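For the Codex setup mentioned, pointing Codex at Ollama's OpenAI-compatible endpoint usually looks something like the following in `~/.codex/config.toml`. The key names are from memory and may have drifted between Codex versions, and the model name is just an example; treat this as a sketch and check the Codex docs:

```toml
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
wire_api = "chat"

[profiles.local]
model_provider = "ollama"
model = "gpt-oss:120b"
```

You would then launch Codex with that profile selected to have it run its tool calls against the local (or Ollama-hosted) model.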
A lot of "local" models are still very large to download and slow to run on regular hardware. I think it's great to have a way to evaluate them cheaply in the cloud before deciding to pull a model down to run locally.
At some level it's also the principle that I *could* run something locally that matters, rather than actually doing it. I don't want to become dependent on technology that someone could take away from me.
Looks like Ollama is focusing more and more on non-local offerings. Also, their performance is worse than, say, vLLM's.
What's a good Ollama alternative (for keeping 1-5x RTX 3090 busy) if you want to run things like open-webui (via an OpenAI compatible API) where your users can choose between a few LLMs?
I've been thinking about building a home-local "mini-Google" that indexes maybe 1,000 websites. In practice, I rarely need more than a handful of sites for my searches, so it seems like overkill to rely on full-scale search engines for my use case.
My rough idea for architecture:
- Crawler: A lightweight scraper that visits each site periodically.
- Indexer: Convert pages into text and create an inverted index for fast keyword search. Could use something like Whoosh.
- Storage: Store raw HTML and text locally, maybe compress older snapshots.
- Search Layer: Simple query parser to score results by relevance, maybe using TF-IDF or embeddings.
I would do periodic updates and build a small web UI to browse. Has anyone tried this, or are there similar projects?
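As a feasibility check on the indexer and search-layer steps above, a toy inverted index with TF-IDF ranking fits in a few dozen lines. In practice Whoosh (or SQLite FTS) would replace this; the document contents here are made up:

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

class MiniIndex:
    """A toy inverted index with TF-IDF scoring."""
    def __init__(self):
        self.docs = {}                    # doc_id -> token counts
        self.postings = defaultdict(set)  # token -> doc_ids containing it

    def add(self, doc_id, text):
        counts = Counter(tokenize(text))
        self.docs[doc_id] = counts
        for tok in counts:
            self.postings[tok].add(doc_id)

    def search(self, query, k=5):
        n = len(self.docs)
        scores = Counter()
        for tok in tokenize(query):
            hits = self.postings.get(tok, ())
            if not hits:
                continue
            idf = math.log(n / len(hits))  # rarer tokens weigh more
            for doc_id in hits:
                tf = self.docs[doc_id][tok] / sum(self.docs[doc_id].values())
                scores[doc_id] += tf * idf
        return [doc for doc, _ in scores.most_common(k)]
```

At 1,000 sites this whole index would comfortably live in memory, which is part of why the "mini-Google" idea is plausible for one person.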
Have you ever looked at Common Crawl dumps? I did a bit of data mining and holy cow, is 99.99% of the web crap: spam, porn, ads, flame wars, random blogs by angsty teens... I understand it has historical and cultural value — and maybe literary value, in a Douglas Coupland kind of way — but for my purposes, there was very little there that I considered of interest.
Which was very encouraging to me, because it implies that indexing the Actually Important Web Pages might even be possible for a single person on their laptop.
Wikipedia, for comparison, is only ~20GB compressed. (And even most of that is not relevant to my interests, e.g. the Wikipedia articles related to stuff I'd ever ask about are probably ~200MB tops.)
Drew DeVault tried building something similar to this under the name SearchHut, but the project was abandoned [1]. I tried hacking on it a while ago (since it's built on Postgres and a bit of Go), but I ran out of steam trying to understand the Postgres RUM extension.
[1]: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
Perhaps not quite solving your problem, but I have a handful of domain-specific Google CSEs (Custom Search Engines) that limit results to predefined websites. I summon them from Alfred with short keywords when I'm doing interest-specific searches.
https://blog.gingerbeardman.com/2021/04/20/interest-specific...
Reminds me of building an Obsidian vault with all the content in markdown form. There are also plugins that show vault results when doing a Google search, making notes from your vault show up before external websites.
To use additional features or Ollama's cloud-hosted models, you can sign up for an Ollama account.
For starters, this is completely optional. You can stay completely local, and you can also publish your own models to ollama.com to share with others.
I like using ollama locally and I also index and query locally.
I would love to know how to hook ollama up to a traditional full-text-search system rather than learning how to 'fine tune' or convert my documents into embeddings or whatnot.
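One way to wire that up with no embeddings and no fine-tuning: keep the documents in SQLite's built-in FTS5 full-text index and paste the top hits into the prompt sent to Ollama's `/api/generate` endpoint. A sketch — the documents and model name are invented, and a real natural-language question would need escaping before being used as an FTS `MATCH` query:

```python
import json
import sqlite3
import urllib.request

db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
db.executemany("INSERT INTO docs VALUES (?, ?)", [
    ("backup notes", "restic backs up to the NAS every night at 2am"),
    ("router config", "the guest wifi vlan is 42"),
])

def retrieve(query, k=3):
    """Classic full-text search over local documents."""
    rows = db.execute(
        "SELECT title, body FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, k)).fetchall()
    return "\n".join(f"{title}: {body}" for title, body in rows)

def ask(question, keywords):
    """Stuff the FTS hits into the prompt and send it to a local Ollama."""
    prompt = f"Context:\n{retrieve(keywords)}\n\nQuestion: {question}"
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": "llama3.1", "prompt": prompt,
                         "stream": False}).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The retrieval half runs entirely locally; only the final prompt goes to the model, which is also local here.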
https://github.com/mjochum64/mcp-solr-search
A slightly heavier lift, but only slightly, would be to also use Solr to store a vectorized version of your docs and simultaneously do vector similarity search; Solr has built-in kNN support for it. Pretty good combo to get good quality from both semantic and full-text search.
Though I'm not sure if it would be a similar amount of work to do Solr w/ ChromaDB for the vector portion, and marry the result sets via LLM pixie dust ("you are the helpful officiator of a semantic full-text matrimonial ceremony" etc). Also not sure of the relative strengths of ChromaDB vs Solr on that; maybe ChromaDB scales better for larger vector stores?
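Whichever store holds the vectors, the full-text and vector result lists still need merging. Reciprocal rank fusion is a common client-side alternative to the LLM-officiator approach; k=60 is the conventional default constant, and the doc IDs below are placeholders:

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: merge ranked result lists (e.g. one from
    full-text search, one from vector search) into a single ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            # Documents ranked highly by either list accumulate more score;
            # documents in both lists get boosted above single-list hits.
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only uses ranks, it sidesteps the problem of BM25 scores and cosine similarities living on incomparable scales.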
I added search to my LLMs years ago with the Python DuckDuckGo package.
However I found that Google gives better results, so I switched to that. (I forget exactly but I had to set up something in a Google dev console for that.)
I think the DDG one is unofficial, and the Google one has limits (so it probably wouldn't work well for deep research type stuff).
I mostly just pipe it into LLM APIs. I found that "shove the first few Google results into GPT, followed by my question" gave me very good results most of the time.
It of course also works with Ollama, but I don't have a very good GPU, so it gets really slow for me on long contexts.
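The "shove the first few results into the prompt" step is mostly string assembly. A sketch with an assumed result-dict shape (title/url/snippet) and a crude character budget standing in for real token counting:

```python
def build_prompt(question, results, max_chars=4000):
    """Prepend top search snippets to the question, trimmed to fit a
    small context window (helpful on weak GPUs, per the comment above)."""
    context = ""
    for r in results:
        snippet = f"[{r['title']}]({r['url']})\n{r['snippet']}\n\n"
        if len(context) + len(snippet) > max_chars:
            break  # stop before overflowing the budget
        context += snippet
    return f"{context}Question: {question}\nAnswer using the sources above."
```

Shrinking `max_chars` is the easy lever when long contexts make local inference crawl.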
I am just working on a tool that uses web search, iterating over different providers.
OpenAI, xAI, and Gemini all suffer from not being allowed on their respective competitors' sites.
In some quick tests, this search worked well for me on YT videos, which OpenAI's web search can't access. It kind of failed on X but sometimes returned OK, relevant results. Definitely hit and miss, but on average good.
How is that any different than someone installing an ad blocker in their browser? Arguably ad blocker is much simpler technology than running a local LLM and has been available for years now. And yet Google’s ad revenue seems to have remained unaffected.
There are millions of websites, and a local LLM cannot scrape them all to make sense of them. Think about it: OpenAI can do it because they spend millions training their systems.
Many sites have hidden sitemaps that cannot be found unless submitted to Google directly (most of the time they're not even listed in robots.txt). There is no way a local LLM can keep up with the up-to-date internet.
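When a sitemap is discoverable, extracting its URLs is the easy part; per the comment, finding it at all (beyond conventional spots like /sitemap.xml or a `Sitemap:` line in robots.txt) is what's hard. A stdlib sketch for the standard sitemap.org format:

```python
import xml.etree.ElementTree as ET

# Standard sitemap namespace, per sitemaps.org.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Extract page URLs from a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", NS)]
```

A crawler would feed each extracted URL back into its fetch queue, respecting crawl delays.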
I think because Google knows traditional search is gonna die, they will aggressively push ads on traditional search to extract as much money as possible until they figure out newer ways of making money.
This is a nice first step - web search makes sense, and it’s easy to imagine other tools being added next: filesystem, browser, maybe even full desktop control. Could turn Ollama into more than just a model runner. Curious if they’ll open up a broader tool API for third-party stuff too.
Hey! Author of the blog post here; I also work on Ollama's tool calling. There has been a big push on tool calling over the last year to improve the parsing. What are the issues you're running into with local tool use? What models are you using?
Based on the fact that there are very few up-to-date English-language search indexes (Google, Bing, and Brave if you count it), it must be incredibly costly. I doubt they are maintaining their own.
I'm looking to use web search in production, but they haven't mentioned the price. The only thing mentioned is $20/month, but how much quota does it include?
Sorry about this. We are working really hard on providing usage-based pricing.
During the preview period we want to start by offering a $20/month plan tailored for individuals. We are monitoring usage and making changes as people hit rate limits, so we can satisfy most use cases and be generous.
AgenticSeek, or you can get pretty far with a local Qwen and Playwright-Stealth or SeleniumBase integrated directly into your Chrome (running with the Chrome DevTools Protocol enabled).
I was pleasantly surprised by the model improvements when testing this feature.
For smaller models, web search can augment them with the latest data fetched from the web, solving the problem of smaller models lacking specific knowledge.
For larger models, it can start functioning as deep research.
Your regular reminder that you don't need Ollama to get a quick chat engine on the command line. You can do this with pretty much any major model on Hugging Face:

    pip install transformers
    transformers chat Qwen/Qwen2.5-0.5B-Instruct
I can imagine the reaction if the zero-retention provider backing them turned out to be Alibaba.
I wonder how they plan to monetize their users. Doesn't sound promising.
https://ollama.com/cloud
Crawling was tricky. Something like Stack Overflow will stop returning pages when it detects that you're crawling, much sooner than you'd expect.
It takes lots of servers to build a search engine index, and there’s nothing to indicate that this will change in the near future.
Like a full search engine that can visit pages on your behalf. Is anyone building this?
Or is this just someone trying to monetize Meta open source models?
https://github.com/ollama/ollama
Even with heavy AI usage I'm only at like 400/1000 for the month.
Dead on arrival. Thanks for playing, Ollama, but you've already done the legwork of obsoleting yourself.