Dawn of a new era in Search: Balancing innovation, competition, and public good

253 points | sbeckeriv | 1 year ago | blog.kagi.com

79 comments

cube2222|1 year ago

I've been using Kagi for a while (almost two years now!) and it's been nothing but excellent!

Lenses are very useful (Reddit lens is on every second search), and I personally really like the AI features they are working on.

The quick assist, triggered by a question mark at the end of a search query, makes a quick AI-generated summary of the top few results, and it is something I use constantly.

The new, more advanced assistant, which is able to do searches, can be constrained to lenses, and lets you pick an arbitrary model, is also excellent, and basically means I don't need a ChatGPT/Claude subscription, as Kagi covers it very well.

All in all, great product which I'm happy to pay for.

baybayblonde|1 year ago

What is "Lenses", is it the Google Lens?

Fire-Dragon-DoL|1 year ago

I don't have the advanced assistant. Is it only on ultimate?

adamcharnock|1 year ago

> The Google Search Index is a unique and irreplaceable resource within the digital ecosystem. Mandating fair access to it or treating it as an essential facility could address the core issues...

The article estimates the Google Search Index at 12.5PB. If Kagi thinks that is a big enough moat to be the primary target then, well, I suppose they should know. But I'm also skeptical. You could fit that on about 50 Hetzner SX295, so about $20k/month. Plus the cost of gathering the data. It is surely a huge resource.

But weighed against the combination of Google Search + AdWords + Android + YouTube + Chrome, all in a single company? To me a 12.5PB search index feels like small change in comparison.

NB: Happy Kagi-paying customer here.

freediver|1 year ago

> The article estimates the Google Search Index at 12.5PB.

I realize there was a mistake with the estimated number (thanks for pointing it out; it should be closer to 180 PB for raw crawl data). Since this is speculative and also does not account for the other data needed to actually rank pages, or the hardware to do it in under 500ms at a scale of billions of queries per day, it can be misleading about the true effort involved, so I edited that datapoint out of the article.

You are right, just crawling a large number of pages (millions, even billions) is indeed straightforward (e.g. [1]). The hard part is creating a searchable, web-scale index with a certain quality level, which is simply impossible to do anymore for many reasons that would require another article to explain. Microsoft spent $100bn and the last 20 years, by their own account, trying to match it, and most people agree it is still not even close. At some point you reach diminishing returns. To use the analogy from the article, it is akin to someone trying to rebuild all of the US railroad network today. Sounds plausible, but not really in practice. That train left the station in the early 2000s.

[1] https://michaelnielsen.org/ddi/how-to-crawl-a-quarter-billio...
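For anyone who wants to redo the storage arithmetic in this sub-thread, here is a rough sketch. It uses only numbers quoted above: 12.5 PB on "about 50" servers for "about $20k/month", and the revised 180 PB figure for raw crawl data. The per-server capacity and price are assumptions implied by those quotes (250 TB and $400/month), not values taken from any price list, and 1 PB is treated as 1000 TB. It covers raw storage only, not crawling, ranking data, or serving hardware.

```python
# Back-of-the-envelope storage sizing from the figures quoted in the thread.
# Assumption: per-server capacity and price are *implied* by the comment
# ("about 50" servers, "about $20k/month" for 12.5 PB), not real list prices.

QUOTED_INDEX_TB = 12_500      # 12.5 PB, with 1 PB = 1000 TB
QUOTED_SERVERS = 50
QUOTED_MONTHLY_USD = 20_000

TB_PER_SERVER = QUOTED_INDEX_TB // QUOTED_SERVERS             # 250 TB (implied)
USD_PER_SERVER_MONTH = QUOTED_MONTHLY_USD // QUOTED_SERVERS   # $400 (implied)


def storage_estimate(index_tb: int) -> tuple[int, int]:
    """Return (servers needed, monthly cost in USD) for an index of index_tb."""
    servers = -(-index_tb // TB_PER_SERVER)  # ceiling division
    return servers, servers * USD_PER_SERVER_MONTH


if __name__ == "__main__":
    for label, size_tb in [("12.5 PB", 12_500), ("180 PB", 180_000)]:
        servers, usd = storage_estimate(size_tb)
        print(f"{label}: {servers} servers, ${usd:,}/month")
```

Under those assumptions the 180 PB figure works out to 720 servers and $288,000/month for storage alone, which is still small next to the ranking and serving costs described above.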

IgorPartola|1 year ago

This puts it in enough perspective for me to ask: why doesn’t a university create a public/open source search index? Seems like a way to get a ton of attention.

Moreover, archive.org has all the data and data storage capabilities many times over. What prevents them from creating an open source search engine?

arnaudsm|1 year ago

I don't buy this number. Text-only Common Crawl is 20TB. Remove spam and dupes, and you're at under 10TB of current useful data, which you can parse and index on a single server nowadays.

It's the full Google index history with full HTML that is probably 12PB, but the useful part of the search engine is much smaller.

dmonitor|1 year ago

I assume that the major hurdle is not storing an equivalently-sized search index, but building one from scratch. Crawling takes time, and Google has had a many-year head start.

ldayley|1 year ago

I've been using (and occasionally paying for) Kagi on and off for a couple of years now. I truly think they're building something interesting and valuable! While I haven't agreed with every product decision they've made, the founder is very good at both understanding his business and explaining their decisions. This is a well-crafted explainer of the search business and the monopoly case, and much better for sharing with less tech-savvy peers than most mainstream media explainers on this subject!

Edit: wording

Edit 2: Can you imagine a world where Google's Internet Search Index is legally considered an "Essential Facility"!? https://law.stanford.edu/publications/essential-platforms/

somethoughts|1 year ago

I think it's instructive to look at the early history of Google and Facebook. In the early years they did not really turn on the ad revenue levers and just focused on increasing users (i.e. Don't Be Evil) - until a decade after offering their respective services.

Similarly Netflix is just now starting the ad revenue model after years of only subscription based services.

Eventually the temptation for multiple sources of revenue (i.e. subscription AND advertising) will likely be too great due to:

- IPO and Wall Street demands net income growth (i.e. FB/Google)

- Private Equity buys the company and needs to pay back leveraged debt

- The number of customers willing to offer up a credit card for Search stagnates and a lower cost ad tier appears and the ad infrastructure that is built is applied to the paid tiers

pennybanks|1 year ago

I honestly can't imagine paying for a search engine with what's out there for free

throwaway14356|1 year ago

> Google has built a massive index of the internet that covers close to 100% of the accessible web.

While their index (of other people's stuff) is enormous, it is far from including everything. It is easy to disqualify content, and people would be screaming if content farms were included. What even is a content farm nowadays? With LLMs, one can return a reasonable article for any query, rich in links to other pages that don't exist but could be indexed and are part of the accessible web.

If you make a new website with a few thousand pages and a few thousand images it takes quite a while for google to pick up the entire thing, if it even bothers to.

Google tries to fill the result page with a small subset of websites. A good thing for users most of the time and the easiest ad money, but horrible for new players.

It used to be quite common for bloggers (and others) to follow everything written about them or of interest. Google (blog search) and Technorati were very useful for that kind of discovery.

The average user might never have noticed that but when it was killed off the www stopped being a community.

We can pretend the index is still there. If you can't get to it, it's much like the LLM content.

pierrefermat1|1 year ago

It seems like in their mind the snake ate its tail without realising, so they don't even know what's outside of Google's index.

rychco|1 year ago

I’ve been a happy Kagi customer for probably 2 years(?) now. Highlights for me, as a professional:

- I can blacklist low-value domains (such as geeksforgeeks) that dominate the top of many programming searches.

- I can increase/decrease the priority of domains or pin domains to the top of searches, such as official documentation for languages/libraries.

- I can use “Lenses” to filter results for programming/academic/forum results.
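The three features in the list above (block, raise/lower, pin) can be pictured as a per-user re-ranking pass over scored results. This is only an illustrative sketch, not how Kagi implements it: `DomainRules`, `rerank`, the scores, and the example domains other than geeksforgeeks are all made up for the example.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class DomainRules:
    """Hypothetical per-user preferences, keyed by bare domain name."""
    blocked: set[str] = field(default_factory=set)
    pinned: set[str] = field(default_factory=set)
    weights: dict[str, float] = field(default_factory=dict)  # >1 raises, <1 lowers


def domain_of(url: str) -> str:
    """Return the host of a URL without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def rerank(results: list[tuple[str, float]], rules: DomainRules) -> list[str]:
    """Drop blocked domains, scale scores by weight, and put pinned domains first."""
    kept = []
    for url, score in results:
        domain = domain_of(url)
        if domain in rules.blocked:
            continue
        adjusted = score * rules.weights.get(domain, 1.0)
        kept.append((domain in rules.pinned, adjusted, url))
    # Pinned results sort ahead of everything else, then by adjusted score.
    kept.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [url for _, _, url in kept]


if __name__ == "__main__":
    results = [
        ("https://www.geeksforgeeks.org/x", 0.9),
        ("https://stackoverflow.com/q/1", 0.8),
        ("https://docs.python.org/3/", 0.5),
        ("https://example.com/post", 0.7),
    ]
    rules = DomainRules(
        blocked={"geeksforgeeks.org"},
        pinned={"docs.python.org"},
        weights={"example.com": 0.5},
    )
    for url in rerank(results, rules):
        print(url)
```

With these made-up rules the blocked domain disappears, the pinned documentation moves to the top despite its lower score, and the down-weighted domain falls to the bottom.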

abcdefg12|1 year ago

The big question is: wtf, why isn’t Google doing the same? Instead they keep removing features and dumbing down their product.

baggachipz|1 year ago

This essentially advocates for the same thing defined in Cory Doctorow's The Internet Con: How to Seize the Means of Computation. That point being, requiring open protocols from big tech will enable competition and innovation. This will return the creative inspiration to the technologists. I completely agree with it, and I hope we are reaching an inflection point where walled data gardens are cracked open.

chrisweekly|1 year ago

I switched from Google to DDG for default search a few years ago, and then to Kagi maybe 18 months ago. Kagi's simply excellent.

andrewstuart|1 year ago

Who knows what websites and pages are even on the web?

There's no index to the web that I know of apart from Google and DuckDuckGo and maybe this Kagi thing.

I want to explore the web - surely search isn't the only way to use the web?

I imagine it could be fun to explore the web through lists and graphs of interest, where I can hop from here to there via lists of links, graphs, nodes or something.

Does anyone know of anything like this?

1vuio0pswjnm7|1 year ago

There have been some websites in the past that allowed one to browse www content by IP address, covering what seemed to be the full range of IPv4 address space. For example, a page with a list of IP address ranges where each address range is a hyperlink. One could then drill down by following hyperlinks to a specific IP address and view whatever was hosted at that address (default host in the case of virtual hosting). Not sure why these websites do not persist. Quite useful. IMHO.

DNS zone files are a decent starting point for exploring the web. Not every registered domain name has an associated website but most do. The largest zone files are available to the public for free.
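As a rough illustration of the zone-file idea, the sketch below pulls the distinct delegated names out of zone-file text. It assumes the simplest layout, one record per line in the form `owner TTL class type rdata` with fully qualified owner names; real zone files can omit fields or use directives, so treat this as a starting point. The function name and the sample records are made up for the example.

```python
from typing import Iterable


def registered_domains(zone_lines: Iterable[str], tld: str) -> list[str]:
    """Return distinct domains directly under `tld` that have NS records.

    Assumes each record is one line: owner TTL class type rdata.
    """
    suffix = "." + tld.lower().strip(".") + "."   # e.g. ".com."
    seen: set[str] = set()
    ordered: list[str] = []
    for raw in zone_lines:
        line = raw.split(";", 1)[0].strip()       # drop comments and blanks
        fields = line.split()
        if len(fields) != 5 or fields[3].upper() != "NS":
            continue
        owner = fields[0].lower()
        if not owner.endswith(suffix):            # skips the TLD apex itself
            continue
        # Keep only the label directly below the TLD.
        label = owner[: -len(suffix)].split(".")[-1]
        domain = label + suffix.rstrip(".")
        if domain not in seen:
            seen.add(domain)
            ordered.append(domain)
    return ordered


if __name__ == "__main__":
    sample = """\
; made-up sample records
com. 172800 IN NS ns.registry.example.
example.com. 172800 IN NS ns1.example.net.
example.com. 172800 IN NS ns2.example.net.
another-example.com. 172800 in ns ns1.another-example.com.
ns1.another-example.com. 172800 IN A 192.0.2.1
""".splitlines()
    for name in registered_domains(sample, "com"):
        print(name)
```

The output is only a list of candidate names. As noted above, not every registered domain has a website, so each one still has to be probed.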

AndroidKitKat|1 year ago

While not quite what you're looking for, Kagi has a "Small Web" feed of sites that are semi-curated blogs. [1][2] I don't know how often it is updated, but I like to poke around every now and then to see what's going on in people's corners of the internet.

[1] - https://blog.kagi.com/small-web [2] - https://kagi.com/smallweb

ColinHayhurst|1 year ago

Google indexes widely. DuckDuckGo and Kagi have small specialised indexes and as such rely on the larger indexes like Google, Bing and Mojeek. DuckDuckGo used to use Yandex. More information here: https://www.searchenginemap.com/

firecall|1 year ago

Webrings?

There was an attempt to push a return of Webrings I think...

But funnily enough, whilst browsing this thread, https://news.ycombinator.com/item?id=41389642

I commented to my colleagues:

Remember when people would find good websites and share them!

madrox|1 year ago

I think Kagi is correct and that the way we explore information on the internet will look very different in X years with all the changes LLMs will bring. I think the real question will be what will it look like.

I don't think it looks like search today. Google got where they were because they were 10x better than everything else and had an experience focusing on what mattered at the time. I don't think the 10x experience will look like ten blue links. I don't know what that next experience is, but I'll know it when I see it.

icar|1 year ago

I wish Kagi was cheaper. It's expensive in many countries and it doesn't help that we are bombarded with subscriptions everywhere.

bugtodiffer|1 year ago

I wish Kagi would use the money to build a search engine.

But instead they use it to build a browser no one wants and an email service no one needs.

Yet their search website is still broken here and there...

poikroequ|1 year ago

> Apple has stated that Bing does not match Google’s search result quality, and they are unwilling to compromise on user experience by offering subpar results.

I wouldn't take this statement at face value. This is most likely a BS PR excuse for Apple to maintain their current deal with Google. I wouldn't expect anything less from any large corporation looking to protect $20 billion in annual revenue.

jpc0|1 year ago

Universally, every single time I've used Bing, including this week, my response was to scroll through the entire first page of results, swear, open Google, and find that with the exact same search params what I wanted was the first result.

Bing is objectively worse.

skinkestek|1 year ago

Last I tried at least:

- Current Google was bad compared to original Google (it ignores my keywords, even if I use double quotes and verbatim)

- DuckDuckGo and Bing managed to be worse

- Kagi is like old Google

moonlion_eth|1 year ago

Paying kagi user. All day every day

freefaler|1 year ago

I've used Kagi for a year and it was better than Google for most of the searches.

However I think the model will be changed to something more like Perplexity.ai

I've switched to Perplexity and for most of the searches it works better than Kagi.

They'd need to add something like this to survive in the long run, because for exploratory searches tools like Perplexity are really good.

Terretta|1 year ago

Perplexity lets a firm switch on SSO and give Perplexity to employees without a big barrier to entry. So, we bought it for our employees and if they use it great, if they don't great. Even though we're a small startup, this is true of almost any SaaS we find that lets us control the login to stay regulation compliant. If you like it, and show how it helps your day, and the SaaS lets us control the login, we'll "just pay for it".

We will not, however, pay some four or five figure SSO tax for every SaaS. We'd be bankrupted.

Kagi should do this, or at least enable domain-specific OIDC/OAuth2 (the ubiquitous "Sign in with" or "Continue with" buttons like http://xsplit.com/user/auth or http://id.atlassian.com/login, since MSFT and Google accounts hit almost all businesses) and then just bill the same as the individual pro price + usage pricing.

As it stands, we reimburse employees who buy Kagi individually, this costs us more than the cost of Kagi, and means it's only one off.

PS. Don't get me started on the macOS and iOS apps that have no retail price version available. Apple provides no way for a firm to provide employees with IAP subscription apps, whether BYOD or managed devices. We can, and do, provide any retail priced app for both BYOD and MDM. It shows up in a catalog on the device, people install it, you get a retail sale. Thank you to those devs who make a retail version available, even if it's 2x - 5x the annual cost. Empowering employees with apps is a no-brainer if devs just let firms pay them to do it.

aDyslecticCrow|1 year ago

AI is expensive to host, and I don't trust their puke. I'd rather get a classic result page. Kagi is great.

blackeyeblitzar|1 year ago

As someone who hasn’t used paid search services, what sort of problems has this solved for other HN users that make it worth it? How does it compare to “AI” based search tools like Perplexity?

pbronez|1 year ago

For me, Kagi is super fast and provides high quality, customizable results. Web search Just Works. When I use a new computer/browser and accidentally search with Google, it is a viscerally unpleasant surprise.

As for Perplexity - I got a free year of Perplexity because I bought a Rabbit R1. I tried it, wasn’t impressed. I use Kagi’s AI assistant all the time. It’s my primary way of getting information from the web. I just type a free-form question into my address bar, append !expert for general questions or !code for technical ones, and seconds later my question is answered and I’m back to work.

nucleardog|1 year ago

I use the internet basically as an extension of my brain. There's very little barrier between having a thought and seeking information. Search is the usual way those two come together.

Google, these days, seems to mostly ignore whatever I've tried to search for and instead return results that I'd call "more popular". So the top results are mostly generic, useless results and below that it's mostly blog spam or wildly unrelated things.

This is especially bad when I'm looking for specific technical documentation or trying to understand unusual or obscure problems. Usually it's returning nothing useful at all.

Kagi returns results actually related to what I'm looking for more often than not.

The thing that convinced me to pay for it was a single search. I kept hitting _something_ that was causing the Apple TV to stop showing how much time was remaining in a show and instead show something else.

I went to Kagi and searched "netflix apple TV showing wrong time remaining" (my incomplete but best understanding of the problem at the time). Kagi surfaced a result that explained what this was and how it was getting triggered as the fourth result.

I went back to Google and searched the same. Top result was "If you can't change the time or time zone on your Apple Device" from Apple. Second was "Netflix audio is out of sync" from Netflix. With the benefit of knowing what the answer was, I did find a single relevant result about 25 results down mixed in with some blog spam on removing a show from "Continue Watching" on Disney Plus, listicles on hidden ways to make the Apple TV app on your phone even better, and a link to a Google Books copy of a 2008 Men's Health Magazine (?!).

Every time I accidentally end up back on Google it's... jarring to say the least.

endisneigh|1 year ago

A smaller competitor wants and advocates for itself. Makes sense but is it really surprising? It would be strange otherwise.

I do wonder how far one can get charging for search.

mediumsmart|1 year ago

Now it dawns on me that in this new era I can write an article for the public good and, by balancing the wording, I am able to place it before the 3852 competing public good articles on the search results page. Let's innovate.

abtinf|1 year ago

This is an extremely disappointing post. In the past, I’ve enthusiastically supported and advocated for Kagi.

But Kagi advocating for using force to destroy its competitors is completely unacceptable to me and an admission that they do not believe they have a viable product.

Antitrust law is arbitrary and evil. If you make more money than your competitors, you have undue market power. If you price below your competitors, you are dumping. If you price the same as your competitors, you are colluding. The whole thing is a naked power grab by politicians and inferior companies.

This is a sad day. Kagi is the best thing that’s happened to the internet in the past decade. And now I have to stop my auto renew.

kingstoned|1 year ago

I don't see how Kagi are advocating to destroy Google and especially competitors. They write about potential actions the government might take to make the field more competitive.

Keep in mind that it's Google that was government-funded when it got started, via the NSF and the university system. Also, a ton of subsidies, tax breaks for their subsidiaries, direct payments from government (e.g. google cloud gov contracts, military recruitment ads on youtube...).

baggachipz|1 year ago

They're not advocating for the destruction of competitors. They're arguing FOR competitors. Google has already been ruled an illegal monopoly, and now it's time to figure out what to do about it. Kagi is saying that rather than split up products, require the protocols to be open and usable for all. That's it.

freediver|1 year ago

> But Kagi advocating for using force to destroy its competitors is completely unacceptable to me

Everyone is entitled to their own interpretation, but that is not what the article advocates at all. The article is about what is best for the user given the circumstances, where all other proposed remedies have focused on how to hurt Google, which the article argues to be counter-productive.

The ruling has already been made and a remedy will be chosen whether we agree with the ruling or not - so which one is the best for the users? The solution that is proposed in the article would actually mean increased competition in the space, including to Kagi.

Destiner|1 year ago

i feel bad for kagi/ddg

they are still trying to fight the google by building pretty much the same product

while perplexity is obviously in the lead by being ai-first

freediver|1 year ago

For many people (me included) being AI-first is a bug, not a feature :)