Ask HN: Are we sure LLMs are that useful in a web search application?
105 points | duringmath | 3 years ago | reply
Instant answers, or whatever they're called, already produce direct answers, plus they cite sources and provide links, which is what everyone seems to think is the solution to the "LLMs make stuff up" problem.
Not to mention they're faster and cheaper to run.
The only truly practical use case I can think of is summarizing articles or writing them, which makes more sense as a word processor or browser add-on.
[+] [-] agentultra|3 years ago|reply
People who want to get rich will tell you it's the next greatest thing that will revolutionize the industry.
Personally, I've been annoyed at how confidently wrong ChatGPT can be. Even when you point out the error and ask it to correct the mistake it comes back with an even-more-wrong answer. And it frames it like the answer is completely, 100% correct and accurate. Because it's essentially really deep auto-complete, it's designed to generate text that sounds plausible. This isn't useful in a search context when you want to find sources and truth.
I think there are useful applications for this technology, but I think we should leave that to the people who understand LLMs best and keep the charlatans out of it. LLMs are really interesting and have come a long way by leaps and bounds... but I don't see how replacing entire institutions and processes with something that is only well understood by a handful of people is a great idea. It's like watering plants with Gatorade.
[+] [-] rvz|3 years ago|reply
Reminds me of the web3 crypto hype, and this hype is the same thing that happened with GPT-3: a closed-source, black-box AI model hidden behind a SaaS, confidently generating wrong answers as the truth.
Sounds like OpenAI (effectively Microsoft's new AI division) are selling snake-oil again.
> I think there are useful applications for this technology but I think we should leave that to the people who understand LLM's best and keep the charlatans out of it.
There are certainly charlatans, grifters and snake-oil salespeople hyping their so-called 'AI startup' when it actually just uses the GPT-3 or ChatGPT API. Another emperor-has-no-clothes confidence trick generating garbage.
Given that ChatGPT cannot transparently explain its own answers when questioned, especially when it confidently generates wrong answers, you cannot trust its output. Asking it for a sophisticated answer to an 'un-googleable' question is where you see it clearly trip up.
What would really 'revolutionize the industry' is an open-source LLM that is smaller and more transparent than ChatGPT, like what Stable Diffusion did to DALL-E 2.
[+] [-] bbor|3 years ago|reply
Also OP (not so much you) is way too caught up in the “chat” aspect - that is the first exciting UX that got popular, but these are much, much more than chatbots. Pretending that they’re human/conscious/in a conversation is fun, but having an invisible partner that knows you and tailors your search results… that’s powerful.
For example, you’ll never have to add “Reddit” again, or at least you’ll only have to tell it once. An LLM can easily identify the kind of questions where you want forum posts, read thousands of posts in a second, summarize all their content, and label each link with other information that helps you decide which threads to read in full.
I can’t wait!
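The "never have to add Reddit again" idea could be sketched as a tiny query-rewriting step. This is purely illustrative: a keyword stub stands in for the LLM intent classifier, and the hint list and `site:` rewrite are assumptions, not any real product's behavior.

```python
# Toy intent detector standing in for an LLM classifier.
FORUM_HINTS = {"best", "recommend", "opinions", "experiences", "worth"}

def wants_forum_posts(query: str) -> bool:
    """Guess whether the user wants forum threads rather than articles."""
    return any(word in FORUM_HINTS for word in query.lower().split())

def rewrite(query: str) -> str:
    """Append a site filter once forum intent is detected, so the user
    never has to type it themselves."""
    return f"{query} site:reddit.com" if wants_forum_posts(query) else query

print(rewrite("best budget mechanical keyboard"))  # gets the site filter
print(rewrite("python sort a list"))               # left untouched
```

A real version would also do the summarize-and-label step per result, which the stub obviously skips.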
[+] [-] marban|3 years ago|reply
The nice way of saying grifters.
[+] [-] shanebellone|3 years ago|reply
This is the most accurate description I've ever read.
[+] [-] 725686|3 years ago|reply
That also happens with real people, so... A web search also returns wrong answers, because it is not magic. It just searches through all the garbage out there. You just have to be aware of its flaws and limitations... as you do with your fellow humans.
[+] [-] visarga|3 years ago|reply
Ideally Google search would have a flag to "follow my intent to the letter" and return empty if nothing is found. When you are searching for a specific thing, a response with other things feels like bullshit, Google trying to milk more clicks wasting your time. I don't mean exact phrase search, I mean exact semantic search.
This causes issues when searching for bug fixes, where it ignores the exact version; when shopping, where it ignores some of your filters and responds with bad product suggestions; and when searching for something specific that looks like some other popular keyword, where it gives you the popular one, as if it has an exclusion zone and you cannot search for anything else around it.
"Minimum weight of a scooter with back suspension" -> matches information about carrying capacity. Of course more people discuss max passenger weight than minimum scooter weight, but I really don't care about the other one.
[+] [-] kurthr|3 years ago|reply
That makes the response more current than the last LLM crawl & tune, allows a further directed search based on the new parameters, and provides a traceable path to sources and citations. If it finds that there's no Python library with that name, then iterate.
Basically, an LLM should have a very good idea of what good search terms are for a topic, and where to find the information, whereas I might not know the acronyms, jargon, related fields, or optimal answers.
Yes, this is getting pretty close to writing an 8th grade class report that covers anything on the web. That's about where these seem to be.
*ps I copied this from a post I made yesterday*
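The loop described here (LLM proposes search terms, a normal search runs them, iterate when nothing verifies) might look roughly like the sketch below. Everything in it is a stand-in: `propose_terms`, `web_search`, and `verify` are hypothetical callables, and "frobnicate" is a made-up library name.

```python
def search_with_llm(question, propose_terms, web_search, verify, max_rounds=3):
    """Iteratively let an LLM refine search terms until results check out."""
    feedback = ""
    for _ in range(max_rounds):
        terms = propose_terms(question, feedback)
        results = web_search(terms)
        if verify(results):
            return results  # traceable: actual links, not generated text
        feedback = f"nothing verifiable for: {terms}"  # e.g. no such library
    return []

# Toy stand-ins so the sketch runs end to end.
propose = lambda q, fb: "pypi frobnicate" if not fb else "python frobnicate alternative"
fake_web = {"python frobnicate alternative": ["https://example.org/alt"]}
hits = search_with_llm("is there a library called frobnicate?", propose,
                       lambda t: fake_web.get(t, []), bool)
```

The point of the structure is that the final answer is a list of retrieved links, which keeps the path to sources traceable.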
[+] [-] syats|3 years ago|reply
Among other things, an LLM can be seen as a store which you query and get results from. A chatbot is cute because it formats output text to look like conversation, and the recent applications are nice because the query (now known as a prompt) can be complicated and long, and can influence the format and length of the results.
But the cool stuff is being able to link the relatively small amount of text you input as a query, into many other chunks of texts that are semantically similar (waves hands around like helicopter blades). So, an LLM is a sort of "knowledge" store, that can be used for expanding queries, and search results, to make it more likely that a good result seems similar to the input query.
What do I mean by similar? Well, the first iteration of this idea is vector similarity (e.g. https://github.com/facebookresearch/DPR). The second iteration is to store the results in the model itself, so that the search operation is performed by the model itself.
This second iteration will lead, IMHO, to a different sort of search engine. Not one over "all the pages", as (in theory at least) Google and the like currently work. Instead, it will be restricted to the "well-learnt pages": those which, because of volume of repetition, structure of text, or just availability to the training algorithm, get picked up and encoded into the weights.
To make an analogy, it's like asking a human who the Knights of the Round Table are and getting back the usual "Percival, Lancelot and Galahad", but only because the other thousand knights mentioned in some works are not popular enough for that given human to know them.
This is a different sort of search engine than we are used to, one which might be more useful for many (most?) applications. The biases and dangers of it are things we are only starting to imagine.
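The "first iteration" above, plain vector similarity, reduces to cosine scores between a query embedding and passage embeddings. The vectors below are toy stand-ins; a real system would get them from a trained bi-encoder such as DPR.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def rank(query_vec, passage_vecs):
    """Return passage indices sorted by similarity to the query, best first."""
    scores = [cosine(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: -scores[i])

query = [1.0, 0.2, 0.0]            # stand-in for an encoded query
passages = [
    [0.9, 0.1, 0.1],               # semantically close passage
    [0.0, 1.0, 0.9],               # unrelated passage
]
order = rank(query, passages)      # closest passage first
```

The "second iteration" has no such tidy sketch: the ranking happens inside the model's weights rather than in an explicit score loop, which is exactly why it is restricted to what the weights learnt.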
[+] [-] basch|3 years ago|reply
First, the human input is extremely flexible and can include instructions. It is natural-language programming.
Second, the "conversation" has state. I can give an instruction, and then a follow-up instruction that adds to the first. Someday down the road there will be two states: your account state (instructions you taught it that it retains as long as you are logged in; maybe my account could have multiple state buckets/buildings I can enter, each with its own set of rules, which you could call programs or routines: "computer, execute study routine") and temporary state (instructions it retains only for the duration of the conversation/search).
The exciting part here is being able to query data and manipulate it in memory: making a search, refining the search, redirecting the search in a different direction when it's not working. That collaborative, iterative type of search doesn't really exist at the moment. I can't tell Google "the results you just returned are garbage, here is why, try again."
It is more like a fuzzy command line. The chatbot-ness is just a layer of cute on top that isn't completely necessary.
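A minimal sketch of those two state layers: account state persists across sessions, temporary state lasts one conversation, and both get folded into the prompt. The function name and prompt format are made up for illustration.

```python
def build_prompt(account_state, session_state, user_message):
    """Compose instructions from both layers; account-level rules come first,
    session-level rules can refine or override them."""
    lines = [f"Instruction: {rule}" for rule in (*account_state, *session_state)]
    lines.append(f"User: {user_message}")
    return "\n".join(lines)

account_state = ["Always prefer metric units."]   # kept while logged in
session_state = ["Answer tersely for now."]       # dropped when the chat ends
prompt = build_prompt(account_state, session_state, "How tall is Everest?")
```

Swapping `session_state` per conversation while keeping `account_state` fixed is all the "two buckets" idea amounts to at this level.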
[+] [-] softwaredoug|3 years ago|reply
However, it's been observed that people are using chatbots for informational searches: the kinds of searches where you want to learn about a specific fact. This isn't all searches, but it's an important subset of web search. For better or worse, people perceive, probably rightly, that this is how people will seek information, even with a high degree of inaccuracy.
There are also the generative use cases: "write me a program that does X". Is this something people would use a search bar for? We don't know, and won't know, until it's out there for a while.
For the longest time, the one natural-language interface was the search bar. So search vendors surmise it's important both to defend their turf and to give regular users a natural way to become familiar with this kind of informational interaction...
[+] [-] gremlinsinc|3 years ago|reply
Instead it was all good in 20 minutes.
Personally, I think Microsoft has a HUGE opportunity here: imagine if the OpenAI PRO account were actually free, but only in the Edge browser or on bing.com. It'd be a loss leader for a while, because a lot of compute is required, but it'd also help unseat Google. Google's only chance is if LaMDA/Bard is better, and so much better that it's noticeable from day one.
Their search results are so horrible lately I don't know anybody who isn't 'shopping around' right now for something better. Bing, DDG, Brave, Neeva, You.com, etc.
[+] [-] trieste92|3 years ago|reply
Oh God. I seriously, seriously hope not. I can imagine some coasters I work with shitting out source code this way.
Anyone who trusts an LLM and commits its output to a code base that other people work on... I would never want to work with someone with that attitude. Physically writing code takes far less time than understanding code.
[+] [-] ergonaught|3 years ago|reply
1) Integration with voice assistants. Links/sources are irrelevant.
2) Models tuned against a particular body of work don't care if links go stale, or websites get Slashdotted/HackerNews-hugged/DDoSed, or etc. Links/sources are irrelevant.
3) Inbound "service requests" processed by something that can better understand the question and the available answers/solutions. Links don't matter much.
4) When "Okay what are some good websites to read more about this?" can be answered, too, bang.
5) Ever asked somebody a question and just rolled with their answer instead of demanding citations? I mean, you're doing it here. So, again, yes.
[+] [-] verdverm|3 years ago|reply
ChatGPT will already do this: when you get an answer you want to learn more about, ask
"where can I learn more about this?"
"can you provide links?"
there are a number of prompts that work, you can even provide it with an AND clause in the first prompt
[+] [-] anonyfox|3 years ago|reply
Another point is that I am either creative or productive at any given time, but never both... though I'm at least aware of which state I'm in. ChatGPT has proven surprisingly good at taking over the other part. I.e.:
- when I am in a productive mood and stumble upon a thinking problem, generative AI is like on-the-spot creativity for "good enough" solutions, like naming a programming thingy or writing some filler text around a few keywords instead of me looking for words.
- when I am in a creative mindset, I increasingly feed code snippets into the bot and ask questions to "fill in the gaps": writing a specific function using library X, then writing documentation explaining how it works, then writing some unit tests. Sometimes I even derail a bit and let the bot explain parts that stand out in some way, so I can maybe learn a trick.
... And I've already used ChatGPT in kinda-emergency situations, like when I knew 5 minutes in advance that I had to speak in front of a crowd/in a meeting, and it gave me extremely useful outlines to adapt quickly, even in a panicked state of mind. The structure it provided sounded okay-ish and calmed me down, and for that it doesn't really matter if the response is right or wrong.
[+] [-] PaulHoule|3 years ago|reply
Generating bullshit text off the cuff is not the only use of LLMs. LLMs can perform very well at classification, regression, ranking, coloring proper names red, and other tasks. You could, for instance, use an LLM to encode a query and documents and rank them with a siamese network, something not too different from how a conventional search engine works.
If there is one thing wrong with the current crop of LLMs, it is that they can only attend over a limited number of tokens. BERT can attend over 512 tokens, ChatGPT over 4096, where a token is shorter than a word on average. It's easy to process the headline of an HN submission with BERT, but I classify a few hundred abstracts of scientific papers a day. A long abstract is about 1800 words, which is too much for Longformer but would fit in ChatGPT if there aren't too many $10 words.
Unless you can recast a problem as "does this document have a short window in it with this attribute?" (maybe "did the patient die?" in a clinical case report or "who won the game?" in a sports article) there is no way to cut a document up into pieces and feed it into an LLM, then combine the output vectors in a way that doesn't break the "magic" behavior the LLM was trained to do.
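The window recast in the last paragraph can be sketched directly: slice the document into overlapping windows, run the per-window judgment, and OR the results. The toy classifier below stands in for an LLM call, and the tiny window size is only for illustration.

```python
def windows(tokens, size, stride):
    """Yield overlapping token windows, always covering the document's tail."""
    for start in range(0, len(tokens), stride):
        yield tokens[start:start + size]
        if start + size >= len(tokens):
            break

def document_has_attribute(tokens, window_judge, size=512, stride=256):
    """OR-combine per-window judgments. This only preserves the model's
    behavior when the attribute is detectable within one short window."""
    return any(window_judge(w) for w in windows(tokens, size, stride))

# Toy stand-in for an LLM asked "did the patient die?" about each window.
judge = lambda window: "died" in window
report = "patient admitted with fever stable for days then died overnight".split()
flag = document_has_attribute(report, judge, size=4, stride=2)
```

For attributes that depend on the whole document at once, no such per-window combination of outputs recovers what the model was trained to do, which is the point being made.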
You'd imagine ChatGPT would produce accurate results if you could tell it "Write a review of topic T with citations", but if you try that you'll find it will write citations that look real but don't actually exist if you look them up. You'd imagine at minimum that such a system would have to read the papers it cites, maybe being able to attend over all of them at the same time, which would take an attention window 100-1000x larger.
That's by no means cheap and it might be Kryptonite for Google in that Google's model involves indexing a huge amount of low quality content and financing it by ads that are a penny a click. A business or individual might get a larger amount of value per query out of a much smaller task-oriented document set.
[+] [-] h2odragon|3 years ago|reply
When you're hunting for a particular fact, like "that bit of code I half remember seeing on a page 15 years ago", I don't see anything for an LLM to add. Google had a pretty good index for that purpose about 15 years ago, but they've chosen to prioritize other goals since then. I dunno if anyone treats "find the things you're searching for" as a market now.
Which is an answer to your question: Does it matter if an LLM helps search the web? That's not what people are doing, that's not what these companies are selling.
[+] [-] fnordpiglet|3 years ago|reply
LLMs have a chance to offer an oracle that doesn't answer in cryptic or evasive ways, but just attempts to give an answer. The hallucinations are a huge flaw, but one that I'm confident will be addressed with other, non-LLM AI approaches. But it's the right user interface for answering questions: it answers questions with answers, not a pile of potentially relevant documents to sort through. The augmentation with citations, especially if they're semantically rather than symbolically relevant, is a huge plus.
[+] [-] fauigerzigerk|3 years ago|reply
https://you.com/search?q=youchat&tbm=youchat
[+] [-] remarkEon|3 years ago|reply
I don't know if that strictly complies with your definition of "web search application". It's definitely going to save time for me, and not seeing a bunch of ads during the process is wonderful - to the point that I really could see myself paying for it if they decide to go that route and take away the "free" version.
[+] [-] JohnFen|3 years ago|reply
When I think "search engine", I think of something that searches for web pages. What I want as a result isn't an answer, meal plan, grocery list, or any of that higher-order stuff. I want a list of websites.
This reinforces my fear that this sort of approach might kill actual search engines. That would be a pretty serious loss to me, and a further reduction in the usefulness of the web. But saying that is not disparaging things like ChatGPT at all. They address a different, and valid, need.
I just hope we don't lose an important tool in the process.
[+] [-] thesuperbigfrog|3 years ago|reply
The basic idea was to have enough metadata about web sites so that you could get programs to do something approaching Prolog-style reasoning about the content and meaning of the web pages.
With more advanced LLMs, it looks like a slightly different approach to achieve something like the semantic web idea.
I think the idea is to constantly feed the model with updates from crawling the web and have the LLM "digest" the content, apply some filters to remove bad stuff, and then provide a meaningful result to whatever queries it might be asked.
[+] [-] PaulHoule|3 years ago|reply
The intellectual north star of this is
https://en.wikipedia.org/wiki/SAT_solver
and might be a better model than the "semantic web", in that the semantic web has revolved around description logics like OWL, which implement a limited subset of first-order logic that is always decidable. One trouble is that OWL-based systems can't handle arithmetic at all: given a height and a weight, you can write a rule that says anybody over 6 feet is tall, but you can't say a person with a BMI over 30 is obese, because OWL can't add, multiply and divide. Something a little more expressive, which searches efficiently in practice because it gets "hunches" from a neural net, could be useful, but it will not evade the limits on computation and logic that Gödel, Tarski and Turing warned you about.
[+] [-] dboreham|3 years ago|reply
That idea was the basis for the current generation of Google search (see: Metaweb).
[+] [-] louthy|3 years ago|reply
The issue I see with the chat approach is trust. I've seen so many examples of these models just making shit up now that I reckon regular use of them will eventually lead to mistrust between the human and the chat-bot. If you can't trust the answers and have to go and check yourself, it's dead as an idea IMHO.
[+] [-] stevenbedrick|3 years ago|reply
> Search systems, like many other applications of machine learning, have become increasingly complex and opaque. The notions of relevance, usefulness, and trustworthiness with respect to information were already overloaded and often difficult to articulate, study, or implement. Newly surfaced proposals that aim to use large language models to generate relevant information for a user’s needs pose even greater threat to transparency, provenance, and user interactions in a search system. In this perspective paper we revisit the problem of search in the larger context of information seeking and argue that removing or reducing interactions in an effort to retrieve presumably more relevant information can be detrimental to many fundamental aspects of search, including information verification, information literacy, and serendipity. In addition to providing suggestions for counteracting some of the potential problems posed by such models, we present a vision for search systems that are intelligent and effective, while also providing greater transparency and accountability.
Shah & Bender (2022) "Situating Search". In Proc. CHIIR '22 https://dl.acm.org/doi/abs/10.1145/3498366.3505816
[+] [-] ESMirro|3 years ago|reply
- The Metaverse was going to change the internet.
- The Internet of things was going to revolutionise our homes and cities.
- Self-driving cars would change the motor industry.
- Crypto is the next big thing in finance.
- NFTs are going to revolutionise digital ownership.
Most of those technologies have introduced niche or useful applications, but the ridiculous, breathless hype about how this technology is the one to change everything is getting more predictable and increasingly frustrating.
[+] [-] esperent|3 years ago|reply
> every single "AI" search I have tried has given me incorrect, invented answers.
Really? Every single one? Not a single AI (ChatGPT, presumably) query you've tried has given you anything except incorrect information? What kind of things have you been querying for?
[+] [-] ankit219|3 years ago|reply
For a quick overview answer, LLMs are great. It's not 100% correct, but mostly it is, and that is good enough for a quick answer. Currently Google tries to show that, and people object that it is stealing traffic from websites. I just need an answer, a coherent, usable one. E.g.: "What were the movies Scorsese got an Oscar nomination for?"
For suggestions, LLMs are just one more of those blogs and listicles that already show up in search (if the LLM is up to date, that is). The difference is that an LLM would customize the answer to the query, unlike pre-written content. So, yes, useful. Same goes for stuff like "how to build an email list?" or "What is an effective sales strategy?"
For research, Google is more useful. I think we have all done that.
Another application, not realized until now because we never did it before, is the ability to ask follow-up questions (which a chat format enables well). Suppose you get an overview of how a quantum computer works; it would take a lot of effort to ask a follow-up question and get a direct answer via a search engine. E.g.: "Why is there no point in going beyond a thousand qubits?"
There could be modifications like voice-to-text (a Jarvis-like interface), or a personal-assistant thingy. But those are far-fetched.
It will help immensely, and for places it does not, we will still google like we have done before.
[+] [-] graypegg|3 years ago|reply
While what they’re doing currently isn’t perfect, it does provide results that are at least traceable. I could imagine an alternate universe where they doubled down on marketing themselves as “the search engine that doesn’t lie to you” or “where answers are found, not stories”.
[+] [-] danso|3 years ago|reply
On the other hand, I do agree with people speculating that LLM-AI interfaces will seriously hurt Google's bottom line, e.g. reducing the space for search ads, which represent the majority of its revenue.
[0] https://www.nytimes.com/2022/09/16/technology/gen-z-tiktok-s...
[+] [-] busyant|3 years ago|reply
Mostly this is just to calm me down because ChatGPT gives me the illusion that I'm interacting with a human. The current voice systems are infuriatingly bad.
It would be nice if CVS's phone system would actually listen to me and modify its output accordingly. "I already gave you my birth date. And NO, I don't need a COVID booster."
edit: I'd like to meet the person who sold CVS its prescription web-site and its voice system. Simply to marvel at them and the swindle they pulled off, delivering absolute trash and probably walking away with a king's ransom.
[+] [-] ramesh31|3 years ago|reply
But at least they're deterministic and finite. I imagine ChatGPT-like results from a phone system would be even worse. LLMs are incredible when you can massage the answer you want out of them in a feedback loop, but not so much for automated systems.