Ask HN: Are we sure LLMs are that useful in a web search application?
105 points | duringmath | 3 years ago | reply
Instant answers, or whatever they're called, already produce direct answers, plus they cite sources and provide links, which is what everyone seems to think is the solution to the "LLMs make stuff up" problem.
Not to mention they're faster and cheaper to run.
The only truly practical use case I can think of is summarizing articles or writing them, which makes more sense as a word processor or browser add-on.
[+] [-] agentultra|3 years ago|reply
People who want to get rich will tell you it's the next greatest thing that will revolutionize the industry.
Personally, I've been annoyed at how confidently wrong ChatGPT can be. Even when you point out the error and ask it to correct the mistake it comes back with an even-more-wrong answer. And it frames it like the answer is completely, 100% correct and accurate. Because it's essentially really deep auto-complete, it's designed to generate text that sounds plausible. This isn't useful in a search context when you want to find sources and truth.
I think there are useful applications for this technology, but I think we should leave that to the people who understand LLMs best and keep the charlatans out of it. LLMs are really interesting and have come a long way by leaps and bounds... but I don't see how replacing entire institutions and processes with something that is only well understood by a handful of people is a great idea. It's like watering plants with Gatorade.
[+] [-] rvz|3 years ago|reply
Reminds me of the web3 crypto hype, and this hype is the same thing that happened with GPT-3: a closed-source, black-box AI model hidden behind a SaaS, confidently generating wrong answers as the truth.
Sounds like OpenAI (effectively Microsoft's new AI division) are selling snake-oil again.
> I think there are useful applications for this technology but I think we should leave that to the people who understand LLM's best and keep the charlatans out of it.
There are certainly charlatans, grifters and snake-oil salespeople hyping their so-called 'AI startup' when it actually just uses the GPT-3 or ChatGPT API. Another emperor-has-no-clothes confidence trick generating garbage.
Given that ChatGPT cannot transparently explain its own answers when questioned, especially when it confidently generates wrong answers, you cannot trust its output. Asking it for a sophisticated answer to an 'un-googleable' question is where you see it clearly trip up.
What would really 'revolutionize the industry' is an open-source LLM that is smaller and more transparent than ChatGPT, like what Stable Diffusion did to DALL-E 2.
[+] [-] bbor|3 years ago|reply
Also OP (not so much you) is way too caught up in the “chat” aspect - that is the first exciting UX that got popular, but these are much, much more than chatbots. Pretending that they’re human/conscious/in a conversation is fun, but having an invisible partner that knows you and tailors your search results… that’s powerful.
For example, you’ll never have to add “Reddit” again, or at least you’ll only have to tell it once. An LLM can easily identify the kind of questions where you want forum posts, read thousands of posts in a second, summarize all their content, and label each link with other information that helps you decide which threads to read in full.
I can’t wait!
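The "never have to add Reddit again" idea could be sketched as a tiny query-rewriting step. This is purely illustrative: a keyword stub stands in for the LLM intent classifier, and the hint list and `site:` rewrite are assumptions, not any real product's behavior.

```python
# Toy intent detector standing in for an LLM classifier.
FORUM_HINTS = {"best", "recommend", "opinions", "experiences", "worth"}

def wants_forum_posts(query: str) -> bool:
    """Guess whether the user wants forum threads rather than articles."""
    return any(word in FORUM_HINTS for word in query.lower().split())

def rewrite(query: str) -> str:
    """Append a site filter once forum intent is detected, so the user
    never has to type it themselves."""
    return f"{query} site:reddit.com" if wants_forum_posts(query) else query

print(rewrite("best budget mechanical keyboard"))  # gets the site filter
print(rewrite("python sort a list"))               # left untouched
```

A real version would also do the summarize-and-label step per result, which the stub obviously skips.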
[+] [-] marban|3 years ago|reply
The nice way of saying grifters.
[+] [-] shanebellone|3 years ago|reply
This is the most accurate description I've ever read.
[+] [-] 725686|3 years ago|reply
That also happens with real people, so... A web search also returns wrong answers, because it is not magic. It just searches through all the garbage out there. You just have to be aware of its flaws and limitations... as you do with your fellow humans.
[+] [-] visarga|3 years ago|reply
Ideally Google search would have a flag to "follow my intent to the letter" and return empty if nothing is found. When you are searching for a specific thing, a response with other things feels like bullshit, Google trying to milk more clicks wasting your time. I don't mean exact phrase search, I mean exact semantic search.
This causes issues when searching for bug fixes, where it ignores the exact version; when shopping, where it ignores some of your filters and responds with bad product suggestions; and when searching for something specific that looks like some other popular keyword, where it gives you the popular one, as if it has an exclusion zone and you cannot search for anything else around it.
"Minimum weight of a scooter with back suspension" -> matches information about carrying capacity. Of course more people discuss max passenger weight than minimum scooter weight, but I really don't care about the other one.
[+] [-] kurthr|3 years ago|reply
That makes the response more current than the last LLM crawl & tune, allows a further directed search based on the new parameters, and provides a traceable path to sources and citations. If it finds that there's no Python library with that name, then iterate.
Basically, an LLM should have a very good idea of what good search terms are for a topic, and where to find the information, whereas I might not know the acronyms, jargon, related fields, or optimal answers.
Yes, this is getting pretty close to writing an 8th grade class report that covers anything on the web. That's about where these seem to be.
*ps I copied this from a post I made yesterday*
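The loop described here (LLM proposes search terms, a normal search runs them, iterate when nothing verifies) might look roughly like the sketch below. Everything in it is a stand-in: `propose_terms`, `web_search`, and `verify` are hypothetical callables, and "frobnicate" is a made-up library name.

```python
def search_with_llm(question, propose_terms, web_search, verify, max_rounds=3):
    """Iteratively let an LLM refine search terms until results check out."""
    feedback = ""
    for _ in range(max_rounds):
        terms = propose_terms(question, feedback)
        results = web_search(terms)
        if verify(results):
            return results  # traceable: actual links, not generated text
        feedback = f"nothing verifiable for: {terms}"  # e.g. no such library
    return []

# Toy stand-ins so the sketch runs end to end.
propose = lambda q, fb: "pypi frobnicate" if not fb else "python frobnicate alternative"
fake_web = {"python frobnicate alternative": ["https://example.org/alt"]}
hits = search_with_llm("is there a library called frobnicate?", propose,
                       lambda t: fake_web.get(t, []), bool)
```

The point of the structure is that the final answer is a list of retrieved links, which keeps the path to sources traceable.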
[+] [-] syats|3 years ago|reply
Among other things, an LLM can be seen as a store which you query and get results from. A chatbot is cute because it formats output text to look like conversation, and the recent applications are nice because the query (now known as a prompt) can be complicated and long, and can influence the format and length of the results.
But the cool stuff is being able to link the relatively small amount of text you input as a query, into many other chunks of texts that are semantically similar (waves hands around like helicopter blades). So, an LLM is a sort of "knowledge" store, that can be used for expanding queries, and search results, to make it more likely that a good result seems similar to the input query.
What do I mean by similar? Well, the first iteration of this idea is vector similarity (e.g. https://github.com/facebookresearch/DPR). The second iteration is to store the results in the model itself, so that the search operation is performed by the model itself.
This second iteration will lead, IMHO, to a different sort of search engine. Not one over "all the pages", as (in theory at least) Google and the like currently work. Instead, it will be restricted to the "well-learnt pages": those which, because of volume of repetition, structure of text, or just availability to the training algorithm, get picked up and encoded into the weights.
To make an analogy, it's like asking a human who the Knights of the Round Table are and getting back the usual "Percival, Lancelot and Galahad", but only because the other thousand knights mentioned in some works are not popular enough for that given human to know them.
This is a different sort of search engine than we are used to, one which might be more useful for many (most?) applications. The biases and dangers of it are things we are only starting to imagine.
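The "first iteration" above, plain vector similarity, reduces to cosine scores between a query embedding and passage embeddings. The vectors below are toy stand-ins; a real system would get them from a trained bi-encoder such as DPR.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def rank(query_vec, passage_vecs):
    """Return passage indices sorted by similarity to the query, best first."""
    scores = [cosine(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: -scores[i])

query = [1.0, 0.2, 0.0]            # stand-in for an encoded query
passages = [
    [0.9, 0.1, 0.1],               # semantically close passage
    [0.0, 1.0, 0.9],               # unrelated passage
]
order = rank(query, passages)      # closest passage first
```

The "second iteration" has no such tidy sketch: the ranking happens inside the model's weights rather than in an explicit score loop, which is exactly why it is restricted to what the weights learnt.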
[+] [-] basch|3 years ago|reply
First, the human input is extremely flexible and can include instructions. It is natural-language programming.
Second, the "conversation" has state. I can give an instruction, and then a follow-up instruction that adds to the first. Someday down the road there will be two states: your account state (instructions you taught it that it retains as long as you are logged in; maybe my account could have multiple state buckets/buildings I can enter, each with its own set of rules, which you could call programs or routines: "computer, execute study routine") and temporary state (instructions it retains only for the duration of the conversation/search).
The exciting part here is being able to query data and manipulate it in memory: making a search, refining the search, redirecting the search in a different direction when it's not working. That collaborative, iterative type of search doesn't really exist at the moment. I can't tell Google "the results you just returned are garbage, here is why, try again."
It is more like a fuzzy command line. The chatbot-ness is just a layer of cute on top that isn't completely necessary.
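A minimal sketch of those two state layers: account state persists across sessions, temporary state lasts one conversation, and both get folded into the prompt. The function name and prompt format are made up for illustration.

```python
def build_prompt(account_state, session_state, user_message):
    """Compose instructions from both layers; account-level rules come first,
    session-level rules can refine or override them."""
    lines = [f"Instruction: {rule}" for rule in (*account_state, *session_state)]
    lines.append(f"User: {user_message}")
    return "\n".join(lines)

account_state = ["Always prefer metric units."]   # kept while logged in
session_state = ["Answer tersely for now."]       # dropped when the chat ends
prompt = build_prompt(account_state, session_state, "How tall is Everest?")
```

Swapping `session_state` per conversation while keeping `account_state` fixed is all the "two buckets" idea amounts to at this level.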
[+] [-] softwaredoug|3 years ago|reply
However, it's been observed that people are using chatbots for informational searches: the kinds of searches where you want to learn about a specific fact. This isn't all searches, but it's an important subset of web search. For better or worse, people perceive, probably rightly, that this is how people will seek information, even with a high degree of inaccuracy.
There are also the generative use cases: "write me a program that does X". Is this something people would use a search bar for? We don't know, and won't know, until it's out there for a while.
For the longest time, the one natural-language interface was the search bar. So search vendors surmise it's important both to defend their turf and to give regular users a natural way to become familiar with this kind of informational interaction...
[+] [-] gremlinsinc|3 years ago|reply
Instead it was all good in 20 minutes.
Personally, I think Microsoft has a HUGE opportunity here: imagine if the OpenAI PRO account were actually free, but only in the Edge browser or on bing.com. It'd be a loss leader for a while, because a lot of compute is required, but it'd also help unseat Google. Google's only chance is if LaMDA/Bard is better, and so much better that it's noticeable from day one.
Their search results are so horrible lately I don't know anybody who isn't 'shopping around' right now for something better. Bing, DDG, Brave, Neeva, You.com, etc.
[+] [-] trieste92|3 years ago|reply
Oh God. I seriously, seriously hope not. I can imagine some coasters I work with shitting out source code this way.
Anyone who trusts an LLM and commits its output to a code base that other people work on... I would never want to work with someone with that attitude. Physically writing code takes far less time than understanding code.
[+] [-] ergonaught|3 years ago|reply
1) Integration with voice assistants. Links/sources are irrelevant.
2) Models tuned against a particular body of work don't care if links go stale, or websites get Slashdotted/HackerNews-hugged/DDoSed, or etc. Links/sources are irrelevant.
3) Inbound "service requests" processed by something that can better understand the question and the available answers/solutions. Links don't matter much.
4) When "Okay what are some good websites to read more about this?" can be answered, too, bang.
5) Ever asked somebody a question and just rolled with their answer instead of demanding citations? I mean, you're doing it here. So, again, yes.
[+] [-] verdverm|3 years ago|reply
ChatGPT will already do this: when you get an answer you want to learn more about, ask
"where can I learn more about this?"
"can you provide links?"
there are a number of prompts that work, you can even provide it with an AND clause in the first prompt
[+] [-] anonyfox|3 years ago|reply
Another point is that I am either creative or productive at any given time, but never both... though I'm at least aware of which state I'm in. ChatGPT has proven surprisingly good at taking over the other part. I.e.:
- when I am in a productive mood and stumble upon a thinking problem, generative AI is like on-the-spot creativity for "good enough" solutions, like naming a programming thingy or writing some filler text around a few keywords instead of me looking for words.
- when I am in a creative mindset, I increasingly feed code snippets into the bot and ask questions to "fill in the gaps": writing a specific function using library X, then writing documentation explaining how it works, then writing some unit tests. Sometimes I even derail a bit and let the bot explain parts that stand out in some way, so I can maybe learn a trick.
... And I've already used ChatGPT in kinda-emergency situations, like when I knew 5 minutes in advance that I had to speak in front of a crowd/in a meeting, and it gave me extremely useful outlines to adapt quickly, even in a panicked state of mind. The structure it provided sounded okay-ish and calmed me down, and for that it doesn't really matter if the response is right or wrong.
[+] [-] PaulHoule|3 years ago|reply
Generating bullshit text off the cuff is not the only use of LLMs. LLMs can perform very well at classification, regression, ranking, coloring proper names red, and other tasks. You could, for instance, use an LLM to encode a query and documents and rank them with a siamese network, something not too different from how a conventional search engine works.
If there is one thing wrong with the current crop of LLMs, it is that they can only attend over a limited number of tokens. BERT can attend over 512 tokens, ChatGPT over 4096, where a token is shorter than a word on average. It's easy to process the headline of an HN submission with BERT, but I classify a few hundred abstracts of scientific papers a day. A long abstract is about 1800 words, which is too much for Longformer but would fit in ChatGPT if there aren't too many $10 words.
Unless you can recast a problem as "does this document have a short window in it with this attribute?" (maybe "did the patient die?" in a clinical case report or "who won the game?" in a sports article) there is no way to cut a document up into pieces and feed it into an LLM, then combine the output vectors in a way that doesn't break the "magic" behavior the LLM was trained to do.
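The window recast in the last paragraph can be sketched directly: slice the document into overlapping windows, run the per-window judgment, and OR the results. The toy classifier below stands in for an LLM call, and the tiny window size is only for illustration.

```python
def windows(tokens, size, stride):
    """Yield overlapping token windows, always covering the document's tail."""
    for start in range(0, len(tokens), stride):
        yield tokens[start:start + size]
        if start + size >= len(tokens):
            break

def document_has_attribute(tokens, window_judge, size=512, stride=256):
    """OR-combine per-window judgments. This only preserves the model's
    behavior when the attribute is detectable within one short window."""
    return any(window_judge(w) for w in windows(tokens, size, stride))

# Toy stand-in for an LLM asked "did the patient die?" about each window.
judge = lambda window: "died" in window
report = "patient admitted with fever stable for days then died overnight".split()
flag = document_has_attribute(report, judge, size=4, stride=2)
```

For attributes that depend on the whole document at once, no such per-window combination of outputs recovers what the model was trained to do, which is the point being made.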
You'd imagine ChatGPT would produce accurate results if you could tell it "Write a review of topic T with citations", but if you try that you'll find it will write citations that look real but don't actually exist if you look them up. You'd imagine at minimum that such a system would have to read the papers it cites, maybe being able to attend over all of them at the same time, which would take an attention window 100-1000x larger.
That's by no means cheap and it might be Kryptonite for Google in that Google's model involves indexing a huge amount of low quality content and financing it by ads that are a penny a click. A business or individual might get a larger amount of value per query out of a much smaller task-oriented document set.
[+] [-] h2odragon|3 years ago|reply
When you're hunting for a particular fact, like "that bit of code I half remember seeing on a page 15 years ago", I don't see anything for an LLM to add. Google had a pretty good index for that purpose about 15 years ago, but they've chosen to prioritize other goals since then. I dunno if anyone treats "find the things you're searching for" as a market now.
Which is an answer to your question: Does it matter if an LLM helps search the web? That's not what people are doing, that's not what these companies are selling.
[+] [-] fnordpiglet|3 years ago|reply
LLMs have a chance to offer an oracle that doesn't answer in cryptic or evasive ways, but just attempts to give an answer. The hallucinations are a huge flaw, but one that I'm confident will be addressed with other, non-LLM AI approaches. But it's the right user interface for answering questions: it answers questions with answers, not a pile of potentially relevant documents to sort through. The augmentation with citations, especially if they're semantically rather than symbolically relevant, is a huge plus.
[+] [-] fauigerzigerk|3 years ago|reply
https://you.com/search?q=youchat&tbm=youchat
[+] [-] remarkEon|3 years ago|reply
I don't know if that strictly complies with your definition of "web search application". It's definitely going to save time for me, and not seeing a bunch of ads during the process is wonderful - to the point that I really could see myself paying for it if they decide to go that route and take away the "free" version.
[+] [-] JohnFen|3 years ago|reply
When I think "search engine", I think of something that searches for web pages. What I want as a result isn't an answer, meal plan, grocery list, or any of that higher-order stuff. I want a list of websites.
This reinforces my fear that this sort of approach might kill actual search engines. That would be a pretty serious loss to me, and a further reduction in the usefulness of the web. But saying that is not disparaging things like ChatGPT at all. They address a different, and valid, need.
I just hope we don't lose an important tool in the process.
[+] [-] thesuperbigfrog|3 years ago|reply
The basic idea was to have enough metadata about web sites so that you could get programs to do something approaching Prolog-style reasoning about the content and meaning of the web pages.
With more advanced LLMs, it looks like a slightly different approach to achieve something like the semantic web idea.
I think the idea is to constantly feed the model with updates from crawling the web and have the LLM "digest" the content, apply some filters to remove bad stuff, and then provide a meaningful result to whatever queries it might be asked.
[+] [-] PaulHoule|3 years ago|reply
The intellectual north star of this is
https://en.wikipedia.org/wiki/SAT_solver
and might be a better model than the "semantic web", in that the semantic web has revolved around description logics like OWL, which implement a limited subset of first-order logic that is always decidable. One trouble is that OWL-based systems can't handle arithmetic at all: given a height and a weight, you can write a rule that says anybody over 6 feet is tall, but you can't say a person with a BMI over 30 is obese, because OWL can't add, multiply and divide. Something a little more expressive, which searches efficiently in practice because it gets "hunches" from a neural net, could be useful, but it will not evade the limits on computation and logic that Gödel, Tarski and Turing warned you about.
[+] [-] dboreham|3 years ago|reply
That idea was the basis for the current generation of Google search (see: Metaweb).
[+] [-] louthy|3 years ago|reply
The issue I see with the chat approach is trust. I've seen so many examples of these models just making shit up now that I reckon regular use of them will eventually lead to mistrust between the human and the chat-bot. If you can't trust the answers and have to go and check yourself, it's dead as an idea IMHO.
[+] [-] stevenbedrick|3 years ago|reply
> Search systems, like many other applications of machine learning, have become increasingly complex and opaque. The notions of relevance, usefulness, and trustworthiness with respect to information were already overloaded and often difficult to articulate, study, or implement. Newly surfaced proposals that aim to use large language models to generate relevant information for a user’s needs pose even greater threat to transparency, provenance, and user interactions in a search system. In this perspective paper we revisit the problem of search in the larger context of information seeking and argue that removing or reducing interactions in an effort to retrieve presumably more relevant information can be detrimental to many fundamental aspects of search, including information verification, information literacy, and serendipity. In addition to providing suggestions for counteracting some of the potential problems posed by such models, we present a vision for search systems that are intelligent and effective, while also providing greater transparency and accountability.
Shah & Bender (2022) "Situating Search". In Proc. CHIIR '22 https://dl.acm.org/doi/abs/10.1145/3498366.3505816
[+] [-] ESMirro|3 years ago|reply
- The Metaverse was going to change the internet.
- The Internet of things was going to revolutionise our homes and cities.
- Self-driving cars would change the motor industry.
- Crypto is the next big thing in finance.
- NFTs are going to revolutionise digital ownership.
Most of those technologies have introduced niche or useful applications, but the ridiculous, breathless hype about how this technology is the one to change everything is getting more predictable and increasingly frustrating.
[+] [-] esperent|3 years ago|reply
> every single "AI" search I have tried has given me incorrect, invented answers.
Really? Every single one? Not a single AI (ChatGPT, presumably) query you've tried has given you anything except incorrect information? What kind of things have you been querying for?
[+] [-] ankit219|3 years ago|reply
For a quick overview answer, LLMs are great. It's not 100% correct, but mostly it is, and that is good enough for a quick answer. Currently Google tries to show that, and people object that it is stealing traffic from websites. I just need an answer, a coherent, usable one. E.g.: "What were the movies Scorsese got an Oscar nomination for?"
For suggestions, LLMs are just one more of those blogs and listicles that already show up in search (if the LLM is up to date, that is). The difference is that an LLM would customize the answer to the query, unlike pre-written content. So, yes, useful. Same goes for stuff like "how to build an email list?" or "What is an effective sales strategy?"
For research, Google is more useful. I think we have all done that.
Another application, not realized until now because we never did it before, is the ability to ask follow-up questions (which a chat format enables well). Suppose you get an overview of how a quantum computer works; it would take a lot of effort to ask a follow-up question and get a direct answer via a search engine. E.g.: "Why is there no point in going beyond a thousand qubits?"
There could be modifications like voice-to-text (a Jarvis-like interface), or a personal-assistant thingy. But those are far-fetched.
It will help immensely, and for places it does not, we will still google like we have done before.
[+] [-] graypegg|3 years ago|reply
While what they’re doing currently isn’t perfect, it does provide results that are at least traceable. I could imagine an alternate universe where they doubled down on marketing themselves as “the search engine that doesn’t lie to you” or “where answers are found, not stories”.
[+] [-] danso|3 years ago|reply
On the other hand, I do agree with people speculating that LLM-AI interfaces will seriously hurt Google's bottom line, e.g. reducing the space for search ads, which represent the majority of its revenue.
[0] https://www.nytimes.com/2022/09/16/technology/gen-z-tiktok-s...
[+] [-] busyant|3 years ago|reply
Mostly this is just to calm me down because ChatGPT gives me the illusion that I'm interacting with a human. The current voice systems are infuriatingly bad.
It would be nice if CVS's phone system would actually listen to me and modify its output accordingly. "I already gave you my birth date. And NO, I don't need a COVID booster."
edit: I'd like to meet the person who sold CVS its prescription web-site and its voice system. Simply to marvel at them and the swindle they pulled off, delivering absolute trash and probably walking away with a king's ransom.
[+] [-] ramesh31|3 years ago|reply
But at least they're deterministic and finite. I imagine ChatGPT-like results from a phone system would be even worse. LLMs are incredible when you can massage the answer you want out of them in a feedback loop, but not so much for automated systems.