
taude | 1 month ago

How is it hallucinating links? The links are direct links to the webpages that they vectorized or whatever as input to the LLM query. In fact, in almost all LLM responses from DuckDuckGo and Google, the links are right there as cited sources that you can click on (I know because I'm almost always clicking on the source link to read the original details, not a made-up one).
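
To make that flow concrete, here's a minimal sketch in Python of a search-grounded answer; every function is a hypothetical stub, not DuckDuckGo's or Google's actual pipeline. The point is just that the cited URLs are carried through from the retrieval step rather than generated by the model.

    from typing import TypedDict

    class SearchResult(TypedDict):
        url: str
        text: str

    def web_search(query: str) -> list[SearchResult]:
        # Hypothetical stand-in for the search backend; it returns real pages with real URLs.
        return [{"url": "https://example.com/some-page", "text": "page contents..."}]

    def ask_llm(question: str, snippets: list[str]) -> str:
        # Hypothetical stand-in for the model call; it only writes prose from the retrieved text.
        return f"Answer to {question!r}, written from {len(snippets)} retrieved page(s)."

    def answer_with_citations(question: str) -> dict:
        results = web_search(question)
        answer = ask_llm(question, [r["text"] for r in results])
        # The source links are copied verbatim from the search results,
        # not generated token by token, which is why they resolve to real pages.
        return {"answer": answer, "sources": [r["url"] for r in results]}

    print(answer_with_citations("how does search grounding work?"))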

madcaptenor|1 month ago

I would imagine links can be hallucinated because the original URLs in the training data get broken up into tokens, so it's not hard for the model to come up with a URL that has the right format (say https://arxiv.org/abs/2512.01234, which looks like a real paper but is a URL I just made up) and a plausible-sounding title.
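
If you want to see the tokenization directly, here's a quick check with OpenAI's tiktoken library (the exact token boundaries depend on which tokenizer you pick; the point is just that the URL isn't stored as one atomic piece):

    import tiktoken  # pip install tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    url = "https://arxiv.org/abs/2512.01234"

    # The URL comes back as several short BPE pieces rather than a single token,
    # so a model can recombine the same kinds of pieces into a URL that never existed.
    token_ids = enc.encode(url)
    print([enc.decode([t]) for t in token_ids])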

jjj123|1 month ago

Yeah, but the current ChatGPT doesn’t really do this. The comment you’re replying to explains why: the URLs it cites generally come from the retrieved pages rather than being constructed from raw tokens.

strange_quark|1 month ago

I’ve used Claude Code to debug, and sometimes it’ll say it knows what the issue is; then, when I make it cite a source for its assertions, it will do a web search and sometimes spit out a link whose contents contradict its own claim.

One time I tried to use Gemini to figure out 1950s construction techniques so I could understand how my house was built. It made a dubious-sounding claim about the foundation, so I had it give me links and keywords so I could find some primary sources myself. I was unable to find anything to back up what it told me, and then it doubled down and told me that either I was googling wrong or what it told me was a historical “hack” that wouldn’t have been documented.

These were both recent and with the latest models, so maybe they don’t fully fabricate links, but they do hallucinate the contents frequently.

exmadscientist|1 month ago

> maybe they don’t fully fabricate links

Grok certainly will (at least as of a couple months ago). And they weren't just stale links either.