item 46719611

madcaptenor | 1 month ago

I would imagine links can be hallucinated because the original URLs in the training data get broken up into tokens, so it's not hard for the model to come up with a URL that has the right format (say https://arxiv.org/abs/2512.01234, which looks like a real arXiv paper URL but is one I just made up) and a plausible-sounding title.
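That "right format, made-up content" failure mode can be sketched without any language model at all. This toy snippet (my own illustration, not anything from the thread, and not how an LLM actually generates text) just stitches together fragments that match arXiv's new-style identifier pattern, producing links that pass a format check but need not correspond to any real paper:

```python
import random
import re

def plausible_arxiv_url(rng: random.Random) -> str:
    """Assemble an arXiv-style URL from format fragments alone."""
    yymm = f"{rng.randint(20, 25):02d}{rng.randint(1, 12):02d}"  # plausible year/month
    number = f"{rng.randint(0, 99999):05d}"                      # arbitrary 5-digit id
    return f"https://arxiv.org/abs/{yymm}.{number}"

url = plausible_arxiv_url(random.Random(0))
# The URL matches arXiv's new-style identifier format...
assert re.fullmatch(r"https://arxiv\.org/abs/\d{4}\.\d{5}", url)
# ...but nothing above ever checked that a paper with that id exists.
print(url)
```

The point of the sketch: format validity is cheap to produce from surface patterns alone, which is exactly why a fluent-looking URL is no evidence the target page exists.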

jjj123 | 1 month ago

Yeah, but the current state of ChatGPT doesn’t really do this. The comment you’re replying to explains why URLs from ChatGPT generally aren’t constructed from raw tokens.

madcaptenor | 1 month ago

You are absolutely right! The current state of ChatGPT was not in my training data.

1718627440 | 1 month ago

How do you explain it, then, when it spits out a link whose URL surprisingly seems to contain the subject of your question, but that page simply doesn't exist and there isn't even a blog at that domain at all?