pharke | 3 years ago

It's very much in their interest: if the information their models provide is impossible to verify, that severely limits its uses. You essentially can't use it as a source for anything that requires any kind of citation or reliability. That's a huge handicap when selling it to businesses and researchers. Determining which training data contributed to a given output is an open problem in ML, and one that is being very actively worked on, since solving it would greatly advance the field.

You believe correctly that ChatGPT is not capable of showing sources. It's currently impossible, but we were discussing tomorrow, so I included it as a possibility. You could potentially hack it in now using traditional search or nearest neighbours, but it wouldn't be 100% accurate, probably not even 50%. It would just show a bag of similar texts, so it's not really worth doing.
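To make the nearest-neighbours hack concrete, here is a minimal sketch, assuming a tiny in-memory corpus and plain bag-of-words cosine similarity. The names `bag_of_words`, `cosine` and `nearest_texts` are made up for illustration and are not part of any real API. It only ranks texts by word overlap with the output, which is exactly why it gives you similar texts rather than actual sources.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_texts(output, corpus, k=2):
    """Return the k corpus texts most similar to the model output.

    Assumption: similarity of wording stands in for provenance,
    which it does not really do; this finds look-alikes, not sources.
    """
    query = bag_of_words(output)
    scored = [(cosine(query, bag_of_words(doc)), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Calling `nearest_texts(model_output, corpus)` returns `(score, text)` pairs, highest score first. A real attempt would presumably swap the word counts for learned embeddings, but the limitation stays the same.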

I'd still be in the market for a book even if we had a perfect LLM that could answer every question I had with impeccable accuracy. I read books because I want to find out about things I don't know that I don't know. It's pretty hard to find those things if you only do question and response. It's like a graph: if you start at one node, it may take you a very long time to traverse the graph to another node, but if some outside source gives you the address of a new node, you can just jump straight to it.
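A toy version of the graph analogy, as a sketch only: the chain graph and the `hops_between` helper are hypothetical, chosen to show how far apart two nodes can be when you can only follow edges from where you already are.

```python
from collections import deque


def hops_between(graph, start, goal):
    """Breadth-first search: fewest edges from start to goal, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical graph: a chain 0 -> 1 -> ... -> 9, one question leading to the next.
chain = {i: [i + 1] for i in range(9)}

walked = hops_between(chain, 0, 9)   # following edges from node 0
jumped = hops_between(chain, 9, 9)   # a book hands you node 9's address directly
```

Here `walked` is 9 and `jumped` is 0. In a graph the size of everything you don't know, that gap is the reason the outside pointer matters.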
