(no title)

stewmonger | 6 months ago

I am not familiar with what actually goes into “neural retrieval,” but it seems to me that semantic search could be implemented in a very straightforward way with no “AI” whatsoever: keep a database of words that share meanings, treat each group as a single term, and end up with fewer entries in your keyword index. I think the book “Managing Gigabytes” from the ’90s actually mentions this as a possible space-saving measure for indexes, and also mentions the possible drawback of finding stuff you weren’t looking for.

You might be able to get away with only collapsing terms in specific cases where there are no other meanings, and thus save space in your index and only help people find things better without the drawback. Can’t think of any words off the top of my head that would work this way but I’m not trying very hard.
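
To make the collapsing idea concrete, here is a minimal sketch of a keyword index that folds synonyms into one entry. Everything in it is an assumption for illustration: the `CANONICAL` table, the function names, and the sample documents are made up, and I'm not claiming the example words really have only one meaning.

```python
from collections import defaultdict

# Hypothetical synonym table: each variant maps to one canonical term.
# Illustrative only; not a claim that these words are single-sense.
CANONICAL = {"sofa": "couch", "settee": "couch"}


def normalize(term):
    """Lowercase a term and fold it into its canonical form, if it has one."""
    term = term.lower()
    return CANONICAL.get(term, term)


def build_index(docs):
    """Build an inverted index (term -> set of doc ids) over collapsed terms."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[normalize(word)].add(doc_id)
    return index


def search(index, term):
    """Look up a query term after applying the same collapsing as the index."""
    return sorted(index.get(normalize(term), set()))


docs = {1: "red sofa for sale", 2: "leather couch", 3: "garden bench"}
index = build_index(docs)
print(search(index, "settee"))
```

The index ends up with one entry for "couch" instead of separate ones for "sofa" and "couch", and a query for any of the variants lands on the same postings. The drawback from the book shows up the moment one of the collapsed words has a second meaning.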

As for the term “Vibe Retrieval,” I think I like it. It reminds me of the paper where they discuss how “bullsh•t” is a more appropriate term for what LLMs produce than “hallucinations.” Have you read that one? It’s a good one and I’ll link to it if you’re interested. Maybe “Bullsh•t Retrieval” is more appropriate, heh heh.

stewmonger | 6 months ago

Actually I think there is a database called “WordNet” that you might be able to parse through in this way. You would be looking for words that have only one sense and are a synonym of another word that also has only one sense. Then you could build a list of all the words that are this way, and it would probably be perfectly fine to pretend that they are the same word for the sake of an index.

Now I’m curious as to how many words there are like this. That should be a pretty straightforward programming project.
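
A rough sketch of what that project could look like, assuming the sense inventory has already been loaded as a list of synonym sets. The `SYNSETS` data below is a made-up toy, not actual WordNet entries, and `collapsible_groups` is a hypothetical name. For the real thing you would fill the list from WordNet itself (for example through NLTK's WordNet corpus reader) and make sure the sense counts cover every part of speech.

```python
from collections import defaultdict

# Toy stand-in for a sense inventory: each set is one sense, holding the
# words that share it. Made-up data, not real WordNet content.
SYNSETS = [
    {"aardvark", "ant_bear"},
    {"bank", "depository"},
    {"bank", "riverbank"},
    {"riverbank", "riverside"},
    {"zebra"},
]


def collapsible_groups(synsets):
    """Return groups of 2+ words that each have exactly one sense and share it."""
    # Count how many senses every word appears in.
    sense_count = defaultdict(int)
    for synset in synsets:
        for word in synset:
            sense_count[word] += 1

    groups = []
    for synset in synsets:
        # Keep only the words whose single sense is this one.
        single = sorted(w for w in synset if sense_count[w] == 1)
        if len(single) >= 2:
            groups.append(single)
    return groups


groups = collapsible_groups(SYNSETS)
print(groups)
print(sum(len(g) for g in groups), "words could be collapsed")
```

In the toy data only the first group qualifies: "depository" and "riverside" each have a single sense, but their only synonyms ("bank" and "riverbank") have two, so collapsing them would drag in unrelated meanings.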