DonaldFisk | 1 month ago
I prefer the old school
king(X) :- monarch(X), male(X).
queen(X) :- monarch(X), female(X).
queen(X) :- wife(Y, X), king(Y).
monarch(elizabeth).
female(elizabeth).
wife(philip, elizabeth).
monarch(charles).
male(charles).
wife(charles, camilla).
?- queen(camilla).
true.
?- king(charles).
true.
?- king(philip).
false.
where definitions are human-readable rules and words are symbols.
nerdponx|1 month ago
1. The W2V example is approximate, not "fuzzy" in the sense of fuzzy logic. I mean that Man, Woman, Queen, and King are all essentially just arrows pointing in different directions (in a high-dimensional space). Summing vectors is like averaging their angles. So subtracting "King - Man" is a kind of anti-average, and "King - Man + Woman" then averages that intermediate thing with "Woman", which just so happens to yield a direction very close to that of "Queen". This is, again, entirely emergent from the algorithm and the training data. It's also probably a non-representative, cherry-picked example, but other commenters have gone into detail about that, and it's not the point I'm trying to make.
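The arrow-averaging intuition can be sketched with toy vectors. The four 2-D vectors below are made up for illustration (a gender axis and a royalty axis), not real word2vec embeddings:

```python
import numpy as np

# Hand-made 2-D "embeddings": axis 0 ~ gender, axis 1 ~ royalty.
# These are illustrative stand-ins, not vectors learned by word2vec.
vecs = {
    "man":   np.array([ 1.0, 0.0]),
    "woman": np.array([-1.0, 0.0]),
    "king":  np.array([ 1.0, 1.0]),
    "queen": np.array([-1.0, 1.0]),
}

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# "King - Man + Woman": strip out the male direction, add the female one.
target = vecs["king"] - vecs["man"] + vecs["woman"]

# Nearest word by cosine similarity (real systems usually also exclude
# the three input words from the candidates).
best = max(vecs, key=lambda w: cos(target, vecs[w]))
print(best)  # queen
```

With these hand-picked axes the analogy lands exactly on "queen"; in a real high-dimensional model it only lands *near* it, which is the "approximate" point above.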
2. In addition to requiring hand-crafted rules, any old-school logic programming system has to run some kind of unification or backtracking algorithm to obtain a solution. Meanwhile, here we have vector arithmetic, which is probably one of the fastest things you can do on modern computing hardware, not to mention linear in time and space. Not a big deal in this example, but it could be quite a big deal in bigger applications.
And yes, you could have some kind of ML/AI thing emit a Prolog program or equivalent, but again, that's a totally different topic.
maxweylandt|1 month ago
Berlin - Germany + France = Paris, that sort of thing
DonaldFisk|1 month ago
Someone once told me you need humongous vectors to encode nuance, but people are good at things computers are bad at, and vice versa. I don't want nuance from computers any more than I want instant, precise floating point calculations from people.
magimas|1 month ago
And in reality you can use it in much broader applications than just words. I once threw it onto session data of an online shop, with just the visited item_ids one after another for each individual session (the session is the sentence, the item_id the word). You end up with really powerful embeddings for the items based on how users actually shop. And you can do more by adding other features into the mix. By adding "season_summer/autumn/winter/spring" into the session sentences based on when that session took place, you can then project the item_id embeddings onto those season embeddings and get a measure of which items are the most "summer-y", etc.
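Reproducing the session-as-sentence trick properly needs a real word2vec implementation (e.g. gensim's Word2Vec, fed the session lists as sentences). A rough sketch of the same intuition with a plain co-occurrence matrix, using invented item and season tokens:

```python
import numpy as np

# Toy sessions: each is a list of item ids plus a season token,
# mimicking the "session is the sentence" trick described above.
sessions = [
    ["sunscreen", "sandals", "season_summer"],
    ["sunscreen", "swimsuit", "season_summer"],
    ["scarf", "gloves", "season_winter"],
    ["gloves", "boots", "season_winter"],
]

vocab = sorted({tok for s in sessions for tok in s})
idx = {tok: i for i, tok in enumerate(vocab)}

# Co-occurrence counts: tokens sharing a session count as context.
# (word2vec learns denser vectors, but the similarity structure is similar.)
M = np.zeros((len(vocab), len(vocab)))
for s in sessions:
    for a in s:
        for b in s:
            if a != b:
                M[idx[a], idx[b]] += 1

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Project item vectors onto the season token to score "summer-ness".
summer = M[idx["season_summer"]]
print(cos(M[idx["sunscreen"]], summer))  # higher ...
print(cos(M[idx["scarf"]], summer))      # ... than this
```

Items bought alongside the summer token end up pointing in a similar direction to it, which is exactly the projection measure described above, just without the learned compression word2vec adds.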
jhbadger|1 month ago