kovek|10 days ago
I think that semantically this question is too similar to the car wash one. Changing the subjects from car to elephant and from car wash to creek does not change the fact that they play the same roles in the sentence. The embeddings will be similar along those dimensions.
kovek|10 days ago
Each vector has many, many dimensions, and when we train LLMs, their internal representation spreads meaning across all of those dimensions. A simple way to visualize this is a word's vector being <1, 180, 1, 3, ...>, where each position holds a value for one dimension. In this example, say the dimensions are <gender, height in cm, kindness, social title/job, ...>. Our example LLM could then have learned a vector meaning <woman, 180 cm, 100% kind, politician, ...>. In reality the vectors undergo transformations, so no dimension is that discretely clear-cut.
In this case, elephant and car would both look semantically like vehicles: most of their embedding dimensions would be very similar.
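A minimal sketch of that similarity idea, with entirely made-up dimensions and values (real embeddings are learned, not hand-assigned): under these assumed dimensions, elephant and car score closer to each other than elephant and creek do.

```python
import numpy as np

# Hypothetical embedding dimensions: <animacy, size, can_move, made_of_water>
# All values are invented for illustration only.
elephant = np.array([1.0, 0.9, 1.0, 0.0])
car      = np.array([0.0, 0.8, 1.0, 0.0])
creek    = np.array([0.0, 0.5, 0.3, 1.0])

def cosine(a, b):
    # Cosine similarity: 1.0 means pointing the same direction.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Elephant and car overlap on size and movement, so they come out
# more similar than elephant and creek in this toy space.
print(cosine(elephant, car))
print(cosine(elephant, creek))
```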
See this article. It shows that once you train a model and assign an embedding vector to each token, you can see how it captures the relationship between king and queen the same way as between man and woman.
https://informatics.ed.ac.uk/news-events/news/news-archive/k...
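The king/queen analogy can be sketched with toy vectors. The dimensions and values below are hypothetical (a real model learns them from data), but they show the arithmetic the article describes: king − man + woman lands nearest to queen.

```python
import numpy as np

# Toy embeddings with made-up dimensions <royalty, masculinity, concreteness>.
# Values are invented for illustration, not from any trained model.
vecs = {
    "king":  np.array([0.95, 0.9, 0.1]),
    "queen": np.array([0.95, 0.1, 0.1]),
    "man":   np.array([0.10, 0.9, 0.1]),
    "woman": np.array([0.10, 0.1, 0.1]),
}

def nearest(v, vocab):
    # Return the vocabulary word whose vector has the highest
    # cosine similarity to v.
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(vocab, key=lambda w: cos(v, vocab[w]))

# Subtracting "man" removes the masculinity component; adding
# "woman" puts the feminine one back, keeping royalty intact.
target = vecs["king"] - vecs["man"] + vecs["woman"]
print(nearest(target, vecs))  # -> queen
```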