ian0|2 years ago
Perhaps a dumb question, but why do we store embeddings as "vectors" and not "points"? I thought the difference was that vectors have magnitude, but an embedding doesn't have a magnitude - they are just points in an n-dimensional space?
tobinfricke|2 years ago
The difference is that mathematical vectors support some additional operations, such as addition and scalar multiplication.
A vector in C++ is not a mathematical vector, since we can't add two vectors x+y, nor can we perform scalar multiplication a*v.
For mathematical vectors we have this interpretation: An abstract vector is constructed by multiplying the numbers in the ordered tuple each by a corresponding abstract "basis vector" and adding up the results. The numbers are just the "coordinates" of a vector with respect to a particular basis.
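As a rough illustration of that construction, here is a minimal Python sketch. The helper name `from_coordinates` and both example bases are made up for this example; the point is only that two different coordinate tuples can describe the same vector once each is paired with its own basis.

```python
def from_coordinates(coords, basis):
    """Return sum_i coords[i] * basis[i], with each basis vector given as a list."""
    dim = len(basis[0])
    result = [0.0] * dim
    for c, b in zip(coords, basis):
        for k in range(dim):
            result[k] += c * b[k]
    return result


# Two hypothetical bases for the same 2-dimensional space.
standard_basis = [[1.0, 0.0], [0.0, 1.0]]
skewed_basis = [[1.0, 1.0], [-1.0, 1.0]]

# Different coordinate tuples, one per basis...
v_from_standard = from_coordinates([3.0, 1.0], standard_basis)
v_from_skewed = from_coordinates([2.0, -1.0], skewed_basis)

# ...that both build the same underlying vector.
print(v_from_standard, v_from_skewed)
```

The tuples (3, 1) and (2, -1) differ, but they are coordinates of one vector with respect to two bases.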
It may or may not make sense to talk about "basis vectors" in your application.
Does it make sense to perform "coordinate transformations" on your objects?
Another test is, "Do you want to use linear algebra?" If so, your objects are probably vectors.
A similar but more egregious argument comes about with regard to tensors. Mathematically a tensor is a kind of function that takes vectors and co-vectors as arguments. A matrix, when coupled with the rules for multiplying matrices by row and column vectors, is a tensor.
But an arbitrary n-dimensional array of numbers is not (necessarily) a tensor in the mathematical sense. Unfortunately that term was co-opted by the ML crowd because it sounds cool. :-)
Getting back to what you mentioned about vectors having magnitude - in an abstract vector space, there is no definition of magnitude. It's not until you define an inner product that magnitude becomes defined. In this sense the grade-school definition of a vector as "a quantity with a magnitude and direction" does not necessarily comport with the standard definition of a mathematical vector space.
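To make that concrete, here is a small Python sketch, assuming a family of weighted inner products on plain lists of floats. The function names and the weights are illustrative choices, not anything standard. The same tuple gets a different magnitude depending on which inner product you pick.

```python
import math


def inner(u, v, weights):
    """Weighted inner product sum_i weights[i] * u[i] * v[i] (weights must be positive)."""
    return sum(w * a * b for w, a, b in zip(weights, u, v))


def magnitude(v, weights):
    """Magnitude induced by the chosen inner product: sqrt(<v, v>)."""
    return math.sqrt(inner(v, v, weights))


v = [3.0, 4.0]

# The usual Euclidean inner product (all weights equal to 1).
euclidean_length = magnitude(v, [1.0, 1.0])

# A different, equally valid inner product gives a different magnitude.
weighted_length = magnitude(v, [2.0, 1.125])

print(euclidean_length, weighted_length)
```

Nothing about the tuple (3, 4) itself says whether its length is 5 or 6; that comes from the inner product.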
We like to say that "a tensor is an object that transforms like a tensor" and the same is true for vectors. "A vector is an object that transforms like a vector" under coordinate transformations, while also supporting addition and scalar multiplication.
To address your question more directly: typically "points" (insofar as they are relative to a coordinate system) really are "vectors". But general tuples of numbers are not necessarily vectors.
dexwiz|2 years ago
im3w1l|2 years ago
Given a coordinate system, a vector can be represented by a tuple of numbers, just like a point can be represented by a tuple of numbers. The point p and its position vector (i.e. the vector from the origin to p) will have the same tuple of numbers. The magnitude of the vector corresponds to the distance from p to the origin. So points or vectors, well, it's just a choice of words without a material difference.
If you use the word vectors then you do kind of sort of imply that you could do the vector operations, scalar multiplication and vector addition, and get something semi-useful out. This is indeed sometimes done with embeddings. But most of the time it's in the form of an affine combination (a weighted sum where the weights sum to 1), which is something you can do for points too.
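A minimal sketch of why an affine combination works for points, assuming embeddings stored as plain Python lists (the helper names and numbers are hypothetical). Moving the origin moves the result by the same amount, so the answer does not depend on where the origin is; a plain sum does not have that property.

```python
def affine_combination(points, weights):
    """Weighted sum of points; only meaningful when the weights sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    dim = len(points[0])
    return [sum(w * p[k] for w, p in zip(weights, points)) for k in range(dim)]


def shift(point, offset):
    """Re-express a point after moving the origin by -offset."""
    return [a + b for a, b in zip(point, offset)]


a = [1.0, 2.0]
b = [3.0, 6.0]
offset = [10.0, 10.0]  # an arbitrary change of origin

midpoint = affine_combination([a, b], [0.5, 0.5])
midpoint_shifted = affine_combination([shift(a, offset), shift(b, offset)], [0.5, 0.5])

# The affine combination follows the points around: same midpoint, just shifted.
print(midpoint, midpoint_shifted, shift(midpoint, offset))

# A plain sum picks up the offset twice, so it depends on the choice of origin.
plain_sum = [x + y for x, y in zip(a, b)]
plain_sum_shifted = [x + y for x, y in zip(shift(a, offset), shift(b, offset))]
print(plain_sum, plain_sum_shifted)
```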
Edited a bit
chriswarbo|2 years ago
The concept that we call a "point" gives us an (n-dimensional) "affine space". Affine spaces don't require any sort of coordinate system, axes, origin, etc. which makes them quite general. For example, consider the sleepers of a railway track, or the crossing-points of a chain-link fence, or the hands on a clock face, or the electrical potential at various positions around a circuit, or a date, or the temperatures of various objects, etc.
It makes no sense to "add" or "multiply" points; but we can find the difference between two points. The result will actually be a vector (in the examples above: a distance and direction along the track; a distance and direction along the 2D plane of the fence; an angle; a voltage; a duration; and a temperature difference). We can add such vectors to our points; if we add the vector (pointA - pointB) to pointB, we get pointA! This relationship between points and vectors leads to the concept of a "torsor" https://math.ucr.edu/home/baez/torsors.html
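Python's standard `datetime` module happens to model the date example this way, so it makes a handy sketch: `date` plays the role of a point and `timedelta` the role of a vector. The specific dates below are arbitrary.

```python
from datetime import date, timedelta

point_a = date(2024, 3, 1)
point_b = date(2024, 1, 1)

# Point minus point gives a vector (a duration).
displacement = point_a - point_b
print(type(displacement).__name__, displacement.days)

# Adding (pointA - pointB) to pointB gets us back to pointA.
recovered = point_b + displacement
print(recovered == point_a)

# Vectors can be scaled and added to each other.
doubled = displacement * 2
combined = displacement + timedelta(days=1)

# Points cannot be added: date + date is rejected with a TypeError.
try:
    point_a + point_b
    points_can_be_added = True
except TypeError:
    points_can_be_added = False
print(points_can_be_added)
```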
Vectors live in an (n-dimensional) "vector space", which requires more concepts than an affine space; e.g. a notion of "zero", a notion of "size", a notion of "direction", etc. This is less general, but lets us do operations like adding and scaling vectors, as well as the various notions of multiplication defined here.
Some vector spaces arise naturally, e.g. taking the angle between clock hands gives us a natural zero (the difference between identical positions) and a natural size (a full turn), although whether positive/negative indicates clockwise/anticlockwise is still arbitrary. Other times we will "impose" some arbitrary coordinate system on an affine space, since vector operations are so useful; often ignoring the space's affine nature entirely! That way we can treat "point" interchangeably with "vector from the origin"; even though most of the fancy things we want to do are only defined for the latter (e.g. taking dot-products, comparing cosine similarity, etc.)
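A small sketch of how much the imposed origin matters, using plain Python lists and made-up numbers. The difference between two points is the same whatever origin we choose, but cosine similarity of the "vectors from the origin" changes when the origin moves.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def shift(point, offset):
    """Coordinates of the same point after moving the origin by -offset."""
    return [a + b for a, b in zip(point, offset)]


def difference(p, q):
    return [a - b for a, b in zip(p, q)]


p = [1.0, 0.0]
q = [0.0, 1.0]
offset = [1.0, 1.0]  # a hypothetical alternative choice of origin

p_shifted, q_shifted = shift(p, offset), shift(q, offset)

# The affine operation (point - point) does not care about the origin.
print(difference(p, q), difference(p_shifted, q_shifted))

# The vector operation does: orthogonal before, roughly 0.8 after.
similarity_before = cosine_similarity(p, q)
similarity_after = cosine_similarity(p_shifted, q_shifted)
print(similarity_before, similarity_after)
```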
For example, here are some arbitrary coordinates that we impose on affine spaces every day:
- The top of a clock is 12:00
- The Greenwich meridian
- Grounding/earthing electrical circuits
- Celsius and Fahrenheit
- 0AD, New Year's Day, the Unix epoch
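The Celsius/Fahrenheit case can be sketched in a few lines of Python (the function names are made up here). A temperature is a point, so converting it involves the arbitrary zero offset; a temperature difference is a vector, so it only scales.

```python
def celsius_to_fahrenheit(temp_c):
    """Convert a temperature (a point): scale, then move the arbitrary zero."""
    return temp_c * 9.0 / 5.0 + 32.0


def celsius_delta_to_fahrenheit(delta_c):
    """Convert a temperature difference (a vector): scale only, no offset."""
    return delta_c * 9.0 / 5.0


warm_c, cool_c = 30.0, 10.0

# Convert both points, then subtract.
difference_f = celsius_to_fahrenheit(warm_c) - celsius_to_fahrenheit(cool_c)

# Subtract first, then convert the difference as a vector. Same answer.
delta_f = celsius_delta_to_fahrenheit(warm_c - cool_c)

# Treating the difference as if it were a point adds a spurious 32.
mistaken_f = celsius_to_fahrenheit(warm_c - cool_c)

print(difference_f, delta_f, mistaken_f)
```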