[+] [-] nerdponx|8 years ago|reply

Even if you aren't interested in "product search" per se, this is a great technical read about search over hierarchical data, with plenty of handsome graphics and without being too dense.

[+] [-] xiao_haozi|8 years ago|reply

I find the author's entire blog to be of similar quality. Even though each post is framed around a very specific context, the presentation makes it easy to generalize the concepts to other use cases. There are also some real gems in there - e.g. the Fashion-MNIST one!

[+] [-] stakecounter|8 years ago|reply

> During the inference time, we first represent user input as a vector using query encoder; then iterate over all available products and compute the metric between the query vector and each of them; finally, sort the results. Depending on the stock size, the metric computation part could take a while. Fortunately, this process can be easily parallelized.
An alternative is to precompute a search index over the item vectors. If the dataset of items is very large and you're OK with running an approximate search, you can trade a bit of recall for performance using approximate-nearest-neighbor libraries such as:
Nmslib: https://github.com/searchivarius/nmslib
Faiss (Facebook): https://github.com/facebookresearch/faiss
Annoy (Spotify): https://github.com/spotify/annoy
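For scale reference, here is the exact brute-force ranking that these libraries approximate - a minimal sketch (hypothetical function and variable names, assuming the query and item embeddings are already computed as NumPy arrays):

```python
import numpy as np

def rank_items(query_vec, item_vecs, top_k=5):
    """Exact nearest-neighbor search: score every item, then sort.

    query_vec: (d,) query embedding; item_vecs: (n, d) item embeddings.
    Cosine similarity is used as the metric.
    """
    q = query_vec / np.linalg.norm(query_vec)
    items = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    scores = items @ q                    # cosine similarity with every item
    order = np.argsort(-scores)[:top_k]   # highest similarity first
    return order, scores[order]

# Toy example: 4 items in a 3-dimensional embedding space.
items = np.array([[1.0, 0.0, 0.0],
                  [0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
idx, scores = rank_items(np.array([1.0, 0.0, 0.0]), items, top_k=2)
# items 0 and 1 are the nearest to the query
```

This is O(n·d) per query, which is exactly the cost the ANN libraries above avoid by building an index (trees, graphs, or quantized codes) ahead of time.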
[+] [-] innagadadavida|8 years ago|reply

> Why is m a string/key matching function? Why can’t we use more well-defined math function, e.g. Euclidean distance, cosine function?...

Isn't the purpose of a search index (a.k.a. an inverted index) to compute cosine similarity efficiently? Is that not possible for dense vectors in a latent space? Or am I missing something?
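A toy sketch (hypothetical code, not any particular library) of why the inverted-index trick works for classical sparse term vectors but not for dense embeddings: posting lists let the scorer visit only documents that share a nonzero term with the query, while a dense latent vector has no zero entries to skip, so every item must be scored.

```python
from collections import defaultdict

# Sparse term-weight vectors for three toy documents.
docs = {
    0: {"shoe": 0.8, "running": 0.6},
    1: {"shoe": 0.7, "leather": 0.7},
    2: {"laptop": 1.0},
}

# Inverted index: term -> list of (doc_id, weight) postings.
index = defaultdict(list)
for doc_id, vec in docs.items():
    for term, weight in vec.items():
        index[term].append((doc_id, weight))

def score(query):
    """Accumulate dot products by scanning only the posting lists of the
    query's nonzero terms; documents sharing no term are never touched."""
    scores = defaultdict(float)
    for term, q_weight in query.items():
        for doc_id, d_weight in index[term]:
            scores[doc_id] += q_weight * d_weight
    return dict(scores)

print(score({"shoe": 1.0}))  # doc 2 is never visited
```

With dense vectors every coordinate is nonzero, so every document would appear in every "posting list", and the index degenerates to a brute-force scan - which is why approximate methods are used instead.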