Hacker News item 16173143

Building Cross-Lingual End-to-End Product Search with Tensorflow

41 points | kawera | 8 years ago | hanxiao.github.io

5 comments

nerdponx | 8 years ago
Even if you aren't interested in "product search" per se, this is a great technical read about search on hierarchical data, with lots of handsome graphics and without being too dense.
xiao_haozi | 8 years ago
I find the author's entire blog to be of similar quality. Even though each post is framed around a very specific context, the style of presentation makes it easy to generalize the concepts to other use cases. There are also some really great gems in there - e.g. the Fashion-MNIST post!
sarabande | 8 years ago
Agreed, I really enjoyed the technical depth and graphics. A fantastic writeup.
stakecounter | 8 years ago
> During the inference time, we first represent user input as a vector using query encoder; then iterate over all available products and compute the metric between the query vector and each of them; finally, sort the results. Depending on the stock size, the metric computation part could take a while. Fortunately, this process can be easily parallelized.

An alternative is to precompute a search index over the item vectors. If the item catalog is very large and you're OK with running an approximate search, trading a bit of recall for performance, you can use the algorithms provided by libraries like the following.

NMSLIB: https://github.com/searchivarius/nmslib

Faiss (Facebook): https://github.com/facebookresearch/faiss

Annoy (Spotify): https://github.com/spotify/annoy
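To illustrate the recall-for-speed trade-off those libraries make, here is a minimal random-hyperplane LSH sketch in pure Python. It is a toy stand-in, not how NMSLIB, Faiss, or Annoy are actually implemented; all names and parameters here are illustrative. The idea: hash each item vector to a bit signature at build time, then at query time score only the items that land in the query's bucket instead of iterating over the whole catalog.

```python
import math
import random

random.seed(0)

def random_hyperplanes(dim, n_bits):
    # each hyperplane is a random Gaussian-distributed normal vector
    return [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

def signature(vec, planes):
    # bit i records which side of hyperplane i the vector falls on;
    # nearby vectors (by angle) tend to get the same signature
    return tuple(1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
                 for plane in planes)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# build phase: bucket item vectors by signature (this is the precomputed index)
dim, n_bits = 8, 6
planes = random_hyperplanes(dim, n_bits)
items = {i: [random.gauss(0, 1) for _ in range(dim)] for i in range(1000)}
index = {}
for i, v in items.items():
    index.setdefault(signature(v, planes), []).append(i)

# query phase: rank only the candidates sharing the query's bucket,
# rather than computing the metric against all 1000 items
query = items[42]
candidates = index.get(signature(query, planes), [])
top = sorted(candidates, key=lambda i: cosine(items[i], query), reverse=True)[:5]
```

The recall loss comes from true neighbors that hash into a different bucket; real libraries recover most of it with multiple hash tables, trees, or graph traversal.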

innagadadavida | 8 years ago
> Why is m a string/key matching function? Why can’t we use more well-defined math function, e.g. Euclidean distance, cosine function?...

Isn't the purpose of a search index (aka inverted index) to compute the cosine similarity efficiently? Is this not possible to do for latent space dense vectors? Or am I missing something?
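One way to see the distinction (with made-up illustrative data): an inverted index speeds up dot-product/cosine scoring only when vectors are sparse, because it touches just the postings lists of the query's non-zero terms. A dense embedding has no zero entries, so every postings list would contain every item and the shortcut vanishes; that is why dense vectors usually get an ANN index instead.

```python
# sparse vectors as {term: weight}; most dimensions are implicitly zero
docs = {
    "d1": {"red": 0.8, "shoe": 0.6},
    "d2": {"blue": 0.7, "shoe": 0.7},
    "d3": {"red": 0.5, "hat": 0.9},
}

# build the inverted index: term -> list of (doc, weight) postings
inverted = {}
for doc, vec in docs.items():
    for term, w in vec.items():
        inverted.setdefault(term, []).append((doc, w))

# score a query by walking only the postings of its non-zero terms;
# after length-normalizing the vectors this dot product is cosine similarity
query = {"red": 1.0}
scores = {}
for term, qw in query.items():
    for doc, w in inverted.get(term, []):
        scores[doc] = scores.get(doc, 0.0) + qw * w

# d2 is never touched because it has zero weight on "red"
```

With a 256-dimensional dense query, all 256 "postings lists" would be scanned and each would contain every document, which is exactly the exhaustive loop the index was supposed to avoid.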