For image retrieval, have you tried using a model trained with contrastive learning (e.g. SimCLR)? This could produce better embeddings for retrieval, since the model is trained explicitly to pull similar pairs together in embedding space (SimCLR measures this with cosine similarity rather than Euclidean distance).
Thanks for the reference! Nice outline of various ANN approaches.
I haven't tried SimCLR, but I did try face embedding models trained with contrastive and triplet loss. For applications where precision is the key metric, I do agree that these loss functions are much better overall.
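For anyone unfamiliar with the two losses, here is a rough sketch of their common textbook forms in plain Python. This is not the code from the face embedding models I mentioned; the function names, the margins, and the use of unsquared Euclidean distance in the triplet loss are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(a, b, same, margin=1.0):
    """Pairwise contrastive loss (illustrative form, margin is an assumption).

    Similar pairs are penalized by their squared distance; dissimilar
    pairs are penalized only while they sit closer than `margin`.
    """
    d = euclidean(a, b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss (illustrative form, margin is an assumption).

    Zero once the negative is at least `margin` farther from the
    anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

Both losses stop caring about a negative once it is past the margin, which is part of why they tend to favor precision: they separate "same" from "different" without saying much about how different.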
If discovery or recall is what you're after, a generic image classification model trained with binary cross-entropy might be better. For example, performing reverse image search on a photo of a German Shepherd should always return images of GSheps in the first N pages, but showing other dog breeds in later pages and possibly even cats after that would be a desirable feature for many search/retrieval solutions. An embedding model trained with contrastive loss might have this behavior to a certain extent, but a model based on BCE should be better.
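To make the "graded" ranking concrete, here is a toy sketch of reverse image search as a sort by cosine similarity. The three-dimensional vectors and the item names are made up by hand purely to illustrate the desired ordering; they do not come from any real model, and `rank_by_similarity` is a hypothetical helper.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, gallery):
    """Return gallery keys sorted from most to least similar to the query."""
    return sorted(
        gallery,
        key=lambda name: cosine_similarity(query, gallery[name]),
        reverse=True,
    )

# Hand-made toy embeddings (assumption: not produced by any real model).
query = [1.0, 0.0, 0.0]  # stands in for the German Shepherd photo
gallery = {
    "cat": [0.2, 0.3, 0.9],
    "husky": [0.6, 0.6, 0.0],
    "german_shepherd": [0.9, 0.1, 0.0],
}

ranking = rank_by_similarity(query, gallery)
```

With these toy vectors the German Shepherd ranks first, the other dog breed second, and the cat last, which is the kind of ordering I would want paging through results to follow.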
mrintellectual|3 years ago