top | item 27725396

lmcinnes | 4 years ago

If you want to be able to do this efficiently, then ParametricUMAP (see [docs](https://umap-learn.readthedocs.io/en/latest/parametric_umap....) and [the paper](https://arxiv.org/abs/2009.12981)) will be very effective. It trains a neural network with a UMAP loss to learn a mapping directly from the data to the embedding space. Pushing new data through is only slightly more expensive than PCA, so it works fine as part of an inference pipeline.
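
For illustration, here is a minimal sketch of that fit-once, embed-later workflow. It assumes `umap-learn` is installed together with TensorFlow, which ParametricUMAP needs; the helper name `fit_then_embed` and its arguments are hypothetical and only there to keep the example short.

```python
def fit_then_embed(train_data, new_data, n_components=2):
    """Fit ParametricUMAP once, then embed unseen data with the trained network.

    Hypothetical helper for illustration; both inputs are expected to be
    array-likes of shape (n_samples, n_features).
    """
    # Imported lazily because ParametricUMAP requires umap-learn plus TensorFlow.
    from umap.parametric_umap import ParametricUMAP

    embedder = ParametricUMAP(n_components=n_components)

    # One-off cost: trains the network using the UMAP loss.
    train_embedding = embedder.fit_transform(train_data)

    # Inference: new points go through the trained network, no refitting.
    new_embedding = embedder.transform(new_data)

    return embedder, train_embedding, new_embedding
```

In a pipeline you would keep the returned `embedder` around and call `embedder.transform(...)` on each incoming batch.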

nestorD|4 years ago

But then you have to train a neural network and you lose UMAP's speed advantage (the training is offline, yes, but it is still much slower and more finicky).

lmcinnes|4 years ago

Training is really not that much slower (see the paper), and if you are interested in pipelines the difference is not so great, considering you are looking at a one-off training cost vs. lots of inference.
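
To make the one-off vs. repeated cost argument concrete, here is a rough sketch of the arithmetic. The function name and all the numbers are made up for illustration; they are not measurements from the paper or from any benchmark.

```python
def training_share(train_seconds, inference_seconds_per_sample, n_samples):
    """Fraction of total compute time spent on the one-off training step.

    Total cost is modeled as: train_seconds + n_samples * inference_seconds_per_sample.
    All inputs are hypothetical placeholders, not benchmark results.
    """
    if train_seconds < 0 or inference_seconds_per_sample < 0 or n_samples < 0:
        raise ValueError("costs and sample counts must be non-negative")
    total = train_seconds + n_samples * inference_seconds_per_sample
    if total == 0:
        return 0.0
    return train_seconds / total


# Made-up numbers: 100 s of training, 1 s of inference per sample.
for n in (100, 900, 9900):
    print(n, training_share(100.0, 1.0, n))
```

The share of time spent on training shrinks as the number of samples pushed through the pipeline grows, which is why a somewhat slower fit matters less the more inference you do.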