Embedding tables aren't hard on the GPU (they're only a lookup table), and the output softmax still requires you to do the full matrix multiply. The label may be sparse, but the computation is far from sparse.
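A minimal NumPy sketch of that point (toy sizes, not from any particular model): even though the target label is a single index, the softmax normalizer needs all V logits, which means a full dense product against the output projection.

```python
import numpy as np

# Hypothetical sizes: hidden dimension d, vocabulary size V.
d, V = 512, 50_000

rng = np.random.default_rng(0)
h = rng.standard_normal(d)           # hidden state for one token
W_out = rng.standard_normal((V, d))  # output projection, one row per vocab word

# Dense work: a full (V, d) @ (d,) matrix-vector product, O(V * d).
logits = W_out @ h                   # shape (V,)
probs = np.exp(logits - logits.max())
probs /= probs.sum()

label = 42                           # sparse target: just one index...
loss = -np.log(probs[label])         # ...but the normalizer touched all V rows
print(logits.shape)                  # → (50000,)
```

The lookup side of an embedding layer, by contrast, only reads the rows it needs.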
> The reverse is true, embeddings are both the performance and memory-footprint bottleneck of modern NN models.
yvdriess|6 years ago
Check Figure 6 of https://arxiv.org/pdf/1906.00091.pdf (the DLRM paper).
Embeddings are used to look up sparse features, so you have those pesky data-dependent lookups.
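To make "data-dependent lookups" concrete, here's a hedged NumPy sketch (toy sizes of my own choosing): each batch gathers arbitrary, input-determined rows from a large table, so the memory access pattern is scattered and unknown until runtime.

```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.standard_normal((100_000, 32))   # large embedding table

# Sparse categorical features arrive as arbitrary ids; each batch triggers
# scattered, data-dependent reads into the table (a gather).
batch_ids = np.array([7, 99_512, 3, 42_001])
vecs = E[batch_ids]                       # gather: shape (4, 32)
print(vecs.shape)                         # → (4, 32)
```

This access pattern is why embedding lookups tend to be memory-bound rather than compute-bound.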
zeroxfe|6 years ago
They may be a bottleneck, but the alternative is worse -- you can't fit complex models with large vocabularies into GPU memory using sparse one-hot encodings.
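A small sketch of the trade-off this comment relies on: an embedding lookup computes exactly what multiplying by a one-hot vector would, without ever materializing the V-dimensional one-hot input (toy sizes, NumPy).

```python
import numpy as np

V, d = 10_000, 64                     # toy vocabulary and embedding sizes
rng = np.random.default_rng(0)
E = rng.standard_normal((V, d))       # embedding table

idx = 123                             # a token id (sparse feature)

# One-hot route: build a V-dim vector, then a dense (V,) @ (V, d) product.
one_hot = np.zeros(V)
one_hot[idx] = 1.0
via_matmul = one_hot @ E              # O(V * d) work, O(V) extra memory

# Lookup route: just index the table -- O(d) work, no one-hot vector at all.
via_lookup = E[idx]

print(np.allclose(via_matmul, via_lookup))  # → True
```

With realistic vocabularies the one-hot route's per-token memory and compute scale with V, which is exactly what the embedding layer avoids.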
Der_Einzige|6 years ago
See libraries like Magnitude for proper embedding-lookup implementations.