jncraton | 2 years ago
> We train our model to decode text embeddings from two state-of-the-art embedding models, and also show that our model can recover important personal information (full names) from a dataset of clinical notes.
https://arxiv.org/pdf/2310.06816.pdf
Embedding text certainly loses information, but a lot of the original is still recoverable from the vector.
simonw | 2 years ago
“a multi-step method that iteratively corrects and re-embeds text is able to recover 92% of 32-token text inputs exactly”.
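To make the "iteratively corrects and re-embeds" loop concrete, here is a toy sketch in Python. It is not the paper's method. `embed` is a made-up one-hot stand-in for a real embedding model, `correct` replaces the paper's trained corrector with a brute-force search over single-token edits, and `VOCAB`, `invert`, and the assumption that the attacker knows the text length are all hypothetical choices for illustration. The attacker only ever calls `embed` and compares vectors.

```python
import math

# Hypothetical toy vocabulary; a real attack works over a tokenizer's full vocabulary.
VOCAB = ["the", "cat", "sat", "on", "a", "mat"]


def embed(tokens):
    """Toy stand-in for an embedding model: one slot per (position, word) pair."""
    vec = [0.0] * (len(tokens) * len(VOCAB))
    for pos, tok in enumerate(tokens):
        vec[pos * len(VOCAB) + VOCAB.index(tok)] = 1.0
    return vec


def correct(hypothesis, target):
    """Return the single-token edit whose re-embedding lands closest to the target.

    Assumption: this brute-force search stands in for a trained corrector model.
    """
    best, best_dist = hypothesis, math.dist(embed(hypothesis), target)
    for pos in range(len(hypothesis)):
        for word in VOCAB:
            candidate = hypothesis[:pos] + [word] + hypothesis[pos + 1:]
            d = math.dist(embed(candidate), target)
            if d < best_dist:
                best, best_dist = candidate, d
    return best


def invert(target, length, max_steps=50):
    """Guess, re-embed, compare with the target embedding, correct, repeat."""
    hypothesis = [VOCAB[0]] * length  # arbitrary starting guess
    for step in range(max_steps):
        if math.dist(embed(hypothesis), target) == 0.0:
            return hypothesis, step  # exact recovery
        hypothesis = correct(hypothesis, target)
    return hypothesis, max_steps


secret = ["the", "cat", "sat", "on", "a", "mat"]
recovered, steps = invert(embed(secret), len(secret))
print(recovered, steps)
```

On this toy every correction step fixes one wrong token, so the six-token secret comes back exactly after five corrections. A real embedding model is nowhere near this easy to invert, which is why the paper trains a model to propose the corrections instead of searching.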