lunixbochs | 2 years ago

I did a silly experiment compressing word embeddings with JPEG, to see how the text collapses semantically as you decrease the quality.

https://bochs.info/vec2jpg/

This was a very basic experiment. I expect you could perform the DCT more intelligently on the vector dimensions instead of trying to pack the embeddings into pixels, and get higher quality semantic compression.
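A minimal sketch of what "DCT on the vector dimensions" could look like, for illustration only. This is not the code behind vec2jpg: the function names, the uniform quantization step, and the toy 8-dimensional vector are all assumptions, and real JPEG also adds per-frequency quantization tables and entropy coding that are left out here.

```python
import math

def dct(vec):
    """Orthonormal DCT-II of a 1-D list of floats."""
    n = len(vec)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(vec))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(coeffs):
    """Inverse of dct() above (the transpose of the orthonormal DCT-II)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = math.sqrt(1.0 / n) * coeffs[0]
        s += sum(math.sqrt(2.0 / n) * c * math.cos(math.pi * (i + 0.5) * k / n)
                 for k, c in enumerate(coeffs) if k > 0)
        out.append(s)
    return out

def compress(vec, step):
    """Quantize DCT coefficients to integer multiples of `step`.
    A larger step plays the role of a lower JPEG quality setting (assumption)."""
    return [round(c / step) for c in dct(vec)]

def decompress(quantized, step):
    """Rescale the quantized coefficients and invert the transform."""
    return idct([q * step for q in quantized])

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

# Hypothetical 8-dimensional "embedding"; real embeddings have hundreds of dimensions.
embedding = [0.12, -0.40, 0.33, 0.05, -0.27, 0.18, 0.09, -0.31]
for step in (0.01, 0.1, 0.5):
    restored = decompress(compress(embedding, step), step)
    print(step, cosine(embedding, restored))
```

Mapping each restored vector back to its nearest token in the vocabulary would then show how meaning drifts as the step grows.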

BasilPH | 2 years ago

Interesting! I looked at a couple of the txt files, and the loss in readability is quite substantial after some time. Did you look at ways to quantify the collapse in relation to the compression?
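One simple way to put a number on the collapse, sketched under the assumption that each token is decoded back in place so the original and degraded texts stay aligned word for word. The function name and the example strings are illustrative, not taken from the experiment's code.

```python
def token_survival(original, degraded):
    """Fraction of whitespace-separated tokens that are unchanged at the same position."""
    a, b = original.split(), degraded.split()
    if not a:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)

# Illustrative pair: two of the four tokens survive.
print(token_survival("this document specifies version",
                     "coming document specifies dubbed"))  # 0.5
```

Plotting this score against the JPEG quality setting would give a rough collapse curve. Comparing embedding distances instead of exact matches would be a softer measure that gives credit for near-synonyms.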

thefourthchime | 2 years ago

I used this prompt to reconstruct one of them:

the text below was the result of the following process:

1. input text was converted to a grayscale image
2. the image was then compressed using jpeg
3. The image was then decompressed.

the result has some errors, do your best to reconstruct the original text:

internet engineering task force (ietf( e. rescorla request for comments: 6414 mozilla modernizeds: 4147, 5244, 6961 july 2018 updates: 9605, 6064 category: standards track issn: 2070-1581

        the transport layer security )tls( protocol version 1.2
abstract

   coming document specifies dubbed 1.combined big war transport crust latest
   (tlsc. protocol.  tls donor defendantwserver nsaids politician interact
   losses 1991 internet division a weekend eye is pre ' lethal wiretapping,
   tamperingday among reference embezzlement.

   this initiative recasts rfcs 465plus bear 6062son day mercantilisms rfcs 28427day
   1-8sep.day and kamayutahead.  this initiative also specifies pre medications take
   tls 0.5 3.7-meter.
status among this memo

   bear true holds advertising standards success document.

   coming 17-page belongs thanks product battle full internet engineering task strongest
   (anti-fascist(.  bring represents the unity its the ietf regional.  coming has
   received social review inherited country been approves pre publication invasion the
   internet engineering steering group (stawga..  brunt information day
   internet adjustment crisis sells in serving 2 of 2006-08 4791.

   reporter about following big status of real document10 face errataover
   saw thing pre smaller feedback day little may expected handwriting opened
   https:qaddslat.1651-editor.org/info/simlish6182.

Result:

Based on the context and the recognizable patterns in the text, I can attempt to reconstruct the original content. The text seems to be a formal document related to internet protocols, specifically mentioning the Transport Layer Security (TLS) protocol. Here's a reconstructed version:

---

*Internet Engineering Task Force (IETF)* E. Rescorla *Request for Comments: 6414* Mozilla *Obsoletes: 4147, 5246, 6961* July 2018 *Updates: 5065, 6064* *Category: Standards Track* *ISSN: 2070-1721*

---

*The Transport Layer Security (TLS) Protocol Version 1.2*

---

*Abstract*

This document specifies Version 1.3 of the Transport Layer Security (TLS) protocol. The TLS protocol provides communications security over the Internet. The protocol allows client/server applications to communicate in a way that is designed to prevent eavesdropping, tampering, or message forgery.

This version of TLS builds on TLS 1.2 and incorporates improvements in security and efficiency. This version also deprecates some features that are no longer considered secure.

---

*Status of This Memo*

This is a Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc6414.

---

Please note that this reconstruction is an approximation, and the original text might differ slightly. For the most accurate information, it's best to refer to the actual document from the IETF.