A string is an abstract bit of text. To store it, you need to encode it into a particular memory representation of the text.
Encode and Decode are slightly subjective... why not something like to_bytes and from_bytes? Maybe not the best names, but definitely clearer on the meaning.
andrewstuart|10 years ago
Veedrac|10 years ago
Bytes hold a bunch of data in some encoding. It could be an image, UTF-8 text, or LZMA-compressed ASCII. Once you know the encoding, you decode the bytes to reconstruct the data in a semantically meaningful form.
To put it another way, imagine the terms were "serialize" and "deserialize". Of course one serializes to and deserializes from binary data. Just replace "{,de}serialize" with "{en,de}code" and you're done.
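A minimal Python 3 sketch of that direction of travel, for anyone who wants to see it run. The sample strings and variable names are just illustrative; the calls are the standard `str.encode`, `bytes.decode` and the stdlib `lzma` module.

```python
import lzma

text = "naïve café"  # abstract text (a str), no byte layout implied

# Encoding: text -> bytes. The same text gives different bytes per encoding.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# Decoding: bytes -> text. This only works if you know which encoding was used.
round_tripped = utf8_bytes.decode("utf-8")

# Layered case, "LZMA compressed ASCII": two encodings, so two decode steps.
packed = lzma.compress("hello".encode("ascii"))
unpacked = lzma.decompress(packed).decode("ascii")
```

Read it like serialize/deserialize: `encode` always heads toward bytes, `decode` always heads back toward the meaningful form.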
lawpoop|10 years ago
odonnellryan|10 years ago
takeda|10 years ago
Veedrac had a good analogy: think of text as something abstract, like an image or a sound. If you want to store it in bytes you need to encode it, and to read it back you decode it.
As for to_bytes/from_bytes, Python actually provides those too:
to_bytes -> bytes(<text>, <encoding>)
from_bytes -> str(<bytes>, <encoding>)
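A quick sketch of how those constructor forms behave, assuming Python 3. There the encoding argument matters: `bytes()` refuses a bare str, and `str()` on bytes without an encoding gives back the repr rather than the decoded text. Variable names here are only for illustration.

```python
text = "héllo"

# bytes() needs an explicit encoding when it is given a str.
as_bytes = bytes(text, "utf-8")      # same result as text.encode("utf-8")

# str() with an encoding decodes the bytes back into text.
as_text = str(as_bytes, "utf-8")     # same result as as_bytes.decode("utf-8")

# Without an encoding, str() returns the repr of the bytes object instead.
repr_text = str(b"abc")

# Without an encoding, bytes() on a str raises TypeError.
try:
    bytes(text)
    no_encoding_ok = True
except TypeError:
    no_encoding_ok = False
```

So the constructors work as to_bytes/from_bytes only when the encoding is spelled out, which is the same information `encode`/`decode` ask for.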
jes5199|10 years ago
rspeer|10 years ago
I think that reveals that the names really do have a problem. The problem is that "encode" sounds like "make this Unicode" to people who aren't familiar with Unicode.