davidism | 10 years ago

> Although that way may not be obvious at first unless you're Dutch.

There is one obvious way: strings are encoded to bytes, bytes are decoded to strings.
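A minimal sketch of that round trip in Python 3. The sample string and the choice of UTF-8 are just illustrative assumptions; any codec that can represent the text would do:

```python
# str -> bytes is "encode"; bytes -> str is "decode".
text = "na\u00efve caf\u00e9"        # abstract text: 10 characters

data = text.encode("utf-8")          # concrete bytes: 12 of them in UTF-8
round_trip = data.decode("utf-8")    # back to the same abstract text

print(type(text).__name__, len(text))
print(type(data).__name__, len(data))
print(round_trip == text)
```

The character count and byte count differ because the two accented letters each take two bytes in UTF-8.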

andrewstuart|10 years ago

I don't think the difference between ENcoding and DEcoding is obvious, intuitive, or unambiguous.

Veedrac|10 years ago

A string is an abstract bit of text. You need to encode this into a particular memory representation of the text.

Bytes hold a bunch of data in some encoding. It could be an image, UTF-8 text, or LZMA-compressed ASCII. Once you know the encoding, you decode the bytes to reconstruct the data in a semantically meaningful form.
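A rough Python 3 illustration of why you have to know the encoding first. The byte values and the "hello" payload are made up for the example; `lzma` is the standard-library module:

```python
import lzma

# The same two bytes mean different things under different encodings.
raw = b"\xc3\xa9"
as_utf8 = raw.decode("utf-8")        # one character: e with an acute accent
as_latin1 = raw.decode("latin-1")    # two characters: A-tilde, copyright sign

# Encodings can be layered: text -> ASCII bytes -> LZMA-compressed bytes.
packed = lzma.compress("hello".encode("ascii"))

# Decoding undoes the layers in reverse order.
unpacked = lzma.decompress(packed).decode("ascii")

print(len(as_utf8), len(as_latin1), unpacked)
```

Neither decoding of `raw` is "wrong" as far as Python is concerned; only knowledge of how the bytes were produced tells you which one is meaningful.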

To put it another way, imagine the terms were "serialize" and "deserialize". Of course one serializes to and deserializes from binary data. Just replace "{,de}serialize" with "{en,de}code" and you're done.

lawpoop|10 years ago

You mean the prefixes 'en-' and 'de-'?

odonnellryan|10 years ago

Encode and Decode are slightly subjective... why not something like to_bytes and from_bytes? Maybe not the best names, but definitely clearer in meaning.

takeda|10 years ago

Not really.

Veedrac had a good analogy: think of text as something abstract. For example, imagine text is an image or a sound. If you want to store it in bytes you need to encode it, and to read it back you decode it.

As for to_bytes/from_bytes, Python actually provides those too (in Python 3 you have to pass the encoding):

to_bytes -> bytes(<text>, <encoding>)

from_bytes -> str(<bytes>, <encoding>)
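A quick check of the constructor forms in Python 3, as a sketch with an arbitrary sample string. Note that both need an explicit encoding to behave like to_bytes/from_bytes:

```python
text = "h\u00e9llo"

data = bytes(text, "utf-8")    # equivalent to text.encode("utf-8")
back = str(data, "utf-8")      # equivalent to data.decode("utf-8")

# Without an encoding, bytes() refuses a str outright...
try:
    bytes(text)
    needs_encoding = False
except TypeError:
    needs_encoding = True

# ...and str() on bytes returns the repr rather than decoding anything.
looks_like_repr = str(b"abc")  # the 6-character string b'abc'

print(back == text, needs_encoding, looks_like_repr)
```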

jes5199|10 years ago

I think that's backwards

rspeer|10 years ago

It's not backwards.

I think that reveals that the names really do have a problem. The problem is that "encode" sounds like "make this Unicode" to people who aren't familiar with Unicode.