I think the complaints were more about how pg was originally saying that he intended to never support Unicode. That said, people should realize that UTF-8 encoding/decoding is the zeroth step to internationalization with Unicode.
Do I understand correctly that Arc strings are sequences of octets?
If so: I really don't want to be a negativity guy but it seems like every language that has made an 8-bit string the default string type has regretted it later because it is so painful to change it without breaking code. Okay, Paul says that he won't mind breaking code. Maybe he means it, but it doesn't make any sense to me to knowingly and consciously repeat a design mistake that dozens of other people have made and regretted.
It really just takes one day to get this right. You need to distinguish between the raw bytes read from a device and the true string type (whose elements need to be 21 bits or wider, enough to hold any Unicode code point). You need a trivial converter from one to the other (which you can presumably steal from MzScheme) and back.
That's it. You get this right at the beginning and you never have to backtrack or break code.
My apologies in advance if this post is based on incorrect premises. I'm trying to help.
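To make the bytes-versus-string split above concrete, here is a minimal sketch in Python rather than Arc. The helper names `decode_input` and `encode_output` are hypothetical, and UTF-8 is assumed as the external encoding; this illustrates the idea and is not how Arc or MzScheme implement it.

```python
def decode_input(raw: bytes) -> str:
    """Convert raw octets from a device into a string of code points (hypothetical helper)."""
    return raw.decode("utf-8")


def encode_output(text: str) -> bytes:
    """Convert a string of code points back into octets for output (hypothetical helper)."""
    return text.encode("utf-8")


# Raw octets, as they might arrive from a file or socket.
raw = b"Smj\xc3\xb6ri\xc3\xb0"

# Decode once at the boundary.
text = decode_input(raw)

# The byte sequence and the string have different lengths,
# and indexing the string yields whole code points.
byte_count = len(raw)
code_point_count = len(text)
fourth_char = text[3]

# Encode once on the way out; the round trip is lossless.
round_trip = encode_output(text)
```

With the two types kept separate, code that works on `text` never needs to know which encoding the bytes used.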
Could you offer a better solution? What would your solution offer that octets do not? Random character access? No, because no Unicode encoding offers easy random access to characters: a character may consist of several code points, and in some encodings a single code point takes more than one code unit. Glyph, word and sentence segmentation? I guess not.
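The point about characters, code points and code units can be checked with a short sketch. It uses Python only for illustration, and the sample characters are chosen here, not taken from the thread.

```python
import unicodedata

# One user-perceived character written as two code points:
# 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"

# NFC normalization folds the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

decomposed_code_points = len(decomposed)
composed_code_points = len(composed)

# The same character occupies a different number of bytes in each form.
decomposed_utf8_bytes = len(decomposed.encode("utf-8"))
composed_utf8_bytes = len(composed.encode("utf-8"))

# A code point outside the Basic Multilingual Plane (U+1D11E, a musical
# symbol) needs two 16-bit code units in UTF-16 and four bytes in UTF-8.
clef = "\U0001D11E"
clef_code_points = len(clef)
clef_utf16_units = len(clef.encode("utf-16-le")) // 2
clef_utf8_bytes = len(clef.encode("utf-8"))
```

So even a string of 21-bit (or wider) code points gives constant-time access to code points only, not to what a reader would call a character.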
If you are wondering:
On Linux/X11, there's Ctrl+Shift+[Unicode number in hexadecimal], gnome-character-map, umap or KCharMap (ت)
And now for the less serious part:
ሞሡሢ Am I the only one whom these Ethiopic characters remind of Tengwar? BTW, are there Unicode chars for Tengwar? I think there should be! (But not for Klingon, because it sucks.)
I have fun writing this on my ⌨, but ℐ∫ ᚾℍℹ⑀ not pointless? Who cares? Anyway, now we can use distinct characters for Roman numerals: Ⅰ,Ⅱ,Ⅲ,Ⅳ,Ⅴ,Ⅵ,Ⅶ,Ⅷ,Ⅹ,Ⅻ,Ⅽ,Ⅿ!
Ye darn kids! Everythin' we had was 7-bit ASCII, without parity, and we were damn grateful for it!
You think you had it bad? I had to use Morse code for browsing porn, back in my days! And I had to etch my public key into the wall of a rotten ol' cave! We did not have this fancy-shmancy routed network, I had to remember the way from here to there all by myself!
---
this post was presented to you by Too Much Coffee.
If you search for "Smjörið er brætt og hveitið smátt og smátt hrært út í það, þangað til það er gengið upp í smjörið." on Google, this thread is the fourth result.
किसी वस्तु, व्यक्ति, स्थान, या भावना का नाम बताने वाले शब्द को संज्ञा कहते हैं। जैसे - गोविन्द, हिमालय, वाराणसी, त्याग आदि
संज्ञा में तीन शब्द-रूप हो सकते हैं -- प्रत्यक्ष रूप, अप्रत्यक्ष रूप और संबोधन रूप ।
Well, not quite. I gave Patrick an early version of the code, a couple weeks before Arc was released, and he immediately sent me this fix. I just didn't get around to incorporating it till now.
There's a difference between things I don't care about, and things I'm actively against. I don't care about character sets and CSS, so those things will no doubt gradually get better.
Classic static typing, however, I think is actually a bad idea in a general-purpose language. It makes languages weaker. So it's never likely to happen in Arc itself. However, one of the explicit goals of Arc is to be a good language for writing other languages on top of, and I can imagine plenty of languages for specific types of problems (e.g. circuit design) in which static typing would be a good idea.
[+] [-] far33d|18 years ago|reply
Apparently, while you were complaining, someone else was solving.
[+] [-] menloparkbum|18 years ago|reply
[+] [-] henning|18 years ago|reply
[+] [-] microdan|18 years ago|reply
[+] [-] gojomo|18 years ago|reply
[+] [-] tocomment|18 years ago|reply
[+] [-] prescod|18 years ago|reply
[+] [-] olavk|18 years ago|reply
[+] [-] dzorz|18 years ago|reply
[+] [-] nickb|18 years ago|reply
[+] [-] dcurtis|18 years ago|reply
/sarcasm
[+] [-] mdemare|18 years ago|reply
[+] [-] Zak|18 years ago|reply
[+] [-] Tuna-Fish|18 years ago|reply
Make λ an alias of fn, and have it replace automatically in whatever editor you use?
fn is fast to write, but λ is much more readable, 'cos it stands out.
[+] [-] olifante|18 years ago|reply
[+] [-] timr|18 years ago|reply
[+] [-] pchristensen|18 years ago|reply
[+] [-] pg|18 years ago|reply
[+] [-] jamiequint|18 years ago|reply
[+] [-] dmoney|18 years ago|reply
[+] [-] tel|18 years ago|reply
[+] [-] TMCMan|18 years ago|reply
[+] [-] r7000|18 years ago|reply
[+] [-] mixmax|18 years ago|reply
Damn fast...
[+] [-] rams|18 years ago|reply
[+] [-] jey|18 years ago|reply
[+] [-] rams|18 years ago|reply
[+] [-] polar|18 years ago|reply
[+] [-] nreece|18 years ago|reply
[+] [-] kmt|18 years ago|reply
[+] [-] ph0rque|18 years ago|reply
[+] [-] kajecounterhack|18 years ago|reply
[+] [-] tel|18 years ago|reply
[+] [-] dzorz|18 years ago|reply
[+] [-] piranha|18 years ago|reply
[+] [-] albertcardona|18 years ago|reply
[+] [-] olavk|18 years ago|reply
[+] [-] pg|18 years ago|reply
[+] [-] mixmax|18 years ago|reply
[+] [-] Create|18 years ago|reply
[+] [-] patrickg-zill|18 years ago|reply
[+] [-] bootload|18 years ago|reply
[+] [-] eusman|18 years ago|reply