demurgos|7 months ago
The native JS semantics are UCS-2. Saying that it's UTF-16 is misleading and confuses charset, encoding and browser APIs.
Ladybird is probably implementing support properly but it's annoying that they keep spreading the confusion in their article.
dzaima|7 months ago
It's not cleanly one or the other, really. It's UCS-2-y by `str.length` or `str[i]`, but UTF-16-y by `str.codePointAt(i)` or by iteration (`[...str]` or `for (x of str)`).
Generally though JS's strings are just a list of 16-bit values, being intrinsically neither UCS-2 nor UTF-16. But, practically speaking, UTF-16 is the description that matters for everything other than writing `str.length`/`str[i]`.
grishka|7 months ago
And most mainstream GUI toolkits are, as well. It can be said that UTF-16 is the de-facto standard in-memory representation of unicode strings, even though some runtimes (Rust) prefer UTF-8.
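For readers who want to see the `str.length`/`str[i]` versus `str.codePointAt(i)`/iteration split from dzaima's comment in practice, here is a minimal sketch. The sample string and all variable names are illustrative assumptions and do not come from the thread or the Ladybird article.

```javascript
// U+1F600 lies outside the BMP, so it occupies two 16-bit code units
// (a surrogate pair) inside a JS string.
const s = "a\u{1F600}b";

// Code-unit view (the "UCS-2-y" side): length and indexing count 16-bit units.
const unitLength = s.length;          // counts code units, not characters
const loneUnit = s[1];                // only the high half of the surrogate pair
const unitValue = s.charCodeAt(1);    // numeric value of that single code unit

// Code-point view (the "UTF-16-y" side): surrogate pairs are decoded.
const codePoint = s.codePointAt(1);   // the full code point starting at index 1
const codePoints = [...s];            // iteration yields whole code points

// A string may also hold an unpaired surrogate, which is not well-formed
// UTF-16; this is why "a list of 16-bit values" is the safest description.
const unpaired = "\uD83D";
const unpairedPoints = [...unpaired]; // iteration passes the lone unit through

console.log(unitLength, codePoints.length, codePoint.toString(16));
```

Here `unitLength` is 4 while `codePoints.length` is 3, so the same string gives different answers depending on which API is asked.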
0points|7 months ago
No. Windows uses UTF-16 internally. Most GUI toolkits do not.
> It can be said that UTF-16 is the de-facto standard in-memory representation of unicode strings, even though some runtimes (Rust) prefer UTF-8.
No, that wouldn't be true at all.
Your technical merit seems to be limited by your Windows experience, and even that is dated.
Microsoft has recommended UTF-8 over UTF-16 since 2019 [1].
1: https://learn.microsoft.com/en-us/windows/apps/design/global...