RustyRussell | 6 months ago
unicode-assignable =
%x9 / %xA / %xD / ; useful controls
%x20-7E / ; exclude C1 controls and DEL
%xA0-D7FF / ; exclude surrogates
%xE000-FDCF / ; exclude FDD0 nonchars
%xFDF0-FFFD / ; exclude FFFE and FFFF nonchars
%x10000-1FFFD / %x20000-2FFFD / ; (repeat per plane)
%x30000-3FFFD / %x40000-4FFFD /
%x50000-5FFFD / %x60000-6FFFD /
%x70000-7FFFD / %x80000-8FFFD /
%x90000-9FFFD / %xA0000-AFFFD /
%xB0000-BFFFD / %xC0000-CFFFD /
%xD0000-DFFFD / %xE0000-EFFFD /
%xF0000-FFFFD / %x100000-10FFFD
I mean, just define ranges.

Also, where are the test vectors? Because when I implement this, that's the first thing I have to write, and you could save me a lot of work here. Bonus points if they're in JSON and UTF-8 already, though the invalid UTF-8 in an RFC might really gum things up: hex encode maybe?
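For what it's worth, here is a rough sketch of how the ranges above could translate into a check. The names `is_unicode_assignable` and `all_assignable` are made up for illustration and do not come from any spec or library. The bounds are transcribed from the ABNF, with the per-plane rule folded into one mask test.

```python
# Inclusive code point ranges below U+10000, transcribed from the ABNF above.
_BMP_RANGES = [
    (0x9, 0xA),        # tab, line feed
    (0xD, 0xD),        # carriage return
    (0x20, 0x7E),      # printable ASCII; DEL and C1 controls fall outside
    (0xA0, 0xD7FF),    # stops before the surrogates
    (0xE000, 0xFDCF),  # stops before the FDD0 noncharacters
    (0xFDF0, 0xFFFD),  # stops before FFFE and FFFF
]


def is_unicode_assignable(cp: int) -> bool:
    """Return True if the code point falls in one of the listed ranges."""
    if cp < 0x10000:
        return any(lo <= cp <= hi for lo, hi in _BMP_RANGES)
    # Planes 1 through 16: everything except the last two code points
    # (xFFFE and xFFFF) of each plane.
    return cp <= 0x10FFFF and (cp & 0xFFFF) <= 0xFFFD


def all_assignable(text: str) -> bool:
    """Check every code point of an already-decoded string."""
    return all(is_unicode_assignable(ord(ch)) for ch in text)
```

Something like `all_assignable("héllo\n")` passes, while a string containing NUL or U+FFFE does not.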
timbray|6 months ago
RustyRussell|6 months ago
How does this help me check my implementation? I guess I could ask ChatGPT to convert your tests to my code, but that seems the long way around.
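To make the ask concrete, this is roughly the shape of language-neutral vector file I have in mind: JSON, with the input bytes hex encoded so that invalid UTF-8 survives transport. The vectors, the field names (`hex`, `valid`, `note`), and the helper functions are all invented here for illustration; none of them come from the RFC or from anyone's actual test suite.

```python
import json

# Hypothetical vectors: hex-encoded bytes plus the expected verdict.
VECTORS_JSON = """
[
  {"hex": "68656c6c6f", "valid": true,  "note": "plain ASCII"},
  {"hex": "00",         "valid": false, "note": "NUL (C0 control)"},
  {"hex": "c285",       "valid": false, "note": "U+0085 (C1 control)"},
  {"hex": "efbfbe",     "valid": false, "note": "U+FFFE noncharacter"},
  {"hex": "eda080",     "valid": false, "note": "encoded surrogate, invalid UTF-8"},
  {"hex": "f09f9880",   "valid": true,  "note": "U+1F600"}
]
"""


def assignable(cp: int) -> bool:
    """Stand-in for the implementation under test (same ranges as the ABNF)."""
    if cp in (0x9, 0xA, 0xD):
        return True
    if cp < 0x20 or 0x7F <= cp <= 0x9F:   # C0 controls, DEL, C1 controls
        return False
    if 0xD800 <= cp <= 0xDFFF:            # surrogates
        return False
    if 0xFDD0 <= cp <= 0xFDEF:            # FDD0 noncharacter block
        return False
    # Last two code points of every plane are noncharacters.
    return cp <= 0x10FFFF and (cp & 0xFFFF) < 0xFFFE


def check(raw: bytes) -> bool:
    """Decode strictly as UTF-8, then test every code point."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return all(assignable(ord(ch)) for ch in text)


def run(vectors_json: str) -> list:
    """Return the notes of any vectors whose verdict does not match."""
    return [
        v["note"]
        for v in json.loads(vectors_json)
        if check(bytes.fromhex(v["hex"])) != v["valid"]
    ]


print(run(VECTORS_JSON) or "all vectors pass")
```

With a file like that, any implementation in any language only needs a hex decoder and a JSON parser to run the same checks.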