item 45491618

Alex3917 | 4 months ago

> But then at the end it added a "Fun fact" that unicode actually does have a seahorse emoji, and proceeded to melt down in the usual way.

To be fair, most developers I’ve worked with will have a meltdown if I try to start a conversation about Unicode.

E.g. if during a job interview the interviewer asks you to check if a string is a palindrome, try explaining why that isn’t technically possible in Python (at least during an interview) without using a third-party library.

usrnm|4 months ago

Just slap an "assert foo.isascii()" at the beginning and proceed? It's just an interview.
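A minimal sketch of that suggestion, assuming Python 3.7+ (where `str.isascii()` exists). The function name `is_ascii_palindrome` and the parameter `text` are hypothetical, chosen only for illustration. Restricting the input to ASCII sidesteps combining marks and multi-code-point emoji, so reversing code points is good enough for the interview version of the question.

```python
def is_ascii_palindrome(text: str) -> bool:
    """Return True if `text` reads the same forwards and backwards.

    Hypothetical helper: it refuses non-ASCII input instead of trying
    to handle grapheme clusters.
    """
    # Guard clause from the comment above. Note that assert statements
    # are skipped when Python runs with -O.
    assert text.isascii(), "only ASCII input is supported"
    # Slicing with a step of -1 reverses the string code point by code point.
    return text == text[::-1]


print(is_ascii_palindrome("racecar"))
print(is_ascii_palindrome("python"))
```

Calling it with a non-ASCII string such as "été" raises `AssertionError` rather than returning a possibly misleading answer.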

derefr|4 months ago

> try explaining why that isn’t technically possible in Python (at least during an interview) without using a third-party library.

I'm actually vaguely surprised that Python doesn't have extended-grapheme-cluster segmentation as part of its included batteries.

Every other language I tend to work with these days either bakes UAX29 support directly into its stdlib (Ruby, Elixir, Java, JS, ObjC/Swift) or provides it in its "extended first-party" stdlib (e.g. Golang with golang.org/x/text).

Cthulhu_|4 months ago

> try explaining why that isn’t technically possible in Python (at least during an interview) without using a third-party library.

You're more likely to impress the interviewer by asking questions like "should I assume the input is only ASCII characters or the complete possible UTF-8 character set?"

A job interview is there to prove you can do the job, not prove your knowledge and intellect. It's valuable to know the intricacies of Python and strings for sure, but it's mostly irrelevant for a job interview or the job itself (unless the job involves heavy UTF-8 shenanigans, but those are very rare).

kasey_junk|4 months ago

Don’t leave me in suspense! Why isn’t it possible?

zimpenfish|4 months ago

At a guess, there's nothing in Python stdlib which understands graphemes vs code points - you can palindrome the code points but that's not necessarily a palindrome of what you "see" in the string.

(Same goes for Go, it turns out, as I discovered this morning.)
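To illustrate the code point versus grapheme distinction using only the standard library, here is a rough sketch. The helper names (`naive_is_palindrome`, `rough_clusters`, `rough_is_palindrome`) are hypothetical. The grouping step is an approximation that only attaches combining marks to the preceding character via `unicodedata.combining`; it is not UAX #29 extended-grapheme-cluster segmentation and does not handle ZWJ emoji sequences, flag pairs, or Hangul jamo, which is the gap the thread is talking about.

```python
import unicodedata


def naive_is_palindrome(s: str) -> bool:
    # Compares code points, not what a reader perceives as characters.
    return s == s[::-1]


def rough_clusters(s: str) -> list[str]:
    # Approximation only (assumption for this sketch): glue each combining
    # mark onto the character before it. Real grapheme segmentation follows
    # UAX #29 and covers many more cases.
    clusters: list[str] = []
    for ch in s:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def rough_is_palindrome(s: str) -> bool:
    clusters = rough_clusters(s)
    return clusters == clusters[::-1]


# "e" + COMBINING ACUTE ACCENT, "a", "e" + COMBINING ACUTE ACCENT:
# it renders as three visible characters but contains five code points.
word = "e\u0301ae\u0301"

print(len(word), len(rough_clusters(word)))
print(naive_is_palindrome(word), rough_is_palindrome(word))
```

The naive check rejects `word` because reversing the code points moves each accent in front of its base letter, while the cluster-based check treats each accented letter as one unit.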

watwut|4 months ago

Are you trying to start a conversation about Unicode, or intentionally pretending you don't understand what the interviewer asked for with the "string is a palindrome" question?

Because if you are being intentionally obtuse, it is not a meltdown to conclude you are being intentionally obtuse.

nomel|4 months ago

These sorts of questions are what I call “Easter eggs”. If someone understands the actual complexity of the question being asked, they’ll be able to give a good answer. If not, they’ll be able to give the naive answer. Either way, it’s an Easter egg, and not useful on its own since the rest of the interview will be representative. The thing they are useful for is amplifying the justification. You can say “they demonstrated a deeper understanding of Unicode by pointing out that a naive approach could be incorrect”.

reaperducer|4 months ago

> To be fair, most developers I’ve worked with will have a meltdown if I try to start a conversation about Unicode.

Why are we being "fair" to a machine? It's not a person.

We don't say, "Well, to be fair, most people I know couldn't hammer that nail with their hands, either."

An LLM is a machine, and a tool. Let's not make excuses for it.

BobaFloutist|4 months ago

> Why are we being "fair" to a machine?

We aren't, that turn of phrase is only being used to set up a joke about developers and about Unicode.

It's actually a pretty popular form these days:

a does something patently unreasonable, so you say "To be fair to a, b is also patently unreasonable thing under specific detail of the circumstances that is clearly not the only/primary reason a was unreasonable."

saltyoldman|4 months ago

I think people are making explanations for it because it's effectively a digital black box, so all we can do is try to explain what it's doing. Saying "to be fair" is more of a colloquial expression in this sense. And the reason he's comparing it to developers and Unicode is a funny aside about the state of things with Unicode. Besides that, LLMs only emit what they emit because they're trained on all those said people.