top | item 46992192

ssiddharth | 17 days ago

My biggest sorrow right now is the fact that my beloved emdash is a major signal for AI generated content. I've been using it for decades now but these days, I almost always pause for a second.

manuelmoreale|17 days ago

> I've been using it for decades now but these days, I almost always pause for a second.

Wrote about this before [0] but my 2c: you shouldn't pause and you should keep using them because fuck these companies and their AI tools. We should not give them the power to dictate how we write.

[0]: https://manuelmoreale.com/thoughts/on-em-dashes

akoboldfrying|17 days ago

That's not really how it works.

Gemini tells me that for thousands of years, the swastika was used as "a symbol of positivity, luck and cosmic order". Try drawing it on something now and showing it to people. Is this an effective way to fight Nazism?

I think it's brave to keep using em dashes, but I don't think it's smart, because we human writers who like using them (myself very much included) will never have the mindshare to displace the culturally dominant meaning. At least, not until the dominant forces in AI decide of their own accord that they don't want their LLMs emitting so many of them.

Lalabadie|17 days ago

For what it's worth, whatever LLMs do extensively, they do because it's a convention in well-established writing styles.

LLMs have a bias towards expertise and confidence due to the proportion of books in their training set. They also lean towards an academic writing style for the same reason.

All this to say, if LLMs write like you were already writing, it means you have very good foundations. It's fine to avoid them out of fear, but you have this Internet stranger's permission to use your em dash pause to think "Oh yeah, I'm the reference for writing style."

the_af|17 days ago

> For what it's worth, whatever LLMs do extensively, they do because it's a convention in well-established writing styles.

I think that's only part of the story. I think that while it's true what LLMs do is somehow represented in their corpus of training data, they also lack any understanding of how to adapt to the context, how to find a suitable "voice", and how not to overdo it, unless you explicitly prompt them otherwise, which is too much of a burden. Their default voice sucks, basically.

So let's say they learned to speak in Redditese. They don't know when not to speak in that voice. They always seem to be trying to make persuasive arguments, follow patterns of "It's not X. It's Y. And you know it (mic drop)." But real humans don't speak like this all the damn time. If you speak like this to your mom or to your closest friends, you're basically an idiot.

It's not that you cannot speak like this. It's that you cannot do it all the time. And that's the real problem with LLMs.

(Sorry, couldn't resist!)

petters|17 days ago

I think that bias is not due to the proportion of books and more due to how they are fine-tuned after the pretraining.

djhn|17 days ago

Aren’t books massively outweighed by the crawled internet corpus?

archagon|17 days ago

To quote Office Space, “Why should I change? He’s the one who sucks.”

parsimo2010|17 days ago

Mostly because when I see an em dash now, I assume that it was written by AI, not that the author is one of the people who puts enough effort into their product that they intentionally use specific sized dashes.

AI might suck, but if the author doesn't change, they get categorized as a lazy AI user, unless the rest of their writing is so spectacular that it's obvious an AI didn't write it.

My personal situation is fine though. AI writing usually has better sentence structure, so it's pretty easy (to me at least) to distinguish my own writing from AI because I have run-on sentences and too many commas. Nobody will ever confuse me with a lazy AI user, I'm just plain bad at writing.

collingreen|17 days ago

To continue the story, the guy saying this got fired and probably wouldn't have without taking this stand.

catoc|17 days ago

Exactly this! I love(d) using em dashes. Now they’ve become ehm dashes, experiencing exactly that pause — that moment of hesitation — that you describe

deron12|17 days ago

AI never uses em dashes in a pair like this, whereas most people who like em dashes do. Anyone who calls paired em dash writing AI is only revealing themselves to be a duffer.

vonunov|16 days ago

Embrace the double hyphen -- it's still attested in Garner's ;)

gnat|17 days ago

We're in the brief window of time when AI's writing style is the weirdness. It's an artifact of the production process, like JPG blur, MP3 distortion, autotune's rigidity. And it didn't take long for those things to become normalized, in fact for them to become artifacts that people proudly adopted and embraced. DJs release tracks built from MP3 samples instead of waves. Autotune is famously a 'sound' that was once something to be subtly added and never confessed to, but which genres and artists now lean into rather than away from.

Long story short: I think emoji in headings and lists, em dashes, and the vile TED Talk paragraph structure of "long sentence with lots of words asking a question or introducing a possibility. followed by. short sentences. rebutting. or affirming." are here to stay. My money is that it gets normalized and embraced as "well of course that's how you best communicate because I see it everywhere."

calvinmorrison|17 days ago

Short sentences were popularized in writing only in the last hundred and fifty years. Styles change.

the_af|17 days ago

Yes, but it's kinda sad, isn't it, that this robotic way of writing in turn teaches a new generation of people how to write?

Also, you forgot the extremely enervating: "It's not X. It's Y. <Clincher>."

AlecSchueler|16 days ago

> "well of course that's how you best communicate because I see it everywhere."

These assumptions might also change though. Up until now any writing you saw "everywhere" was probably written by someone who studied and loved written communication and was bringing their artisanal care to the table. That's no longer the case.

It's called slop for a reason. When I come across a GitHub README written by AI I don't feel put off just because the author used AI to write it, I feel frustrated because it's genuinely poorly communicating with me. Full of extraneous details, artifacts from the conversation, and stuff I already know ("uses GitHub to share the source democratically!").

tkzed49|17 days ago

I've gone back to using two dashes--LLMs typically don't write them that way.

awesome_dude|17 days ago

I'm going to propose that we name this the --gnu-long-form :)

eYrKEC2|17 days ago

I used to enjoy the literate usage of the word "literally".

You'll get over it.

peterashford|17 days ago

Using literally to mean figuratively goes back hundreds of years

4b11b4|17 days ago

Also, unfortunately I have in my global instructions to never use em dashes...

4b11b4|17 days ago

Maybe I'll get over it eventually.

nxobject|17 days ago

What I do – and I know this isn't conventional style – is use en dashes. (Or, you could use spaces between em dashes, as incorrect as it is.)

vonunov|16 days ago

Chicago says to format dashes like this—and ellipses . . . like this. . . .

AP says to format dashes like this — and ellipses ... like this. ...

Who's "correct"?

OGWhales|17 days ago

I've noticed that LLM-generated text often has spaces around em dashes, which I found odd. They don't always do that, but they do it often enough that it stood out to me since that isn't what you'd normally see.

kimixa|16 days ago

> Or, you could use spaces between em dashes, as incorrect as it is.

That's the normal way of using them in British English. Though they also tend to be the (slightly shorter) en-dashes too.

I feel that style is pretty common on the "old" internet - possibly related to how easily they could be replaced by a hyphen back when ASCII was a likely limitation.

treetalker|17 days ago

> Or, you could use spaces between em dashes, as incorrect as it is.

It's a matter of style preference. I support spaces around em-dashes — particularly for online writing, since em-dashes without spaces make selecting and copying text with precision an unnecessary frustration.

By the way,what other punctuation mark receives no space on at least one side?Wouldn't it look odd,make sentences harder to read,and make ideas more difficult to grok?I certainly think so.Don't you? /s

wiseowise|17 days ago

I use it to trigger false positives in haters – why not?

account42|16 days ago

If someone who doesn't want AI slop filters out a person who gets mad enough about that to call them haters, I don't think that's really a false positive.

user____name|16 days ago

My history teacher taught me to use "8==3" instead; the Romans used it to sign their graffiti.

kyralis|17 days ago

This is the modern day "I can tell that's photoshopped because I've seen some 'shops in my day." The sooner we stop glorifying the people who think they're magical LLM detectors, the better, frankly.

account42|16 days ago

It doesn't have to be a perfect filter to be a good heuristic. And unless you have a better suggestion for how people can avoid slop, it'll keep being used.

itisuseless|17 days ago

The correct thing to do is to use an en-dash with spaces. ;)

Lio|17 days ago

You can still use them — it’s just that they have a new purpose; getting things ignored by AI detection or AI;DR.

Now you can ask for outlandish things at work knowing your boss won’t read it and his summariser will ignore it as slop — win.

Bukhmanizer|17 days ago

You’re absolutely right. I hate AI writing — it’s not that I hate AI, it’s that it makes everything it says sound a specific combination of smug and authoritative — No matter the content. Once you realize it’s not saying anything, that’s the real aha moment.

/s