For student assignment cheating, really only the em dashes would still be in the output. But there are also specific words and turns of phrase, specific constructions (e.g., 'it's not just x, but y'), and commonly used word choices. Really it's just a prim and proper corporate press-release voice -- not the usual writing voice of a university student. I'm actually quite sure that you could easily pick out a first-pass AI-generated student assignment, even with the em dashes removed, from a set of legitimate assignments, especially if you are a native English speaker. You may not be able to explain it systematically, but your native-speaker intuition can do it surprisingly well.
What AI detectors have largely done is try to formalize that intuition. They work pretty well against simple adversaries (basically, the laziest students), but a more sophisticated user will do a first, second, and third pass to change the voice.
No. No one is looking for em-dashes, except for some bozos on the internet. The "default voice" of all mainstream LLMs can be easily detected by looking at the statistical distribution of word/token sequences. AI detector tools work and have very low false-negative rates. They have a small percentage of false positives because a small percentage of humans pick up the same writing habits, but that's not relevant here.
The "humanizer" filters will typically just use an LLM prompted to rewrite the text in another voice (which can be as simple as "you're a person in <profession X> from <region Y> who prefers to write tersely"), or specifically flag the problematic word sequences and ask an LLM to rephrase.
They most certainly don't improve the "correctness" and don't verify references, though.
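To make the two ideas above concrete, here is a toy sketch in Python, not how any real detector or "humanizer" product is implemented. It scores text by a smoothed bigram log-likelihood ratio between two tiny reference corpora, flags the word sequences that only show up on the AI-style side, and builds a persona rewrite prompt of the kind quoted above. The corpora, function names, and prompt wording are all made up for illustration, and no LLM is actually called.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny reference corpora; a real tool would need far more data.
AI_SAMPLES = [
    "it's not just a tool, but a testament to innovation",
    "this is not just a trend, but a rich tapestry of ideas",
]
HUMAN_SAMPLES = [
    "honestly the tool is fine but the docs are bad",
    "i think this trend is a bit overhyped",
]

def bigrams(text):
    """Lowercased word bigrams, a crude stand-in for token sequences."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))

def train(samples):
    """Count bigrams over a list of reference texts."""
    counts = Counter()
    for sample in samples:
        counts.update(bigrams(sample))
    return counts

def score(text, ai_counts, human_counts):
    """Mean log-likelihood ratio per bigram; above 0 leans AI-style."""
    grams = bigrams(text)
    if not grams:
        return 0.0
    # Add-one smoothing so unseen bigrams do not produce log(0).
    vocab = len(set(ai_counts) | set(human_counts)) + 1
    ai_total = sum(ai_counts.values())
    human_total = sum(human_counts.values())
    total = 0.0
    for gram in grams:
        p_ai = (ai_counts[gram] + 1) / (ai_total + vocab)
        p_human = (human_counts[gram] + 1) / (human_total + vocab)
        total += math.log(p_ai / p_human)
    return total / len(grams)

def flagged_sequences(text, ai_counts, human_counts):
    """Bigrams seen in the AI-style corpus but never in the human one."""
    return sorted(
        {" ".join(g) for g in bigrams(text)
         if ai_counts[g] > 0 and human_counts[g] == 0}
    )

def rewrite_prompt(text, flags, profession, region):
    """Build (but do not send) a persona-based rewrite prompt."""
    return (
        f"You're a person in {profession} from {region} who prefers to "
        f"write tersely. Rewrite the text below in your own voice and "
        f"avoid these phrases: {', '.join(flags)}.\n\n{text}"
    )

ai_counts = train(AI_SAMPLES)
human_counts = train(HUMAN_SAMPLES)

suspect = "it's not just a fad, but a testament to progress"
flags = flagged_sequences(suspect, ai_counts, human_counts)
prompt = rewrite_prompt(suspect, flags, "accounting", "the Midwest")
```

With corpora this small the scores only mean something relative to each other. Note also that nothing in the rewrite step checks facts or references, which is the point made above.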
dbg31415|29 days ago
Ha. Every time an AI passionately agrees with me, after I’ve given it criticism, I’m always 10x more skeptical of the quality of the work.
glitchcrab|29 days ago
emmp|29 days ago
the_fall|29 days ago
smrtinsert|29 days ago