unignorant|6 months ago
1.) probably human, low on style but a solid twist (CORRECT)
2.) interesting imagery but some continuity issues, maybe AI (INCORRECT)
3.) more a scene than a story, highly confident it's AI given the style (CORRECT)
4.) style could go either way, maybe human given some successful characterization (INCORRECT)
5.) I like the style but it's probably AI; the metaphors are too dense and there are very minor continuity errors (CORRECT)
6.) some genuinely funny stuff and good world-building, almost certainly human (CORRECT)
7.) probably AI prompted to go for humor, some minor continuity issues (CORRECT)
8.) nicely subverted expectations, probably human (CORRECT)
My personal ranking for scores (again blind to author) was:
6 (human); 8 (human); 4 (AI); 1 (human) and 5 (AI) -- tied; 2 (human); 3 and 7 (AI) -- tied
So for me the two best stories were human and the two worst were AI. That said, I read a lot of flash fiction, and none of these stories really approached good flash imo. I've also done some of my own experiments, and AI can do much better than what is posted above for flash if given more sophisticated prompting.
lelanthran|6 months ago
Looking at my notes, I got one wrong (story 5; I didn't know what the "name" was supposed to be, assumed it referred to something widely known in culture that brings about the end times, something I simply hadn't heard of, and so marked it as Human on the strength of a supposed reference to shared cultural knowledge). All the AI-written stories I rated at either 1 or 2 points, while the lowest-rated Human-written story got 3 and the highest got 5 (Story 1).
It makes me wonder if we over-estimate an author's skill as a reader based on their demonstrated skill as a writer.
IOW, according to my notes/performance, the AI stories were easy to spot and correlated with low scores anyway, while the author(s) who actually produced the high-rated stuff for me rated my low-rated stuff highly.
breuleux|6 months ago
> AI can do much better than what is posted above for flash if given more sophisticated prompting.
How sophisticated, compared to just writing the thing yourself?
unignorant|6 months ago
I enjoy writing so a system like this would never replace that for me. But for someone who doesn't enjoy writing (or maybe can't generate work that meets their bar in the Ira Glass sense of taste) I think this kind of setup works okay for generating flash even with today's models.
biffles|6 months ago
I have found it hard to replicate high quality human-written prose and was a bit surprised by the results of this test. To me, AI fiction (and most AI writing in general) has a certain “smell” that becomes obvious after enough exposure to it. And yet I scored worse than you did on the test, so what do I know…
unignorant|6 months ago
From there you have a second prompt to generate a story that follows those details. You can also generate many candidates and have another model instance rate the stories based on both general literary criteria and how well they fit the prompt, then read only the best.
This has produced some work I've been reasonably impressed by, though it's not at the level of the best human flash writers.
Also, one easy way to get output that completely avoids the "smell" you're talking about is to give specific guidance on style and perspective (e.g., GPT-5 Thinking can do "literary stream-of-consciousness 1st-person teenage perspective" reasonably well and will not sound at all like typical model writing).
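For what it's worth, the "generate many candidates, rate them, read only the best" setup described above can be sketched in a few lines. This is a minimal sketch, not the commenter's actual pipeline: `call_model` is a hypothetical stand-in for whatever LLM client you use, and the prompts and 1-10 rating scale are illustrative assumptions.

```python
import re

def call_model(prompt: str) -> str:
    """Hypothetical placeholder for a real LLM call; swap in your provider's client."""
    raise NotImplementedError("wire this up to an actual model")

def generate_candidates(premise: str, n: int = 8, generate=call_model) -> list[str]:
    """First prompt expands a premise into concrete details; second prompt drafts
    n candidate stories from those details."""
    details = generate(f"Expand this premise into concrete story details:\n{premise}")
    return [generate(f"Write a flash fiction story using these details:\n{details}")
            for _ in range(n)]

def rate_story(story: str, premise: str, rate=call_model) -> float:
    """Ask a separate model instance for a 1-10 score on literary quality and
    fit to the premise; parse the first number in its reply."""
    reply = rate(
        "Rate this flash fiction from 1 to 10 for literary quality and how "
        "well it fits the premise below. Reply with a number first.\n"
        f"Premise: {premise}\n\nStory:\n{story}"
    )
    match = re.search(r"\d+(\.\d+)?", reply)
    return float(match.group()) if match else 0.0

def best_story(candidates: list[str], score_fn) -> str:
    """Keep only the top-rated candidate; the human reads just this one."""
    return max(candidates, key=score_fn)
```

Using a separate rater instance (rather than asking the generator to grade itself) is the design choice doing the work here: the reader's time goes only to the candidate that survives the filter.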