top | item 43888949

bost-ty | 10 months ago

I like the author's take: it isn't a value judgement on the individual using ChatGPT (or Gemini or whichever LLM you like this week), it's that the thought that went into making the prompt is, inevitably, more interesting/original/human than the output the LLM generates afterwards.

In my experiments with LLMs for writing code, I find that the code is objectively garbage if my prompt is garbage. If I don't know what I want, if I don't have any ideas, and I don't have a structure or plan, that's the sort of code I get out.

I'd love to hear any counterpoints from folks who have used LLMs lately to get academic or creative writing done, as I haven't tried using any models lately for anything beyond helping me punch through boilerplate/scaffolding on personal programming projects.


vunderba|10 months ago

This is the CRUX of the issue. Even with SOTA models (Sonnet 3.5, etc) - the more open-ended your prompt - the more banal and generic the response. It's GIGO turtles all the way down.

I pointed this out a few weeks ago with respect to why the current state of LLMs will never make great campaign creators in Dungeons and Dragons.

We as humans don't need to be "constrained" - ask any competent writer to sit quietly and come up with a novel story plot and they can just do it.

https://news.ycombinator.com/item?id=43677863

That being said - they can still make AMAZING soundboards.

And if you still need some proof, crank the temperature up to 1.0 and pose the following prompt to ANY LLM:

  Come up with a self-contained single room of a dungeon that involves an 
  unusual puzzle for use with a DND campaign. Be specific in terms of the 
  puzzle, the solution, layout of the dungeon room, etc. It should be totally 
  different from anything that already exists. Be imaginative. 
I guarantee 99% of the returns will return a very formulaic physics-based puzzle response like "The Resonant Hourglass", or "The Mirror of Acoustic Symmetry", etc.

WatchDog|10 months ago

When using Claude Sonnet 3.7 for coding, I often find that constraints I add to the prompt end up producing unintended side effects.

Some examples:

- "Don't include pointless comments." - The model doesn't keep track of what it's doing as well, I generally just do another pass after it writes the code to simplify things.

- "Keep things simple" - The model cuts corners (often unnecessarily) on things like type safety.

- "Allow exceptions to bubble up" - Claude deletes existing error handling logic. I found that Claude seems to prefer just swallowing errors and adding some logging, instead of fixing the underlying cause of the error, but adding this to the prompt just caused it to remove the error handling that I had added myself.

johnfn|10 months ago

> I guarantee 99% of the returns will return a very formulaic physics-based puzzle response like "The Resonant Hourglass"

Haha, I was suspicious, so I tried this, and I indeed got an hourglass themed puzzle! Though it wasn't physics-based - characters were supposed to share memories to evoke emotions, and different emotions would ring different bells, and then you were supposed to evoke a certain type of story. Honestly, I don't know what the hourglass had to do with it.

sillysaurusx|10 months ago

Temperature 1.0 results are awful regardless of domain. 0.7 to 0.8 is the sweet spot. No one seems to believe this till they see for themselves.
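For intuition on why this matters: temperature is just a scale factor applied to the model's next-token logits before sampling, so values below 1.0 sharpen the distribution toward the likeliest tokens while 1.0 leaves it flat(ter). A minimal sketch of the mechanism, with made-up logit values for illustration:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then softmax.

    Lower temperature -> sharper distribution (top token more likely);
    higher temperature -> flatter distribution (more randomness).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token scores for three candidate tokens.
logits = [4.0, 2.0, 0.5]

for t in (0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.7 the probability mass concentrates on the top candidate; at 1.0 the tail tokens get sampled noticeably more often, which is one mechanical reason high-temperature output can read as erratic rather than creative.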

Nezteb|10 months ago

Out of curiosity, I used your prompt but added "Do not make it a very formulaic physics-based puzzle."

The output is pretty nonsensical: https://pastebin.com/raw/hetAvjSG

chipsrafferty|10 months ago

# The Synesthetic Challenge Chamber

## Room Layout

The room is a simple 30-foot square with a single exit door that's currently sealed. In the center sits a large stone cube (roughly 5 feet on each side) covered in various textured surfaces - some rough like sandpaper, others smooth as glass, some with ridged patterns, and others with soft fabric-like textures.

Around the room, six distinct scent emitters are positioned, each releasing a different aroma (pine, cinnamon, ocean breeze, smoke, floral, and citrus). The room is otherwise empty except for a small stone pedestal near the entrance with a simple lever.

## The Puzzle Concept

This puzzle operates on "synesthetic translation" - converting sensory experiences across different senses. The core concept is entirely verbal and tactile, making it fully accessible without visual components.

## How It Works

When players pull the lever, one of the scent emitters activates strongly, filling the room with that particular aroma. Players must then approach the central cube and touch the texture that corresponds to that smell according to a hidden synesthetic logic.

The connection between smells and textures follows this pattern:

- Pine scent → ridged texture (like tree bark)
- Cinnamon → rough, granular texture (like spice)
- Ocean → smooth, undulating surface (like waves)
- Smoke → soft, cloudy texture (like mist)
- Floral → velvet-like texture (like petals)
- Citrus → bumpy, pitted texture (like orange peel)

After correctly matching three smell-texture pairs in sequence, the door unlocks. However, an incorrect match causes the lever to reset and a new random smell to emerge.

## Communication & Accessibility

The DM describes the smells verbally when they're activated and can describe the various textures when players explore the cube by touch. The entire puzzle can be solved through verbal description, touch, and smell without requiring sight.

For extra accessibility, the DM can add:

- Distinct sounds that play when each scent is released
- Textured surfaces that have subtle temperature differences
- Verbal clues discovered through successful matches

## What Makes This Unique

This puzzle uniquely relies on cross-sensory associations that aren't commonly used in dungeons. It:

- Doesn't rely on visuals at all
- Uses smell as a primary puzzle component (rare in D&D)
- Creates unusual connections between different senses
- Has no mathematical, musical, or traditional riddle elements
- Can be experienced fully regardless of vision status
- Creates interesting roleplaying opportunities as players discuss how different scents "feel" texturally

For the DM, it's easy to describe and implement while still being conceptually unique. Players solve it through discussion, exploration, and experimentation rather than recalling common puzzle patterns.

Herring|10 months ago

In my experience Gemini can be really good at creative writing, but yes you have to prompt and edit it very carefully (feeding ideas, deleting ideas, setting tone, conciseness, multiple drafts, etc).

https://old.reddit.com/r/singularity/comments/1andqk8/gemini...

CuriouslyC|10 months ago

I use Gemini pretty much exclusively for creative writing largely because the long context lets you fit an entire manuscript plus ancillary materials, so it can serve as a solid beta reader, and when you ask it to outline a chapter it is very good at taking the events preceding and following into account. It's hard to overstate the value of having a decent beta reader that can iteratively review your entire work in seconds.

As a side note, I find the way that you interact with an LLM when doing creative writing is generally more important than the model. I have been having great results with LLMs for creative writing since ChatGPT 3.5, in part because I approach the model with a nucleus of a chapter and a concise summary of relevant details, then have it ask me a long list of questions to flesh out details, then when the questions stop being relevant I have it create a narrative outline or rough draft which I can finish.

expensive_news|10 months ago

I have mixed feelings. Generally I don’t think that LLM output should be used to create anything that a human is supposed to read, but I do carve out a big exception for people using LLMs for translation/writing in a second language.

At the same time, however, the people who need to use an LLM for this are going to be the worst at identifying the output's weaknesses, e.g. just as I couldn't write Spanish text, I also couldn't evaluate the quality of a Spanish translation that an LLM produced. Taken to an extreme, then, students today could rely on LLMs, trust them without knowing any better, and grow to trust them for everything, never even able to evaluate their quality or performance.

The one area that I do disagree with the author on, though, is coding. As much as I like algorithms, code is written to be read by computers and I see nothing wrong with computers writing it. LLMs have saved me tons of time writing simple functions so I can speed through a lot of the boring legwork in projects and focus on the interesting stuff.

I think Miyazaki said it best: “I feel… humans have lost confidence”. I believe that LLMs can be a great tool for automating a lot of boring and repetitive work that people do every day, but thinking that they can replace the unique perspectives of people is sad.

scsh|10 months ago

I actually feel very strongly that code is very much written for us humans. Sure, it's a set of instructions that is intended to be machine read and executed but so much of _how_ code is written is very much focused on the human element that's been a part of software development. OOP, design patterns, etc. don't exist because there is some great benefit to the machines running the code. We humans benefit as the ones maintaining and extending the functionality of the application.

I'm not making a judgement about the use of LLMs for writing code, just that I do think that code serves the purpose of expressing meaning to machines as well as humans.

johnnyanmac|10 months ago

>As much as I like algorithms code is written to be read by computers and I see nothing wrong with computers writing it.

unless you're the sole contributor, code is a collaborative effort and will be reviewed by peers to make sure you don't hit any landmines at best, or ruin the codebase at worst. unless you're writing codegen itself I very much would consider writing code as if a human is going to read it.

>“I feel… humans have lost confidence“

Confidence in their fellow man? yes. As the author said a lot of this reliance on AI without proper QA comes down to "nobody cares". Or at least that mentality. And apathy is just as contagious in an environment as passion. If we lose that passion and are simply doing a task to get by and clock out, we're doomed as a species.

sigotirandolas|10 months ago

For creative and professional writing, I found them useful for grammar and syntax review, or finding words from a fuzzy description.

For the structure, they are barely useful: writing is about having an understanding so clear that the meaning survives being reduced to words, so that others may grasp it. The LLM won't help much with that, as you say yourself.

kergonath|10 months ago

> I'd love to hear any counterpoints from folks who have used LLMs lately to get academic or creative writing done

They’re great at proofreading. They’re also good at writing conclusions and abstracts for articles, which is basically synthesising the results of the article and making it sexy (a task most scientists are hopelessly terrible at). With caveats:

- all the information needs to be in the prompt, or they will hallucinate;

- the result is not good enough to submit without some re-writing, but more than enough to get started and iterate instead of staring at a blank screen.

I want to use them to write methods sections, because that is basically the exact same information repeated in every article, but the actual sentences need to be different each time. But so far I don’t trust them to be accurate with technical details. They’re language models, they have no knowledge or understanding.

ziotom78|10 months ago

Point two is critical. I have found that the best way for me is to avoid using copy-and-paste. Instead, I put the browser in the right corner of the screen and my text editor on the left, then transcribe the text word by word by typing it on the keyboard. In this way, my natural laziness is less likely to accept words, expressions, and sentences that are perhaps okay-ish but not 100% to my taste.

riknos314|10 months ago

100% agree.

LLMs may seem like magic but they aren't. They operate within the confines of the context they're given. The more abstract the context, the more abstract the results.

I expect to need to give a model at least as much context as a decent intern would require.

Often asking the model "what information could I provide to help you produce better code" and then providing said information leads to vastly improved responses. Claude 3.7 sonnet in Cline is fairly decent at asking for this itself in plan mode.

More and more I find that context engineering is the most important aspect of prompt engineering.

jes5199|10 months ago

I use an LLM to brainstorm for a creative writing project. Mostly I ignore its suggestions! but, somehow having the chatter helps me see what I am trying to say

altilunium|10 months ago

Sometimes, good writing is like an NP-complete problem, hard to create, but easy to verify. If you have enough skill to distinguish good output from garbage, you can produce reasonably good results.
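The generate-then-verify asymmetry described here can be sketched as a simple rejection loop: keep sampling drafts (cheap but low quality on average) and apply your own cheap quality check until one passes. The `draft` and `good_enough` functions below are stand-ins for an LLM call and a human editor's judgment, not any real API:

```python
import random

def draft(rng):
    # Stand-in for an LLM call: emits a candidate of random quality.
    return {"text": "candidate draft", "quality": rng.random()}

def good_enough(candidate, threshold=0.5):
    # Stand-in for the verifier: cheap to run compared to
    # writing a good draft from scratch yourself.
    return candidate["quality"] >= threshold

def generate_until_good(seed=0, max_tries=100):
    """Rejection-sample drafts until one passes the quality check."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        candidate = draft(rng)
        if good_enough(candidate):
            return candidate
    return None
```

The whole scheme only works if the verifier is genuinely discriminating, which is the skill the comment is pointing at: without it, the loop just returns the first piece of plausible-sounding garbage.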

hk__2|10 months ago

> Sometimes, good writing is like an NP-complete problem, hard to create, but easy to verify.

Doesn’t this match pretty much all human creation? It’s easier to judge a book than to write it, it’s easier to watch a rocket going up into space than to build it, it’s easier to appreciate some Renaissance painting or sculpture than to actually make it.

echelon|10 months ago

> I'd love to hear any counterpoints from folks who have used LLMs lately to get academic or creative writing done

I commented in another thread. We're using image and video diffusion models for creative:

https://www.youtube.com/watch?v=H4NFXGMuwpY

Still not a fan of LLMs.

buu700|10 months ago

I think the author has a fair take on the types of LLM output he has experience with, but may be overgeneralizing his conclusion. As shown by his example, he seems to be narrowly focusing on the use case of giving the AI some small snippet of text and asking it to stretch that into something less information-dense — like the stereotypical "write a response to this email that says X", and sending that output instead of just directly saying X.

I personally tend not to use AI this way. When it comes to writing, that's actually the exact inverse of how I most often use AI, which is to throw a ton of information at it in a large prompt, and/or use a preexisting chat with substantial relevant context, possibly have it perform some relevant searches and/or calculations, and then iterate on that over successive prompts before landing on a version that's close enough to what I want for me to touch up by hand. Of course the end result is clearly shaped by my original thoughts, with the writing being a mix of my own words and a reasonable approximation of what I might have written by hand anyway given more time allocated to the task, and not clearly identifiable as AI-assisted. When working with AI this way, asking to "read the prompt" instead of my final output is obviously a little ridiculous; you might as well also ask to read my browser history, some sort of transcript of my mental stream of consciousness, and whatever notes I might have scribbled down at any point.

palata|10 months ago

> the exact inverse of how I most often use AI, which is to throw a ton of information at it in a large prompt

It sounds to me that you don't make the effort to absorb the information. You cherry-pick stuff that pops in your head or that you find online, throw that into an LLM and let it convince you that it created something sound.

To me it confirms what the article says: it's not worth reading what you produce this way. I am not interested in that eloquent text that your LLM produced (and that you modify just enough to feel good saying it's your work); it won't bring me anything I couldn't get by quickly thinking about it or quickly making a web search. I don't need to talk to you, you are not interesting.

But if you spend the time to actually absorb that information, realise that you need to read even more, actually make your own opinion and get to a point where we could have an actual discussion about that topic, then I'm interested. An LLM will not get you there, and getting there is not done in 2 minutes. That's precisely why it is interesting.

satisfice|10 months ago

If you present your AI-powered work to me, and I suspect you employed AI to do any of the heavy lifting, I will automatically discount any role you claim to have had in that work.

Fairly or unfairly, people (including you) will inexorably come to see anything done with AI as ONLY done with AI, and automatically assume that anyone could have done it.

In such a world, someone could write the next Harry Potter and it will be lost in a sea of one million mediocre works that are roughly similar. Hidden in plain sight forever. There would be no point in reading it, because it is probably the same slop I could get by writing a one-paragraph prompt. It would be too expensive to discover otherwise.

nojs|10 months ago

> it's that the thought that went into making the prompt is, inevitably, more interesting/original/human than the output the LLM generates afterwards

I think you are overestimating the people who submit this slop. It’s more like “here’s my assignment, what’s the answer”