top | item 46654417

petekoomen | 1 month ago

I'm seeing a lot of negativity in the comments. Here's why I think this is actually a Good Idea. Many command line tools rely on something like this for installation:

  $ curl -fsSL https://bun.com/install | bash

This install script is hundreds of lines long and difficult for a human to audit. You can ask a coding agent to do that for you, but you still need to trust that the authors haven't hidden some nefarious instructions for an LLM in the middle of it.

On the other hand, an equivalent install.md file might read something like this:

Install bun for me.

Detect my OS and CPU architecture, then download the appropriate bun binary zip from GitHub releases (oven-sh/bun). Use the baseline build if my CPU doesn't support AVX2. For Linux, use the musl build if I'm on Alpine. If I'm on an Intel Mac running under Rosetta, get the ARM version instead.

Extract the zip to ~/.bun/bin, make the binary executable, and clean up the temp files.

Update my shell config (.zshrc, .bashrc, .bash_profile, or fish's config.fish, depending on my shell) to export BUN_INSTALL=~/.bun and add the bin directory to my PATH. Use the correct syntax for my shell.

Try to install shell completions. Tell me what to run to reload my shell config.
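To make concrete what the prose above asks an agent to work out, here is a minimal Python sketch of two of its decisions: which build to fetch and which lines to add to the shell config. The function names and the target naming scheme (for example `linux-x64-musl-baseline`) are assumptions for illustration, not taken from bun's actual installer or release assets.

```python
from typing import List


def pick_target(os_name: str, machine: str, *, has_avx2: bool = True,
                is_musl: bool = False, rosetta: bool = False) -> str:
    """Map platform facts to a release target name (naming scheme is assumed)."""
    os_key = {"darwin": "darwin", "linux": "linux"}.get(os_name.lower())
    if os_key is None:
        raise ValueError(f"unsupported OS: {os_name}")
    arch = {"x86_64": "x64", "amd64": "x64",
            "arm64": "aarch64", "aarch64": "aarch64"}.get(machine.lower())
    if arch is None:
        raise ValueError(f"unsupported CPU architecture: {machine}")

    # An x64 process running under Rosetta is really on Apple Silicon,
    # so the prose asks for the ARM build instead.
    if os_key == "darwin" and arch == "x64" and rosetta:
        arch = "aarch64"

    target = f"{os_key}-{arch}"
    if os_key == "linux" and is_musl:       # e.g. Alpine
        target += "-musl"
    if arch == "x64" and not has_avx2:      # CPUs without AVX2
        target += "-baseline"
    return target


def shell_config_lines(shell: str) -> List[str]:
    """Return the lines to append to the config file for the given shell."""
    if shell == "fish":
        return ['set -gx BUN_INSTALL "$HOME/.bun"',
                'set -gx PATH "$BUN_INSTALL/bin" $PATH']
    if shell in ("bash", "zsh"):
        return ['export BUN_INSTALL="$HOME/.bun"',
                'export PATH="$BUN_INSTALL/bin:$PATH"']
    raise ValueError(f"unsupported shell: {shell}")
```

On a real machine the first two arguments could come from `platform.system()` and `platform.machine()`. Detecting AVX2, musl, and Rosetta is left out here; those checks are exactly the parts the prose leaves to the agent's judgment.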

The install.md is much shorter and written in English, and as a user I can tell at a glance what the author is trying to do. In contrast with install.sh, install.md makes it easy for the user to audit the intentions of the programmer.

The obvious rebuttal to this is that if you don't trust the programmer, you shouldn't be installing their software in the first place. That is, of course, true, but I think it misses the point: coding agents can act as a sort of runtime for prose, and for the user the loss in determinism and efficiency that this implies is more than made up for by the gain in transparency.

cuu508|1 month ago

IMO it's completely the other way around.

Shell scripts can be audited. The average user may not do it due to laziness and/or ignorance, but it is perfectly doable.

On the other hand, how do you make sure your LLM, a non-deterministic black box, will not misinterpret the instructions in some freak accident?

nobodywillobsrv|1 month ago

How about the best of both worlds?

Instead of asking the agent to execute it for you, you ask the agent to write an install.sh based on the install.md?

Then you can audit whatever you want before deciding whether to run it.
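A rough Python sketch of that workflow, assuming the agent has already produced the script text: nothing runs until a reviewer callback approves it. The name `review_then_run` and the `approve` callback are hypothetical, and the agent call itself is deliberately left out.

```python
import os
import subprocess
import tempfile
from typing import Callable, Optional, Sequence


def review_then_run(script: str, approve: Callable[[str], bool],
                    interpreter: Sequence[str] = ("sh",)) -> Optional[int]:
    """Run `script` only if `approve(script)` returns True.

    Returns the script's exit code, or None if the reviewer declined.
    """
    if not approve(script):
        return None  # declined: nothing is written to disk or executed

    # Write the approved text to a temp file so that exactly what was
    # reviewed is what gets executed.
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
        f.write(script)
        path = f.name
    try:
        return subprocess.run([*interpreter, path], check=False).returncode
    finally:
        os.unlink(path)
```

In interactive use, `approve` could print the script and ask for a yes/no answer; the point is only that the generated install.sh becomes a fixed artifact that is audited before it runs.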

jedwhite|1 month ago

Thanks for posting the original ideas that led to all this. "Runtime for prose" is the new "literate programming" - early days but a pointer to some pretty cool future things, I think.

It's already made a bunch of tasks that used to be time-consuming to automate much easier for me. I'm still learning where it does and doesn't work well. But it's early days.

You can tell something is a genuinely interesting new idea when someone posts about it on X and then:

1. There are multiple launches on HN based on the idea within a week, including this one.

2. It inspires a lot of discussion on X, here and elsewhere - including many polarized and negative takes.

Hats off for starting a (small but pretty interesting) movement.

smaudet|1 month ago

> This install script is hundreds of lines long

Any script can be shortened by hiding commands in other commands.

LLMs have parameter counts in the billions.

Line count, as usual, is an incredibly poor metric to go by here.

petekoomen|1 month ago

My point is not that LLMs are inherently trustworthy. It is that a prompt can make the intentions of the programmer clear in a way that is difficult to do with code because code is hard to read, especially in large volumes.

jen20|1 month ago

This seems like an incredibly long-winded, risky and inefficient way to install bun.

I've never actually (knowingly) run Bun before, but decided to give it a try - below is my terminal session to get it running (on macOS):

    $ nix-shell -p bun
    
    [nix-shell:~]$ bun
    Bun is a fast JavaScript runtime, package manager, bundler, and test
    runner. (1.3.5+1e86cebd7)
    
    Usage: bun <command> [...flags] [...args]
    
    Commands:
      run       ./my-script.ts       Execute a file with Bun
                lint                 Run a package.json script
    ... (rest of output trimmed)...

(Edited to wrap a long preformatted line)

catlifeonmars|1 month ago

This seems less auditable though, because now there is more variability in the way something is installed. Now there are two layers to audit:

- What the agent is told to do in prose

- How the agent interprets those instructions with the particular weights/contexts/temperature at the moment.
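The second layer can be illustrated with a toy Python sketch of temperature sampling: the same scores give the same choice every time at temperature 0 and varying choices above it. This is a simplified model for illustration only, not how any particular agent is implemented, and `sample_with_temperature` is a hypothetical name.

```python
import math
import random
from typing import List


def sample_with_temperature(logits: List[float], temperature: float,
                            rng: random.Random) -> int:
    """Pick an index from `logits` using softmax sampling at `temperature`."""
    if temperature <= 0:
        # Greedy decoding: always take the highest-scoring option.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

If each index stands for a different way of carrying out the same instruction, a nonzero temperature means two runs of the same install.md can take different paths.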

I’m all for the prose idea, but wouldn’t want to trade determinism for it. Shell scripts can be statically analyzed. And also reviewed. Wouldn’t a better interaction be to use an LLM to audit the shell script, then hash the content?
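The "hash the content" step could look something like this minimal Python sketch: once a script has been audited, record its SHA-256 digest and refuse to run any download that doesn't match. The helper names are hypothetical; only `hashlib` and `hmac` from the standard library are used.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned(script: bytes, pinned_hex: str) -> bool:
    """True only if `script` is byte-for-byte the content that was audited."""
    return hmac.compare_digest(sha256_hex(script), pinned_hex.lower())
```

The audit (by a human or an LLM) happens once, against a fixed artifact; every later install only needs the cheap, deterministic digest check.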

petekoomen|1 month ago

Yes, this approach (substituting a markdown prompt for a shell script) introduces an interesting trade-off between "do I trust the programmer?" and "do I trust the LLM?" I wouldn't be surprised to see prompt-sharing become the norm as LLMs get better at following instructions and people get more comfortable using them.

PunchyHamster|1 month ago

You assume two things: that the instructions will be followed correctly, and that the way they are followed won't change when the agent changes.

Neither of those things is actually true.

People who had their home directory deleted by an AI agent did not ask the AI to delete it.

blast|1 month ago

Why the specific application to install scripts? Doesn't your argument apply to software in general?

(I have my own answer to this but I'd like to hear yours first!)

petekoomen|1 month ago

It does, and possibly this launch is a little window into the future!

Install scripts are a simple example that current generation LLMs are more than capable of executing correctly with a reasonably descriptive prompt.

More generally, though, there's something fascinating about the idea that the way you describe a program can _be_ the program. Tbh I haven't fully wrapped my head around it, but it's not crazy to think that in time more and more software will be exchanged by passing prompts around rather than compiled code.

patmorgan23|1 month ago

How is asking an LLM to make some random install script up better than a script designed by the application developer?

The install.sh is auditable. Yes, you need to know bash to be able to audit it, but the same is true for an LLM's output: it could hallucinate random commands that delete files or overwrite other applications/configs.

Szpadel|1 month ago

Imagine a support ticket like this:

I used MiniMax M2 (for context, it's very unreliable) for installation and it didn't work and my documents folder is missing, help

How do you even debug this? Imagine some path or behaviour has changed in a new OS release and the model thinks it knows better. If anything goes wrong, who is responsible?

chme|1 month ago

Maybe that is a reason for this approach. It shifts responsibility for errors from the person writing the code to the one executing it.

Pretty brilliant in a way.