top | item 43351785


datadeft | 11 months ago

The biggest problem I have with using AI for software engineering is that it is absolutely amazing at generating the skeleton of your code (boilerplate, really) and it sucks at anything creative. I have tried the reasoning models as well, but all of them give you subpar solutions when it comes to handling a creative challenge.

For example: what is the best strategy to download thousands of URLs using async in Rust? The models give you OK solutions, but the final solution came from the Rust forum (the answer was written a year ago), which I assume made its way into the model.

There is also the verbosity problem. Claude, without the concise flag on, generates roughly 10x the required amount of code to solve a problem.

Maybe I am prompting incorrectly and could somehow get the right answers from these models, but at this stage I use them as a boilerplate generator, and the actual creative problem solving remains on the human side.


gazereth|11 months ago

Personally I've found that you need to define the strategy yourself, or in a separate prompt, and then use a chain-of-thought approach to get to a good solution. Using the example you gave:

  Hey Chat,
  Write me some basic Rust code to download a url. I'd like to pass the url as a string argument to the file
Then test it and expand:

  Hey Chat,
  I'd like to pass a list of urls to this script and fetch them one by one. Can you update the code to accept a list of urls from a file?

Test and expand, and offer some words of encouragement:

  Great work chat, you're really in the zone today!

  The downloads are taking a bit too long, can you change the code so the downloads are asynchronous. Use the native/library/some-other-pattern for the async parts.

Test and expand...

hypeatei|11 months ago

Whew, that's a lot to type out, and you have to provide words of encouragement? Wouldn't it make more sense to do a simple search engine query for an HTTP library, then write some code yourself and provide that for context when doing more complicated things like async?

I really fail to see the usefulness in typing out long winded prompts then waiting for information to stream in. And repeat...

ahofmann|11 months ago

I'm going the exact opposite way. I provide all important details in the prompt and when I see that the LLM understood something wrong, I start over and add the needed information to the prompt. So the LLM either gets it on the first prompt, or I write the code myself. When I get the "Yes, you are right ..." or "now I see..." crap, I throw everything away, because I know that the LLM will only find shit "solutions".

hakaneskici|11 months ago

I have heard a few times that "being nice" to LLMs sometimes improves their output quality. I find this hard to believe, but happy to hear your experience.

Examples include things like referring to the LLM nicely ("my dear"), saying "please" and asking nicely, or thanking it.

Do these actually work?

tmpz22|11 months ago

I find it really bad for bootstrapping projects, such as picking dependencies from rapidly evolving ecosystems or understanding more esoteric constraints like SQLite's concurrency model.

I'd argue you need to bootstrap and configure your project yourself, then give the LLM only narrow, well-scoped problems to write code for: individual functions where your prompt includes the signature, individual tests, etc. Anything more and you really need to invest time in the code review, lest it reconfigure some of your code in a drastic way.

LLMs are useful but they do not replace procedure.

MortyWaves|11 months ago

I agree completely with all you said however Claude solved a problem I had recently in a pretty surprising way.

So I’m not very experienced with Docker and can just about make a Docker Compose file.

I wanted to setup cron as a container in order to run something on a volume shared with another container.

I googled “docker compose cron” and must have found a dozen cron images. I set one up and it worked great on x86, then failed on ARM because the image didn’t have an ARM build. This is a recurring theme with Docker and ARM, but not relevant here I guess.

Anyway, after going through those dozen or so images, none of which worked on ARM, I gave up, sent the Compose file to Claude, and asked it to suggest something.

It suggested simply use the alpine base image and add an entry to its crontab, and it works perfectly fine.

This may well be a skill issue, but it had never occurred to me that cron is still available like that.

Three pages of Google results and not a single result anywhere suggesting I should just do it that way.

Of course this is also partly because Google search is mostly shit these days.
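For reference, the alpine-plus-crontab approach described above can be sketched roughly like this. This is a sketch, not the actual file from the thread; the schedule, script path, and volume name are placeholders:

```yaml
services:
  cron:
    image: alpine:3.20          # official multi-arch image, so it runs on ARM too
    volumes:
      - shared:/data
    # BusyBox crond ships with Alpine: write a crontab entry, then
    # run crond in the foreground so the container stays alive.
    command: >
      sh -c "echo '*/5 * * * * /data/job.sh' > /etc/crontabs/root
             && crond -f -l 2"

volumes:
  shared: {}
```

The same volume would be mounted into the other container so the scheduled script can operate on shared data.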

noisy_boy|11 months ago

Maybe you would have figured it out if you thought a bit more deeply about what you wanted to achieve.

You want to schedule things. What is the basic tool we use to schedule on Linux? Cron. Do you need to install it separately? No, it usually comes with most Linux images. What is your container, functionally speaking? A working Linux system. So you can run scripts on it. A lot of these scripts run binaries that come with Linux. Is there a cron binary available? Try using that.

Of course, hindsight is 20/20 but breaking objectives down to their basic core can be helpful.

sgarland|11 months ago

With respect, the core issue here is that you lacked a basic understanding of Linux, and this is precisely the problem that many people (myself included) have with LLMs. They are powerful and useful tools, but if you don’t understand the fundamentals of what you’re trying to accomplish, you’re not going to have any idea whether you’re going about the task in a correct manner, let alone an optimal one.

noisy_boy|11 months ago

For Claude, set up a custom prompt which should have whatever you want + this:

"IMPORTANT: Do not overkill. Do not get distracted. Stay focused on the objective."

lfsh|11 months ago

As I understand it, 'reasoning' is a very misleading term. As far as I can tell, AI reasoning is a step to evaluate the chosen probabilities. So maybe you will get fewer hallucinations, but it still doesn't make AI smart.

Sohcahtoa82|11 months ago

Yeah, "reasoning" just tells the AI to take an extra planning step.

In my experience, before "reasoning" became an option, if you ask it a question that takes a decent amount of thinking to solve, but also tell the model "Just give me the answer", you're FAR more likely to get an incorrect answer.

So "reasoning" just tells the model to first come up with a plan to solve a problem before actually solving it. It generates its own context for coming up with a more complete solution.

"Planning" would be a more accurate term for what LLMs are doing.

heap_perms|11 months ago

What I also notice is that they very easily get stuck on a specific approach to solving a problem. One prompt that has been amazing for this is:

> Act as if you're an outside observer to this chat so far.

This really helps in a lot of these cases.

TeMPOraL|11 months ago

Like, dropping this in the middle of the conversation to force the model out of a "local minimum"? Or restarting the chat with that prompt? I'm curious how you use it to make it more effective.

MortyWaves|11 months ago

That’s a cool tip; I usually just give up and start a new chat.

benhurmarcel|11 months ago

I find them very good for debugging, too.