I've been experimenting with basic (free, COTS) AIs to generate requirements and then code based on those requirements, typically in Python or PHP.

The results have been dismal. Fine-tuning prompts is a good exercise for requirements gathering, complete with the tedium that we all love. Sadly, none of the AIs that I used could reliably remember the current requirements. ChatGPT4 does a pretty good job until it randomly forgets everything. Both it and Copilot failed to remember simple instructions like "Please don't send the updated requirements until I ask for them." I assume that's so that I can waste tokens?

For code creation, both AIs consistently left out features that were in the requirements and in the pseudocode that they provided and that I approved.

I did enjoy the change of pace of prompt engineering. It's fun while it's happening and while the AI behaves. But it gets very old very quickly saying things like:

Me: We just added <some requirement>, but I don't see it in the most recent version of the requirements.

It: Sorry Dave, you're right. Here's the latest requirements:

Hallucinations abound: not just adding things at random, but also forgetting things, sometimes with a great deal of resistance to putting the forgotten thing back.
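One way to catch this kind of drift without rereading every reply is to keep the agreed requirements locally and diff each AI-supplied version against them. What follows is only a minimal sketch, not something I actually used: it assumes every requirement carries an ID of the form REQ-<number> (a convention invented here), and the function names and sample requirements are hypothetical.

```python
import re

# Assumption: each requirement is tagged with an ID such as "REQ-2".
REQ_ID = re.compile(r"\bREQ-\d+\b")


def missing_requirements(canonical, ai_text):
    """Return the IDs in `canonical` that do not appear in `ai_text`."""
    found = set(REQ_ID.findall(ai_text))
    return [req_id for req_id in canonical if req_id not in found]


def unexpected_requirements(canonical, ai_text):
    """Return IDs that appear in `ai_text` but were never agreed on."""
    found = set(REQ_ID.findall(ai_text))
    return sorted(found - set(canonical))


# Hypothetical agreed requirements, kept outside the chat session.
canonical = {
    "REQ-1": "Read input from a CSV file",
    "REQ-2": "Reject rows with a missing date",
    "REQ-3": "Write a summary report",
}

# Hypothetical "latest requirements" pasted back from the AI.
ai_text = """
REQ-1: Read input from a CSV file
REQ-3: Write a summary report
REQ-4: Email the report to the user
"""

print("Forgotten:", missing_requirements(canonical, ai_text))
print("Added at random:", unexpected_requirements(canonical, ai_text))
```

This only checks that the IDs survive, not that the wording behind them is unchanged, so it flags the forgetting and the random additions but not subtle rewrites.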

Also, once you do get to code, don't expect to fine-tune it. If you see a bug or oversight in AI-provided code and point it out, it won't correct the code. Instead, it appears to regenerate it from scratch. Will there be new, unasked-for features? Could be. What about the correction? Is it fixed? Maybe. And maybe in a completely new way.

I have found some success using Copilot for basic research. In essence, it's a smarter search engine, so prompts like this can yield useful leads:

I've got some data that has <a description of some feature in the data>.

Questions:

What is the correct term for <some feature>?

What Python library would you suggest to help me analyze <the feature>?
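If you reuse that prompt shape often, it can be kept as a small template. This is just a sketch of the prompt above as a function; the name research_prompt and the sample arguments are made up for illustration.

```python
def research_prompt(data_description, feature):
    """Fill in the research prompt shown above with a data description and a feature."""
    return (
        f"I've got some data that has {data_description}.\n\n"
        "Questions:\n\n"
        f"What is the correct term for {feature}?\n\n"
        f"What Python library would you suggest to help me analyze {feature}?"
    )


# Hypothetical example values.
prompt = research_prompt(
    "values that repeat every seven rows",
    "the repeating pattern",
)
print(prompt)
```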

It's quite possible that my prompt-fu is simply weak and ignorant. And with free-ish COTS AIs, I'm getting what I pay for. But I would say that as far as using an AI as an all-purpose junior goes, we're not there yet.
