top | item 46408215

superze | 2 months ago

As an Opus user, I genuinely don’t understand how someone can work for weeks or months without regularly opening an IDE. The output almost always fails.

I repeatedly rewrite prompts, restate the same constraints, and write detailed acceptance criteria, yet still end up with broken or non-functional code. It's very frustrating, to say the least. Yesterday alone I spent about $200 on generations that now require significant manual rewrites just to make them work.

At that point, the gains are questionable. My biggest success is having the model take over the first design in my app, and I take it from there. But the hundreds, if not thousands, of lines of code it generates are so messy that refactoring them afterwards is insanely painful.

throwatdem12311|2 months ago

I have a hell of a time just getting any LLM to write SQL queries that have things like window functions, aggregates and lateral left joins - even when shoving the entire database schema DDL into the context.

It's so frustrating, it regularly makes me want to just quit the profession. Which is why I still just write most code by hand.
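For readers unfamiliar with the kind of query being discussed, here is a minimal sketch of a windowed aggregate. The `orders` table and its columns are hypothetical, and SQLite is used only so the example is self-contained (it needs SQLite 3.25 or newer for window functions). SQLite has no lateral joins, so that part of the complaint is not illustrated here.

```python
import sqlite3

# In-memory database with a tiny, made-up orders table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, placed_on TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('alice', '2024-01-01', 10.0),
        ('alice', '2024-01-05', 30.0),
        ('bob',   '2024-01-02', 20.0);
""")

# Two window functions over the same partition:
# a running total per customer, and a rank by recency.
QUERY = """
    SELECT customer,
           placed_on,
           amount,
           SUM(amount) OVER (
               PARTITION BY customer ORDER BY placed_on
           ) AS running_total,
           ROW_NUMBER() OVER (
               PARTITION BY customer ORDER BY placed_on DESC
           ) AS recency_rank
    FROM orders
    ORDER BY customer, placed_on
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each output row carries the original columns plus the per-customer running total and a rank where 1 is that customer's most recent order.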

data-ottawa|2 months ago

I write a lot of SQL and I haven't had these issues for months, even with smaller models. Opus can one shot most of my queries faster than I could type them.

Instead of stuffing the context with DDL I suggest:

1. Reorganize your data warehouse. It needs to be easy to find the correct data. Make sure you use clear ELT layers, meaningful schemas, and per-model documentation. This is a ton of work, but if done right the payoff is massive.

2. I built a tool for myself to pull our warehouse into a graph for fuzzy search+dependency chain analysis. In the spring I made an MCP server for it and Claude uses that tool incredibly well for almost all queries. I haven't actually used the GUI or scripts since I built the MCP.
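As a rough illustration of the idea in point 2, fuzzy search plus dependency-chain lookup over a warehouse graph might look something like the sketch below. This is not the commenter's tool: the model names, the `DEPENDS_ON` mapping, and both helper functions are hypothetical, and a real version would load lineage from the warehouse's metadata and expose these functions through an MCP server.

```python
from difflib import get_close_matches

# Hypothetical lineage: each model maps to the models it reads from.
DEPENDS_ON = {
    "raw.orders": [],
    "raw.customers": [],
    "staging.orders": ["raw.orders"],
    "staging.customers": ["raw.customers"],
    "marts.customer_revenue": ["staging.orders", "staging.customers"],
}


def fuzzy_find(query, limit=3):
    """Return model names that approximately match the query, best match first."""
    return get_close_matches(query, list(DEPENDS_ON), n=limit, cutoff=0.4)


def upstream(model):
    """Return every model that `model` depends on, directly or transitively."""
    seen = []
    stack = list(DEPENDS_ON[model])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(DEPENDS_ON[current])
    return sorted(seen)


best = fuzzy_find("customer_revenue")[0]
print(best, "<-", upstream(best))
```

The point of the design is that the model asks for a table by an approximate name and gets back only the relevant slice of lineage, rather than receiving the whole schema in its context.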

Claude and Devstral are the best models I've used for SQL. I cannot get Gemini to write decent modern SQL -- even the Gemini data science/engineer agents in Google Cloud. I occasionally try the paid models through the API and still haven't been impressed.

deadbabe|2 months ago

If you really know SQL, writing an SQL query basically just feels like writing a prompt for a database client anyway, except it does exactly what you ask for.

SkyPuncher|2 months ago

My trick is to explicitly role-play that we’re doing a spike. This gets all of the models to ignore all of the details they normally get hung up on. Once I have the basics in place, I can tell it to fix details.

It’s _always_ easier to add more code than it is to fix broken code.

nowittyusername|2 months ago

Most people have not fully grasped how LLMs work or how to properly use agentic coding tools. That is why vibe coders end up with low-quality code. It is not a limitation of the technology but of the user (at this stage). Think of it this way: everyone is the grandma who has been handed a Palm Pilot to get things done. Grandma needs an iPhone, not a Palm Pilot, but we are not in that territory yet. Now consider the people who were able to use the Palm Pilot very successfully: they were few and they were the exception, but they existed. Same here. I have been using coding agents for over 7 months now and have written zero lines of code; in fact, I don't know how to code at all. But I have been able to architect very complex software projects from scratch: text to speech, automated LLM benchmarking systems for testing all possible llama.cpp sampling parameters, and more, and now I'm building my own agentic framework from scratch. All of this and more is possible without writing one line of code yourself, but it does require understanding how to use the technology well.

mirsadm|2 months ago

If you don't know how to code, then you are not able to accurately judge what you're producing.

krior|2 months ago

All of the applications you mention could be scoped as beginner projects. I don't think they represent good proofs of capability.

shepherdjerred|2 months ago

I hardly ever open an IDE anymore.

I use Claude Code and Cursor. What I do:

- use statically typed languages: TypeScript, Go, Rust, Python w/ types

- Set up linters. For TS I have a bunch of custom lint rules (authored by AI) for common feedback that I've given. (https://github.com/shepherdjerred/monorepo/tree/main/package...)

- For Cursor, lots of feedback on my desired style. https://github.com/shepherdjerred/scout-for-lol/tree/main/.c...

- Heavy usage of plan mode. Tell AI something like "make at least 20 searches to online documentation", support every claim with a reference, etc. Tell AI "make a task for every little thing you'll implement"

- Have the AI write tests, particularly the more expensive ones like integration and end-to-end, so you have an easy way to verify functionality.

- Set up Claude Code GHA to automatically review PRs. Give the review feedback to the agent that implemented it, either by copy-pasting or by telling the agent "fetch review comments and fix them".
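To make the "custom lint rules for common feedback" idea concrete, here is a minimal sketch of one such rule. The commenter's rules are for TypeScript and are not shown in the thread. This is a Python analogue built on the standard `ast` module, and the rule itself (flag functions with missing type annotations) and the name `find_untyped_functions` are assumptions chosen for illustration.

```python
import ast


def find_untyped_functions(source):
    """Return (line, name) for each function missing a parameter or return annotation.

    Simplification: `self` and `cls` are treated like any other parameter,
    and *args / **kwargs are not checked.
    """
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
            missing_param = any(p.annotation is None for p in params)
            if missing_param or node.returns is None:
                problems.append((node.lineno, node.name))
    return sorted(problems)


SAMPLE = '''
def typed(x: int) -> int:
    return x + 1

def untyped(x):
    return x + 1
'''

for line, name in find_untyped_functions(SAMPLE):
    print(f"line {line}: function '{name}' is missing type annotations")
```

A check like this turns a piece of repeated review feedback into something the agent can run and fix against on its own, which is the role the linters play in the workflow above.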

Some examples of what I've made:

- Many features for https://scout-for-lol.com/, a League of Legends bot for Discord

- A program to generate TypeScript types for Helm charts (https://github.com/shepherdjerred/homelab/tree/main/src/helm...)

- A program to summarize all of the dependency updates for my Homelab (https://github.com/shepherdjerred/homelab/tree/main/src/deps...)

- A program to manage multiple instances of CLI agents like Claude Code (https://github.com/shepherdjerred/monorepo/tree/main/package...)

- A Discord AI bot in the style of my friends (https://github.com/shepherdjerred/monorepo/tree/main/package...)

moffkalast|2 months ago

> make at least 20 searches to online documentation

Lol sometimes I have to spend two turns convincing Claude to use its goddamn search and look up the damn doc instead of trying to shoot from the hip for the fifth time. ChatGPT at least has forced search mode.

throw2312321|2 months ago

Thanks for sharing. So the dumb question - do you feel like Claude Code & Cursor have made you significantly more productive? You have an impressive list of personal projects, and I can see how a power user of AI tools can be very effective with green field projects. Does the productivity boost translate as well to your day job?

BhavdeepSethi|2 months ago

Cursor is an IDE.

miguel_martin|2 months ago

This is what an AGENTS.md (https://agents.md/) or CLAUDE.md file is for. Put common constraints there to correct model mistakes/issues with respect to the codebase, e.g. in a “code style” section.

tmaly|2 months ago

What does your software creation workflow look like? Do you have a design phase?

falcor84|2 months ago

Why would you spend $200 a day on Opus if you can pay that for a month via the highest tier Claude Max subscription? Are you using the API in some special way?

jefffoster|2 months ago

At a guess, an Enterprise API account. Pay per token, but no limits.

It’s very easy to spend $100s per dev per day.

christophilus|2 months ago

I’ve had decent results from it. What programming language are you using?

cloudflare728|2 months ago

Sometimes I have a similar file or related files. I copy their names and tell it to use them as a reference. Code quality improves tenfold if you do so. Even providing an example from the framework's getting-started guide works great for a new project.

Yeah, the pain of cleaning up a small mess is great too. I had some failing tests and type errors, and I thought I would fix them later using only AI prompts. As the codebase grew, the number of TypeScript errors grew too. At some point there were 5000+ type errors and countless failing unit tests, and it kept getting worse. I tried to fix it with AI, since fixing it the old way was no longer possible. In the end I discarded the whole project when it was around 500k lines of code.

pca006132|2 months ago

Question: How many LoC do you let the AI write for each iteration? And do you review that? It sounds like you are letting it run off leash.