
Pairing with Claude Code to rebuild my startup's website

178 points | nadis | 5 months ago | blog.nseldeib.com

136 comments

[+] friggeri|5 months ago|reply
Looking at the prompts OP has shared, I'd recommend more aggressively managing/trimming the context. In general, don't give the agent a new task without /clearing the context first. This lets the agent focus on the new task and reduces its bias (e.g., when reviewing changes it made previously).

The overall approach I now have for medium-sized tasks is roughly:

- Ask the agent to research a particular area of the codebase that is relevant to the task at hand, listing all relevant/important files, functions, and putting all of this in a "research.md" markdown file.

- Clear the context window

- Ask the agent to put together a project plan, informed by the previously generated markdown file. Store that project plan in a new "project.md" markdown file. Depending on complexity I'll generally do multiple revs of this.

- Clear the context window

- Ask the agent to create a step by step implementation plan, leveraging the previously generated research & project files, put that in a plan.md file.

- Clear the context window

- While there are unfinished steps in plan.md:

-- While the current step needs more work

--- Ask the agent to work on the current step

--- Clear the context window

--- Ask the agent to review the changes

--- Clear the context window

-- Ask the agent to update the plan with their changes and make a commit

-- Clear the context window

I also recommend having specialized sub-agents for each of those phases (research, architecture, planning, implementation, review). Not so much to tell the agent what to do, but as a way to add guardrails and structure to how they synthesize/serialize back to markdown.
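A minimal sketch of that loop, assuming Claude Code's non-interactive print mode (`claude -p`), which starts a fresh context on every invocation, so each call doubles as a /clear. The prompts and the `[ ]` checkbox convention in plan.md are my own illustration, not the parent's exact setup:

```shell
cd "$(mktemp -d)"                       # scratch dir so the sketch is safe to run anywhere

# Stub so this runs without the real CLI installed; delete it to use the
# actual `claude` binary.
claude() { echo "[claude] $*"; }

# One fresh process (and therefore a fresh context) per phase:
claude -p "Research the code relevant to the task; write findings to research.md"
claude -p "From research.md, write a project plan to project.md"
claude -p "From research.md and project.md, write a step-by-step plan.md with [ ] checkboxes"

# While plan.md still has unchecked steps:
while grep -q '\[ \]' plan.md 2>/dev/null; do
  claude -p "Work on the next unchecked step in plan.md"
  claude -p "Review the changes made for that step"
  claude -p "Check the step off in plan.md and commit"
done
```

Because each `claude -p` invocation is a separate process, the research/plan/implement phases only share state through the markdown files, which is exactly the guardrail the parent describes.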

[+] devingould|5 months ago|reply
I pretty much never clear my context window unless I'm switching to entirely different work; it seems to work fine with Copilot summarizing the convo every once in a while. I'm probably at 95% of my code written by an LLM.

I actually think it works better that way; the agent doesn't have to spend as much time rereading code it has already read. I do have several "agents" like you mention, but I just use them one by one in the same chat so they share context. They all write to markdown in case I want to start fresh if things go in the wrong direction, but that doesn't happen very often.

[+] lukaslalinsky|5 months ago|reply
An even better approach, in my experience, is to ask CC to do research, then plan the work, then let it implement step 1, then double-escape, move back to the plan, tell it that step 1 is done, and continue with step 2.
[+] righthand|5 months ago|reply
You just convinced me that LLMs are a pay-to-play management sim.
[+] R0m41nJosh|5 months ago|reply
I have been reluctant to use AI as a coding assistant, though I have installed Claude Code and bought a bunch of credits. When I see comments like this, I genuinely ask what the point is. Are you sure that going through all of these manipulations instead of directly editing the source code makes you more productive? In what way?

Not trolling, true question.

[+] giancarlostoro|5 months ago|reply
> Looking at the prompts OP has shared, I'd recommend more aggressively managing/trimming the context. In general, don't give the agent a new task without /clearing the context first. This lets the agent focus on the new task and reduces its bias (e.g., when reviewing changes it made previously).

My workflow for any IDE, including Visual Studio 2022 w/ Copilot, JetBrains AI, and now Zed w/ Claude Code baked in, is to start a new convo altogether when I'm doing something different or changing up my initial instructions. It works way better. People are used to keeping a window open until the model loses its mind on apps like ChatGPT, but for code, the context window gets packed a lot sooner (remember, the tools are sending some code over too), so you need to start over, or it starts getting confused much sooner.

[+] nadis|5 months ago|reply
OP here, this is great advice. Thanks for sharing. Clearing context more often between tasks is something I've started to do more recently, although definitely still a WIP to remember to do so. I haven't had a lot of success with the .md files leading to better results yet, but have only experimented with them occasionally. Could be a prompting issue though, and I like the structure you suggested. Looking forward to trying!

I didn't mention it in the blog post but actually experimented a bit with using Claude Code to create specialized agents such as an expert-in-Figma-and-frontend "Design Engineer", but in general found the results worse than just using Claude Code as-is. This also could be a prompting issue though and it was my first attempt at creating my own agents, so likely a lot of room to learn and improve.

[+] enraged_camel|5 months ago|reply
This is overkill. I know because I'm on the opposite end of the spectrum: each of my chat sessions goes on for days. The main reason I start over is because Cursor slows down and starts to stutter after a while, which gets annoying.
[+] antihero|5 months ago|reply
This sounds like more effort than just writing the code.
[+] jimbo808|5 months ago|reply
This seems like a bit of overkill for most tasks, from my experience.
[+] jngiam1|5 months ago|reply
I also ask the agent to keep track of what we're working on in another .md file, which it saves/loads between clears.
[+] raducu|5 months ago|reply
In 2025, does it make any difference to tell the LLM "you're an expert/experienced engineer"?
[+] 1oooqooq|5 months ago|reply
And it will completely ignore the instructions, because user input cannot affect it, but it will waste even more context space fooling you into thinking it did.
[+] shafyy|5 months ago|reply
Dude, why not just do it yourself if you have to micromanage the LLM this hardcore?
[+] opto|5 months ago|reply
> I wasted several hours on occasions where Claude would make changes to completely unrelated parts of the application instead of addressing my actual request.

Every time I read about people using AI I come away with one question. What if they spent hours with a pen and paper and brainstormed about their idea, and then turned it into an actual plan, and then did the plan? At the very least you wouldn't waste hours of your life and instead enjoy using your own powers of thought.

[+] kerpal|5 months ago|reply
The product/website itself is interesting to me as a founder who believes heavily in implementing simulations to rigorously test complex systems. However, I noticed lots of screenshots and less substance about how it actually works. If your ICP is technical, the frontend and marketing shouldn't be overdone, IMO.

I need substance and clear explanations of models, methodology, and concepts, with some visual support. Screenshots of the product are great, but a quick reel or two showing different examples or scenarios may be better.

I'm also skeptical many people who are already technical and already using AI tools will now want to use YOUR tool to conduct simulation based testing instead of creating their own. The deeper and more complex the simulation, the less likely your tool can adapt to specific business models and their core logic.

This is part of the irony of AI and YC startups: LOTS of people creating these interesting pieces of software with AI, when part of the huge moat that AI provides is being able to more quickly create your own software. As it evolves, the SaaS model may face serious trouble except for the most valuable (e.g. complex and/or highly scalable) solutions that already offer good value.

However simulations ARE important and they can take a ton of time to develop or get right, so I would agree this could be an interesting market if people give it a chance and it's well designed to support different stacks and business logic scenarios.

[+] smjburton|5 months ago|reply
> Since our landing page is isolated from core product code, the risk was minimal. That said, I was constantly sanity-checking what Claude was changing. If I ever “vibed” too hard and lost focus, Claude would sometimes change the wrong files.

> Still, I wouldn’t trust Claude, or any AI agent, to touch production code without close human oversight.

My experience has been similar, and it's why I prefer to keep LLMs separate from my code base. It may take longer than providing direct access, but I find it leads to fewer hidden/obscure bugs that can take hours (and a lot of frustration) to fix.

[+] nadis|5 months ago|reply
> My experience has been similar, and it's why I prefer to keep LLMs separate from my code base.

I'm curious how you're managing this - is it primarily by inputting code snippets or abstract context into something like Claude or ChatGPT?

I found for myself that I usually was bad at providing sufficient context when trying to work with the LLM separately from the codebase, but also might lack the technical background or appropriate workflow.

[+] qsort|5 months ago|reply
"Proceed with caution" seems to be the overwhelming consensus, at least with models having this level of capability. I commend the author for having the humility to recognize the limit of their capability, something we developers too often lack.
[+] nadis|5 months ago|reply
OP here. I really appreciate this comment, thank you. I am more and more aware of my limitations, and am working on prioritizing learning, both about how to build with AI agents but also how to build software more generally, in parallel to projects like this one.
[+] ojosilva|5 months ago|reply
Claude Code has been an awesome experience for me too... without ever subscribing to an Anthropic account!

I've never liked the free-tier Claude (Sonnet/Opus) chat sessions I've attempted with code snippets. Claude's non-coding chat sessions were good, but I didn't detect anything magical about the model or the code it churned out that would make me spring for a Claude Max plan. Nor did Cursor (I'm also a customer), with its partial use of Claude, seem that great. Maybe the magic is mostly in CC the agent...

So, I've been using a modified CC [1] with a modified claude-code-router [2] (on my own server), which exposes an Anthropic endpoint, and a Cerebras Coder account with qwen-3-coder-480b. No doubt the Claude models + CC are well tuned for each other, but I think the folks on the Qwen team trained (distilled?) a coding model that is Sonnet-inspired, so maybe that's the reason. I don't know. But the sheer 5x-10x inference speed of Cerebras makes up for any loss in quality versus Sonnet or from the FP8 quantization of Qwen on the Cerebras side. If starting from zero every few agentic steps is the strategy to use, doing that with Cerebras is just incredible because it's ~instantaneous.

I've tried my Cerebras Coder account with way too many coding agents, and for now CC, Cline (VS Code), and Qwen Code (a Gemini Code fork) are the ones that work best. CC beats the pack, as it compresses the context just right and recovers well from Cerebras 429 errors (TPM limit), caused by the speed (hitting ~1500 tps typically) clashing with Cerebras's unreasonably tight request limits. When a 429 comes through, CC just holds its breath for a few seconds and then goes at it again. Great experience overall!

[1] I've decompiled CC and modified some constants for Cerebras to fix some hiccups

[2] Had to remove some invalid request JSON keys sent by CC through CCR, and added others that were missing

[+] nadis|5 months ago|reply
This is super impressive to me!

> for now CC, Cline (VS Code) and Qwen Code (a Gemini Code fork) are the ones that work best

Thanks for sharing how you set this up, as well as which agents you've found work best.

I tried a handful before settling on CC (for now!) but there are so many new ones popping up and existing ones seem to be rapidly changing. I also had a good experience with Cline in VS Code, but not quite as good as CC.

Haven't tried Qwen Code yet (I tried the Gemini CLI but had issues with the usability; the content would frequently strobe while processing, which was a headache to look at).

[+] yojo|5 months ago|reply
Tell Claude to fix the scroll-blocker on the codeyam.com landing page.

This seems to be a bad practice LLMs have internalized; there should be some indication that there’s more content below the fold. Either a little bit of the next section peeking up, or a little down arrow control.

I vibe coded a marketing website and hit the same issue.

[+] nadis|5 months ago|reply
OP again - do you mind sharing what browser you're using? I'm looking into this now and am seeing a scroll bar on Chrome and Safari currently, so am wondering if it's a browser-specific bug or something else. Would love to figure out a fix and appreciate any additional info you can share on what you're seeing.
[+] ziml77|5 months ago|reply
Oh yeah that's really bad to not have any hints that there's more info. I suspect many people are going to hit that page and think there's nothing useful to see. (It's also partly the fault of browsers for not showing scroll bars, but that's the default so you need to design for it)
[+] nadis|5 months ago|reply
OP here, thanks for sharing the feedback. I'll investigate and good to know! I think this actually might be human error (my interpretation of the designs) rather than Claude's fault FWIW.
[+] yde_java|5 months ago|reply
Talking about coding websites: I'm a seasoned dev who loves to be in control of the website code, but hates debugging nitty-gritty layout issues, which steal tons of time and attention. I want to move fast building great landing websites for my tech products, with the same speed at which I code the products themselves. What stacks and LLM tools (if any) do you recommend that help write great-looking websites with great SEO support... fast?
[+] nadis|5 months ago|reply
I'm far from an expert, but I think depending on whether you have website designs or not already, you could use the Figma Dev Mode MCP Server + Claude Code as I did.

I've heard increasingly good things about Cursor and Codex, but haven't tried them as recently. Cline (as a VS Code extension) might also be helpful here.

If you need designs, something like v0 could work well. There are a ton of alternatives (Base44, Figma Make, etc.) but I've found v0 works the best personally, although it probably takes a bit of trial and error.

For SEO support specifically, I might just try asking some of the existing AI tooling to help you optimize there, although I'm not sure how good the results would be. I briefly experimented with this and early results seemed promising, but I didn't push on it a lot.

[+] ta12653421|5 months ago|reply
Just pay the 20 USD for Claude to begin with, then after 4-6 weeks check if you are happy.
[+] EdwardDiego|5 months ago|reply
> It was silly yet satisfying when Claude (the PR reviewer) agreed the change looked good and was ready to merge.

> "Approved - Ship It!” and ‘Great work on this!”

This pat on the head from an algorithm gives me the creeps, and I'm really struggling to put my finger on why.

Maybe it's because it's emulated approval, yet generating real feelings of pleasure in the author?

[+] ffsm8|5 months ago|reply
It's especially hilarious because ime, when it's drifting into this mode of constantly saying "production ready, ready to merge, ready to ship", the code it's producing is usually in a complete downward spiral - likely because it's no longer able to reason about the effect the "one little change" will have on the rest of the application.

I've come to just terminate the session if a phrase like that turns up.

Tbf to Nadia, however, her comment supposedly came from the "code reviewer" agent? So the prompt might've explicitly asked it to make this statement, and it would (hopefully) not be reusing the context of the development (and vice versa).

[+] idiotsecant|5 months ago|reply
It should open a little door in the side of your computer and drop a little piece of kibble to you.

"Good boy! Good PR!"

[+] sureglymop|5 months ago|reply
Design-wise, the website seems interesting and good. However, the "how it works" box instantly screamed AI to me because of the emoji list (which, at least to me, instinctively evokes a negative reaction).

However, I do applaud you being transparent about the AI use by posting it here.

[+] simonw|5 months ago|reply
"Initially, I did all my work locally. Meaning if anything had happened to my laptop, all my work would have been lost."

I run Dropbox on my laptop almost entirely as insurance against my laptop breaking or getting stolen before I've committed and pushed my work to git.

[+] bilater|5 months ago|reply
The way I work with Claude Code is to gradually stage partial changes that I am happy with, so it's easy to discard unstaged changes. It's a hack to mimic the keep-all/undo flow in Cursor Agent. Hopefully they can just add an easier way of reverting in the future.
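That checkpoint flow is plain git; here's a sketch (the throwaway repo, inline user config, and file names are just so the example is self-contained):

```shell
cd "$(mktemp -d)" && git init -q            # throwaway repo so this runs anywhere
git -c user.email=a@b -c user.name=demo commit --allow-empty -qm init

printf 'good line\n' > app.txt
git add app.txt                             # stage the change you're happy with
printf 'bad line\n' >> app.txt              # a later AI edit you don't want yet (unstaged)
git restore app.txt                         # discard unstaged edits; the staged version survives
git -c user.email=a@b -c user.name=demo commit -qm "checkpoint"
```

Interactively, `git add -p` lets you stage hunk by hunk, and `git restore .` discards everything unstaged at once, which is roughly Cursor's "discard" on the agent's latest edits.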
[+] nadis|5 months ago|reply
This is a helpful tip, thank you! I've started staging partial changes recently, but haven't gotten into a great workflow there and sometimes forget about them. Agreed on hopefully adding a better way to revert in the future!

It's been a bit since I tried Cursor and I may need to revisit that as well.

[+] b_e_n_t_o_n|5 months ago|reply
Neat. Although I get the feeling you're more technical than you give yourself credit for. I gotta try the Figma MCP server and see if it can generate HTML and styles, as that's the most boilerplaty part of front end.
[+] indigodaddy|5 months ago|reply
Nice looking page! One note, all the images seem to be right-justified (on my Android Chrome). That could be a good thing vs centered, but just thought I'd note it.
[+] 1oooqooq|5 months ago|reply
I don't even want to see which "website" it generated if it was creating canvases on every component.

Also loved how, in CTO mode, it went right away to "approve with minor comments" in the code review. This is too perfect in character.

[+] nadis|5 months ago|reply
There were definitely some silly Claude responses. The CTO-mode responses I shared with our (very human, very amazing) actual CTO, who I think found them funny too.
[+] ademup|5 months ago|reply
For anyone else on the fence about moving to CLI: I'm really glad I did.

I am converting a WordPress site to a much leaner custom one, including the functionality of all plugins and migrating all the data. I've put in about 20 hours or so and I'd be shocked if I have another 20 hours to go. What I have so far looks and operates better than the original (according to the owner). It's much faster and has more features.

The original site took more than 10 people to build, and many months to get up and running. I will have it up single-handedly inside of 1 month, and it will have much faster load times and many more features. The site makes enough money to fully support 2 families in the USA very well.

My Stack: Old school LAMP. PHPstorm locally. No frameworks. Vanilla JS.

Original process: webchat-based since Sonnet 3.5 came out; I used Gemini a lot after 2.5 Pro came out, but primarily Sonnet.

- Use Claude projects for "features". Give it only the files strictly required to do the specific thing I'm working on.
- Have it read the files closely, "think hard", and make a plan
- Then write the code
- MINOR iteration if needed. Sometimes bounce it off of Gemini first.
- The trick was to "know when to stop" using the LLM and just get to coding.
- Copy code into PHPStorm and edit/commit as needed
- Repeat for every feature (refresh the Claude project each time).

Evolution: Finally take the CLI plunge with Claude Code.

- Spin up a KVM: I'm not taking any chances.
- Run PHPStorm + CC in the KVM as a "contract developer"
- The "KVM developer" cannot push to main
- Set up claude.md carefully
- Carefully prompt it with structure, bounds, and instructions

- Run into lots of quirks with lots of little "fixes":
-- Too verbose
-- Does not respect "my coding style"
-- Poor adherence to claude.md instructions when over halfway through the context, etc.
- Start looking into subagents. It feels like it's not really working?
- Instead: I break my site into logical "features":
-- Terminal Tab 1: "You may only work in X folder"
-- Terminal Tab 2: "You may only work in Y folder"
-- THIS WORKS WELL. I am finally in "HOLY MOLY, I am now unquestionably more productive" territory!
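One way to get that kind of per-folder isolation (a sketch of the general technique, not necessarily what the parent did) is git worktrees: each terminal tab gets its own checkout on its own branch, so parallel CC sessions can't clobber each other's working files. The repo layout and branch names here are hypothetical, and the throwaway repo is just so the example runs:

```shell
cd "$(mktemp -d)" && git init -q site && cd site   # throwaway repo for the demo
git -c user.email=a@b -c user.name=demo commit --allow-empty -qm init
git branch feature-x && git branch feature-y

git worktree add -q ../site-x feature-x   # Tab 1: cd ../site-x, run CC: "only touch X folder"
git worktree add -q ../site-y feature-y   # Tab 2: cd ../site-y, run CC: "only touch Y folder"
git worktree list                         # the main checkout plus the two feature trees
```

Each worktree is a full checkout sharing one object store, so commits from both tabs land in the same repo and can be merged back to main afterwards.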

Codex model comes out:

- I open another tab and try it
- I use it until I hit the "You've reached your limit. Wait 3 hours" message.
- I go back to Claude (man, is this SLOW! And verbose!). Minor irritation.
- Go back to Codex until I hit my weekly limit
- Go back to Claude again. "Oh wow, Codex works SO MUCH BETTER for me."
- I actually haven't fussed with the AGENTS.md, nor do I give it a bunch of extra hand-holding. It just works really well by itself.
- Buy the OpenAI Pro plan and haven't looked back.

I haven't "coded" much since switching to Codex and couldn't be happier. I just say "Do this" and it does it. Then I say "Change this" and it does it. On the rare occasions it takes a wrong turn, I simply add a coding comment like "Create a new method that does X and use that instead" and we're right back on track.

We are 100% at a point where people can just "Tell the computer what you want in a web page, and it will work".

And I am SOOOO Excited to see what's next.

[+] cloudking|5 months ago|reply
I run 4 to 8 Claude Codes in parallel daily, AMA
[+] phi-go|5 months ago|reply
Ask you anything? Well here it goes...

What languages do you use?

What kind of projects?

Do you maintain these projects or is this for greenfield development?

Could you fix any bugs without Claude?

Are these projects tested, and who writes the tests? If it's Claude, how do you know those tests actually test something sensible?

Is anybody using these projects, and what do users think of them?

[+] TheTaytay|5 months ago|reply
Nice. Do you "manage" them in any fancy way other than simply having multiple terminal windows open?
[+] nadis|5 months ago|reply
That's awesome, wow! Do you do anything to try to optimize / monitor token usage?
[+] jbs789|5 months ago|reply
Why?
[+] ccvannorman|5 months ago|reply
As someone who walked into a 20k+ LOC React/Next project, 95%+ vibecoded, I can say it's a relative nightmare to untangle the snarl of AI-generated solutions. In particular, it is bad at separation of concerns and tends to commingle the data. I found several places with inline awaits for database objects, then DB manipulations being done inline too, and I found them in the UX layer, the API layer, and even nested inside other DB repo files!

Someone once quipped that AI is like a college kid who has taken a few programming courses, has access to all of Stack Overflow, lives in a world where hours go by in the blink of an eye, has an IQ of 80, and is utterly incapable of learning.

[+] Zagreus2142|5 months ago|reply
I'm sorry but this article is marketing. From the 3rd paragraph from the end:

> Since our landing page is isolated from core product code, the risk was minimal.

The real question to ask is why your landing page is so complex. It is a very standard landing page with sign-ups, pretty graphics, and links to the main bits of the website, not anything connected to a demo instance of your product or anything truly interactable.

Also, you claim this saved you from having to hire another engineer, but you then reference human feedback catching the LLM garbage being generated in the repo. It sounds like the appropriate credit is shared among yourself, the LLM, and especially the developer who shepherded this behind the scenes.