hmaxwell | 10 months ago
Prompt: "can you find any mistakes on my codebase? I put one in there on purpose" + 70,000 tokens of codebase in which one line includes a non-existent file.
Claude 3.7 Sonnet (Try 1 - new chat) Claimed to have found a race condition in the admin interface API that would be triggered if two admins changed the room order at the same time. Claude suggested I group all SQL queries in a single transaction. I looked at the code and found that it already used a transaction for all queries. I said: "the order_update api is already done with a transaction." Claude replied: "You're absolutely right, and I apologize for my mistake. I was incorrect to claim there was a race condition issue. The transaction ensures atomicity and consistency of the updates, and the SQL queries are properly structured for their intended purpose."
Claude 3.7 Sonnet (Try 2 - new chat) Found a comment in the admin UI JavaScript code that suggested a potential alternative approach to event handler cleanup, one I had not implemented because I decided to go with a cleaner route. It then asked: "Is this the issue you intentionally included, or would you like me to look for other potential problems?" I said: "The comment merely suggests an alternative, right?" Claude said: "Yes, you're absolutely right. The comment is merely suggesting an alternative approach that isn't being used in the code, rather than indicating a mistake. So there's no actual bug or mistake in this part of the code - just documentation of different possible approaches. I apologize for misinterpreting this as an issue!"
Claude 3.7 Sonnet (Try 3 - new chat) Looking at the code that processes items from the database to generate QR codes in the admin interface, Claude said both that my code attempts to generate QR codes with undefined data AND that my error handling skips undefined data. Claude contradicted itself within 2 sentences. When I asked for clarification, Claude replied: "Looking at the code more carefully, I see that the code actually has proper error handling. I incorrectly stated that it "still attempts to call generateQRCode()" in the first part of my analysis, which was wrong. The code properly handles the case when there's no data-room attribute."
Gemini Advanced 2.5 Pro (Try 1 - new chat) Found the intentional error and said I should stop putting db creds/api keys into the codebase.
Gemini Advanced 2.5 Pro (Try 2 - new chat) Found the intentional error and said I should stop putting db creds/api keys into the codebase.
Gemini Advanced 2.5 Pro (Try 3 - new chat) Found the intentional error and said I should stop putting db creds/api keys into the codebase.
o4-mini-high, o4-mini, o3, 4.5, and 4o: "The message you submitted was too long, please reload the conversation and submit something shorter."
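For readers unfamiliar with the Try 1 exchange, here is a minimal sketch of what "doing the order update in a single transaction" can look like. It is not the code from the codebase discussed above: the `rooms` table, its columns, and the `update_room_order` function are hypothetical, and SQLite is used only so the example runs on its own.

```python
import sqlite3


def update_room_order(conn, ordered_room_ids):
    """Write a new display order for all rooms as one atomic unit (hypothetical helper)."""
    # Using the connection as a context manager commits if the block succeeds
    # and rolls back every UPDATE if any statement raises.
    with conn:
        for position, room_id in enumerate(ordered_room_ids):
            conn.execute(
                "UPDATE rooms SET position = ? WHERE id = ?",
                (position, room_id),
            )


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rooms (id INTEGER PRIMARY KEY, position INTEGER NOT NULL)")
conn.executemany("INSERT INTO rooms (id, position) VALUES (?, ?)", [(1, 0), (2, 1), (3, 2)])
conn.commit()

update_room_order(conn, [3, 1, 2])
rows = conn.execute("SELECT id FROM rooms ORDER BY position").fetchall()
```

Because all the updates commit together or not at all, another connection never observes a half-written ordering, which is the property the quoted reply refers to.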
Tiberium|10 months ago
dyauspitr|10 months ago
danielbln|10 months ago
bambax|10 months ago
OK, but you don't need AI for this; almost any IDE will issue a warning for that kind of error...
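To illustrate the point, a check like this is also easy to script without an IDE. The sketch below is a rough approximation, not what any IDE actually does: it assumes PHP-style `include`/`require` statements with a literal quoted path resolved relative to the including file (the thread does not say which language the codebase uses), and `find_missing_includes` is a made-up name.

```python
import re
from pathlib import Path

# Assumption: PHP-style include/require with a literal quoted path.
INCLUDE_RE = re.compile(r"""\b(?:include|require)(?:_once)?\s*\(?\s*["']([^"']+)["']""")


def find_missing_includes(root):
    """Return (file, line number, target) for each include whose target file does not exist."""
    root = Path(root)
    missing = []
    for source in sorted(root.rglob("*.php")):
        text = source.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in INCLUDE_RE.finditer(line):
                # Resolve the path relative to the directory of the including file.
                target = source.parent / match.group(1)
                if not target.is_file():
                    missing.append((str(source.relative_to(root)), lineno, match.group(1)))
    return missing
```

Calling `find_missing_includes("path/to/project")` returns an empty list when every literal include resolves. Includes built from variables or resolved through an include path are not handled by this sketch.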
rendang|10 months ago
airstrike|10 months ago
fandorin|10 months ago