while1's comments

while1 | 1 month ago | on: The future belongs to those who can refute AI, not just generate with AI

We're building AI testing tools at QA.tech and this matches my experience. Great post. The hard part was never generating code. It's figuring out if what came out is actually correct. Our team runs multiple AI agents in parallel writing code and honestly we spend way more time on verification than generation at this point. The ratio keeps getting worse as the models get better at producing plausible-looking stuff.

The codebase growth numbers feel right to me. Even conservative 2x productivity gains break most review processes. We ended up having to build our own internal review bot that checks the AI output because human review just doesn't keep up. But it has to be narrow and specific, not another general model doing vibes-based review.

while1 | 5 months ago | on: Ask HN: What's your experience with using graph databases for agentic use-cases?

We are using Neo4j to power our agents at QA.tech.

Essentially, to make the agent behave more like a human, so that it learns and builds up an understanding of the pages it should test, we map interactions into a knowledge graph stored in Neo4j. The graph consists of Pages and the Actions on each page, as well as links to documentation sections and other relevant info, together with descriptions, metadata and embeddings for search.

To make the agents better at planning and at understanding the context of a page, they can search the graph for relevant information and then expand through the graph for more context.
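
A minimal sketch of the search-then-expand idea described above, using a plain in-memory graph rather than Neo4j. This is not QA.tech's implementation: the `Node` and `KnowledgeGraph` classes, the node kinds, and the two-dimensional toy embeddings are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
import math


@dataclass
class Node:
    """A graph node with a description and an embedding (hypothetical shape)."""
    node_id: str
    kind: str  # e.g. "Page", "Action", "Doc" (illustrative labels)
    description: str
    embedding: list[float]
    neighbors: set[str] = field(default_factory=set)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class KnowledgeGraph:
    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def link(self, a: str, b: str) -> None:
        # Undirected edge, to keep the sketch simple.
        self.nodes[a].neighbors.add(b)
        self.nodes[b].neighbors.add(a)

    def search(self, query: list[float], k: int = 1) -> list[Node]:
        """Return the k nodes whose embeddings are most similar to the query."""
        ranked = sorted(
            self.nodes.values(),
            key=lambda n: cosine(query, n.embedding),
            reverse=True,
        )
        return ranked[:k]

    def expand(self, start: str, hops: int = 1) -> set[str]:
        """Collect the ids of all nodes within `hops` edges of `start`."""
        seen = {start}
        frontier = {start}
        for _ in range(hops):
            frontier = {
                nb for nid in frontier for nb in self.nodes[nid].neighbors
            } - seen
            seen |= frontier
        return seen


graph = KnowledgeGraph()
graph.add_node(Node("login_page", "Page", "Login form", [1.0, 0.0]))
graph.add_node(Node("submit_login", "Action", "Click the submit button", [0.9, 0.1]))
graph.add_node(Node("auth_docs", "Doc", "Docs on authentication", [0.0, 1.0]))
graph.link("login_page", "submit_login")
graph.link("submit_login", "auth_docs")

best = graph.search([1.0, 0.0], k=1)[0]        # embedding search finds an entry point
context = graph.expand(best.node_id, hops=2)   # graph expansion pulls in related nodes
```

In a real setup the embeddings would come from an embedding model and both steps would be queries against the database; the point here is only the two-step shape: similarity search to find a starting node, then traversal to gather surrounding context.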

This works remarkably well. I think our agents (once they have interacted a bit with the page) are some of the best browser agents I have tested.

I would highly recommend this, but you need to put some effort into a nice ontology for the graph and into getting the tooling right for your use case. It's really not just plug and play. :)

while1 | 8 months ago | on: Show HN: Opper AI – Task-Completion API for LLMs

Really cool tool! Great job!

while1 | 1 year ago | on: Show HN: An AI that reliably builds full-stack apps by preventing LLM mistakes

This is such a great tool for quickly building something to show other people an idea you have. So cool!

while1 | 1 year ago | on: Using Multimodal LLMs to Understand UI Elements on Websites

Loving this! Very surprising that the LLMs of today are so bad at understanding interfaces but it also makes it a very interesting case for finetuning!

while1 | 8 years ago | on: LaTeX Math in MS Office

Neither does markdown. Just put it in a git repo.

while1 | 12 years ago | on: King James Programming

The global environment is chosen here, because this is the will of God.

while1 | 12 years ago | on: Show HN: Make your app handle going offline

Cool! This looks really nice and definitely useful!

while1 | 12 years ago | on: Visualise the structure of a spreadsheet

This looks awesome! I quite often stumble upon the task of automating or porting spreadsheets into code. This is always a pain as it might in some ways be hard to visualize the flow in a chart. This tool would greatly simplify the task. Looks really sweet!

while1 | 13 years ago | on: Gmail.com was down

Same for me as well.

while1 | 13 years ago | on: Codecademy now has Python lessons

Nice stuff. I've tried to get a few friends to try Python and this is a great way to get them started.

A bit sad it is only Python 2 though. Some of the stuff taught is not compatible with Python 3.
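
The comment doesn't say which lessons are affected, so as a general illustration only, here are a few well-known differences that trip up people moving from Python 2 material to Python 3. The snippet runs on Python 3; the Python 2 behaviour is described in the comments.

```python
# print is a function in Python 3; `print "hello"` (Python 2 statement
# syntax) is a SyntaxError here.
print("hello")

# `/` is true division in Python 3. In Python 2, 7 / 2 on ints gave 3.
true_div = 7 / 2
floor_div = 7 // 2  # floor division works the same way in both versions

# range() returns a lazy range object in Python 3, not a list as in Python 2.
numbers = range(3)
as_list = list(numbers)
```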