physix's comments
physix | 7 months ago | on: GPT-5
There were two interesting takeaways about AGI:
1. Dario remarks that the terms AGI/ASI are misleading and dangerous. They are ill-defined, and it's more useful to recognize that capabilities are simply growing exponentially at the moment. Extrapolating that, he thinks AI may just "eat the majority of the economy". I don't know how much of this is self-serving hype, and it's not clear where we will end up with all this, but it will be disruptive no matter what.
2. The Economist moderators, however, note towards the end that this industry may well tend toward commoditization. At the moment these companies produce models that people want but others can't make. But as chip making starts to hit its limits and the information space becomes completely harvested, capability growth might taper off and others will catch up, with the quasi-monopoly profit potential melting away.
Putting that together, I think that although cognitive capabilities will most likely continue to accelerate, albeit not necessarily along the lines of AGI, the economics of all this will probably not lead to winner-takes-all.
[1] https://www.economist.com/podcasts/2025/07/31/artificial-int...
physix | 7 months ago | on: Solving the compute crisis with physics-based ASICs
How far away are we from seeing a real application of this in the AI space?
(It reminded me of an ex-colleague's company in Germany that makes analog computers https://anabrid.com.)
physix | 7 months ago | on: Ask HN: What do you dislike about ChatGPT and what needs improving?
Probably will get worse over time as it ingests all its AI-generated material for the next version. Soon everything will be comprehensive.
physix | 7 months ago | on: Launch HN: Gecko Security (YC F24) – AI That Finds Vulnerabilities in Code
Why is it asking for write access to my profile?
physix | 7 months ago | on: Study mode
It appears to me like a form of decoherence, and it's very hard to predict when things break down.
People tend to know when they are guessing. LLMs don't.
physix | 7 months ago | on: Study mode
Anyway, this makes me wonder whether LLMs can be appropriately prompted to indicate whether the information given is speculative, inferred, or factual, i.e. whether they have the means to gauge the validity/reliability of their response and filter it accordingly.
I've seen prompts that instruct the LLM to make this transparent via annotations to their response, and of course they comply, but I strongly suspect that's just another form of hallucination.
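For what it's worth, the annotation schemes I've seen prompted for look roughly like the sketch below. The tag names, prompt wording, and sample reply are all made up for illustration; note that nothing here verifies whether the model's self-assessment is accurate, which is exactly my suspicion.

```python
import re

# Hypothetical tagging scheme: ask the model to prefix each claim with
# [factual], [inferred], or [speculative]. The prompt wording and tags
# are invented for this sketch, not any vendor's API.
SYSTEM_PROMPT = (
    "For every claim in your answer, prefix it with exactly one tag: "
    "[factual] for information grounded in your training data, "
    "[inferred] for conclusions you derived, "
    "[speculative] for guesses."
)

# A tag, then everything up to the next tag (or end of text).
TAG_RE = re.compile(r"\[(factual|inferred|speculative)\]\s*([^\[]+)")

def split_claims(response: str) -> list[tuple[str, str]]:
    """Parse a tagged model response into (tag, claim) pairs."""
    return [(m.group(1), m.group(2).strip()) for m in TAG_RE.finditer(response)]

# Invented sample reply, to show the parsing only.
reply = ("[factual] Nginx logs upstream timeouts with error code 110. "
         "[speculative] The timeout is probably caused by a slow gRPC backend.")
for tag, claim in split_claims(reply):
    print(tag, "->", claim)
```

The parsing is trivial; the open question is whether the tags mean anything, or are just another fluent-sounding output.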
physix | 7 months ago | on: How Anthropic teams use Claude Code
We captured debug logs and described the issue in detail to Gemini 2.5 Flash, giving it the nginx logs for the one second before and after an example incident, about 10k log entries.
It came back with a clear verdict, saying
"The smoking gun is here: 2025/07/24 21:39:51 [debug] 32#32: *5902095 rport:443 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.233.100.128, server: grpc-ai-test.not-relevant.org, request: POST /org.not-relevant.cloud.api.grpc.CloudEventsService/startStreaming HTTP/2.0, upstream: grpc://10.233.75.54:50051, host: grpc-ai-test.not-relevant.org"
and gave me a detailed action plan.
I was thinking this is cool, don't need to use my head on this, until I realized that the log entry simply did not exist. It was entirely made up.
(And yes I admit, I should know better than to do lousy prompting on a cheap foundation model)
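The cheap guard I'd add now: before acting on quoted "evidence", check that it actually appears in the logs you supplied. A minimal sketch (the helper name and sample log lines are mine, and the fabricated quote is abbreviated from the real one):

```python
def quoted_lines_present(model_quote: str, log_lines: list[str]) -> bool:
    """True only if the quoted evidence appears verbatim
    (modulo whitespace) in the logs that were actually captured."""
    needle = " ".join(model_quote.split())
    return any(needle in " ".join(line.split()) for line in log_lines)

# Invented stand-ins for the ~10k captured nginx entries.
logs = [
    "2025/07/24 21:39:51 [debug] 32#32: accept on 0.0.0.0:443",
    "2025/07/24 21:39:52 [debug] 32#32: http2 read handler",
]
# Abbreviated version of the model's fabricated "smoking gun".
fabricated = "2025/07/24 21:39:51 [debug] 32#32: *5902095 upstream timed out (110)"

print(quoted_lines_present(fabricated, logs))  # the quote is nowhere in the logs
```

A verbatim substring check is crude (timestamps and whitespace can legitimately differ), but it would have caught this particular fabrication immediately.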
physix | 7 months ago | on: How and where will agents ship software?
Once built, the solution is plain old runnable code (PORC :-), as long as the business logic implemented doesn't call back out to an LLM. So I don't fret so much about the AI hype story here.
For anyone starting off building with new tech, an AI assistant is really helpful.
physix | 7 months ago | on: Cognition (Devin AI) to Acquire Windsurf
https://medium.com/@villispeaks/the-blitzhire-acquisition-e3...
which I first saw here
physix | 7 months ago | on: Understanding Tool Calling in LLMs – Step-by-Step with REST and Spring AI
But all in all, it's a great set of frameworks in the enterprise Java/Kotlin space. I'd say it's that synergy which makes it worthwhile.
I'm curious, though. Is the use of dependency injection part of the portfolio of criticisms towards Spring?
physix | 7 months ago | on: The upcoming GPT-3 moment for RL
From my understanding, RL is a tuning approach on LLMs, so the outcome is still the same kind of beast, albeit with a different parameter set.
Empirically, then, I'd have thought the leading companies would already be strongly focused on improving coding capabilities, since this is where LLMs are very effective, and where they have huge cash flows from token consumption.
So either the motivation isn't there, or they're already doing something like that, or they know it's not as effective as the approaches they already have.
I wonder which one it is.
physix | 7 months ago | on: The upcoming GPT-3 moment for RL
My personal experience the past five months has been very mixed. If I "let 'er rip" it's mostly junk I need to refactor or redo by micro-managing the AI. At the moment, at least for what I do, AI is like a fantastic calculator that speeds up your work, but where you still should be pushing the buttons.
physix | 7 months ago | on: Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model
The poaching was probably more aimed at hamstringing Meta's competition.
The disruption caused by them leaving in droves is probably more severe than the benefit of having them on board. Unless they are gods, of course.