Lately my company has been doing a lot of complex accounting and reporting in spreadsheets. Overall, I was surprised by how well both GPT and Claude handled some of these extremely tedious tasks. It's not uncommon to have an hours-long task compressed to minutes.
My anecdotal experience is GPT 5.2 Pro is decently ahead of Claude Opus 4.5 in this category when it gets to the tricky stuff, both in presentation and accuracy. The long reasoning seems to help a lot. But, apparently the benchmarks do not agree.
Based on the article... is this basically just making Claude better at formatting and data presentation, or does it also get better at analysis? I get the impression it's the former.
The benchmarks look good. Slide decks and spreadsheets look better. The people must use Claude Cowork and have their Claude Code moment and figure out the consequences. It will be really interesting to see articles like this (https://mitchellh.com/writing/my-ai-adoption-journey) written by people who actually care about accuracy in places like KPMG to get their perspective on things.
I remember overhearing some normal people on the bus talking about essentially orchestrating some agent scraper to pull and summarise news from 40 different sites that one of them had identified as important, which put him quite ahead of his peers. These were non-technical people orchestrating an agent workflow to make them better at work.
Though there’s not much that tickles my software brain here. But the agents are coming for us all.
And then you hand it to your boss, who takes a 20-second look at it and asks why you made a projection that assumes massive revenue growth and 3 years of perfectly flat utilities, insurance, and G&A, with no inflation, etc.
It does look really promising as a skeleton starting point though. Like generate it, delete numbers and populate by hand.
Not unlike the boilerplate start we saw in AI coding a couple of years back.
> The side-by-side outputs below show how output quality has improved from Claude Opus 4.5 to Opus 4.6.
Disclaimer: I use AI to code (and I code for finance) and I love Anthropic.
But: for f-ck's sake, I cannot click on the picture and have it show up in full. It stays at its tiny size, impossible to read the numbers. I had to right-click and "open in a new tab".
AI is, somehow, definitely still not fully there yet.
You'll use a ton of AI but it won't wipe the humans out. In the end you'll have a compositional change, likely nothing catastrophic imo. In part because there is a buck to stop and Claude ain't got no hands...
Anthropic will do anything to keep the Claude hype going: fearmongering ("AI bad, need government regulations"); wishful thinking ("90% of code will be written by AI by the end of 2025", per Dario); using Claude in applications it has no business being in (Cowork, accessing all your files, what could go wrong?); releasing "research" papers every now and then to show how their AI "almost got out" and they stopped it (again, to show their models are "just that good"); prescribing what society should do to adapt to the new reality; and doing worthless surveys on "how AI is reshaping the economy, but mostly our AI, not others".
Now this is going to be interesting to watch to see if the finance bros financing this AI wave to get rid of SW engineers will keep financing getting rid of their own.
typpo|25 days ago
Edit - noticed OpenAI specifically focuses on finance use cases in their gpt-5.3-codex blog as well https://openai.com/index/introducing-gpt-5-3-codex/
someuser54541|25 days ago
unknown|25 days ago
[deleted]
belter|25 days ago
sdf2erf|25 days ago
[deleted]
bovermyer|25 days ago
4corners4sides|23 days ago
Havoc|25 days ago
onlypassingthru|25 days ago
This is called 'Month End Close' in accountant speak.
TacticalCoder|25 days ago
warabe|25 days ago
Ntrails|25 days ago
Bombthecat|24 days ago
Since demand is so insanely high, we will just get into an equilibrium.
Not like developers, where like 25 percent of people want to do something with IT...
sdf2erf|25 days ago
eggsby|25 days ago
behnamoh|25 days ago
cadamsdotcom|23 days ago
henning|25 days ago
layer8|25 days ago
storus|24 days ago