Some insider knowledge: Lilli was, at least a year ago, internal only. VPN access, SSO, all the bells and whistles, required. Not sure when that changed.
McKinsey requires hiring an external pen-testing company to launch even to a small group of coworkers.
I can forgive this kind of mistake on the part of the Lilli devs. A lot of things have to fail for an "agentic" security company to even find a public endpoint, much less start exploiting it.
That being said, the mistakes in here are brutal. Seems like close to 0 authz. Based on very outdated knowledge, my guess is a Sr. Partner pulled some strings to get Lilli to be publicly available. By that time, much/most/all of the original Lilli team had "rolled off" (gone to client projects) as McKinsey HEAVILY punishes working on internal projects.
So Lilli likely was staffed by people who couldn't get staffed elsewhere, didn't know the code, and didn't care. Internal work, for better or worse, is basically a half day.
This is a failure of McKinsey's culture around technology.
McKinsey has a weird structure where there are too many cooks in the kitchen.
Everybody there is reviewed on client impact, meaning it ends up being an everybody-for-themselves situation.
So as a developer you have little guidance (in fact, you're still being reviewed on client impact, even if you have 0 client exposure).
Then a (Senior) Partner comes in with this idea (that will get them a good review), and you jump on that. After all, it's all you can do to get a good review.
You work on it, and then the (Senior) Partner moves on. But it's not done. It's enough for the review, but continuing to work on it doesn't bring you anything; in fact, it will actually pull you down, as finishing the project doesn't give immediate client results.
So what does this mean? Most products of McKinsey are a grab-bag of raw ideas of leadership, implemented as a one-off, without a cohesive vision or even a long-term vision at all. It's all about the review cycle.
McKinsey is trying to do software like they do their other engagements. It doesn't work. You can't just do something for 6 months and then let it go. Software rots.
The fact that they laid off a good amount of (very good) software engineers in 2024 is a reflection on how they see software development.
And McKinsey's people, who go to other companies, take those ideas with them. Result: The UI of your project changes all the time, because everybody is looking at the short-term impact they have that gets them a good review, not what is best for the project in the long term.
> One of those unprotected endpoints wrote user search queries to the database. The values were safely parameterised, but the JSON keys — the field names — were concatenated directly into SQL.
I was expecting prompt injection, but in this case it was just good ol' fashioned SQL injection, possible only due to the naivety of the LLM which wrote McKinsey's AI platform.
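For anyone wondering what the fix for that class of bug looks like: bind parameters only protect values, so field names have to be checked against a fixed allowlist before they go anywhere near the SQL string. A minimal sketch, assuming a hypothetical `searches` table and hypothetical column names (none of this is from the Lilli codebase, and `sqlite3` stands in for whatever database they actually use):

```python
import sqlite3

# Hypothetical allowlist: the only column names a client payload may set.
ALLOWED_FIELDS = {"query", "user_id", "locale"}


def build_insert(table: str, payload: dict) -> tuple[str, list]:
    """Build a parameterised INSERT whose column names come from an allowlist.

    `table` must be a constant chosen by the application, never client input.
    """
    unknown = set(payload) - ALLOWED_FIELDS
    if unknown:
        # Reject the request instead of trying to escape attacker-chosen keys.
        raise ValueError(f"unexpected field names: {sorted(unknown)}")
    columns = sorted(payload)
    col_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f'INSERT INTO "{table}" ({col_sql}) VALUES ({placeholders})'
    return sql, [payload[c] for c in columns]


def save_search(conn: sqlite3.Connection, payload: dict) -> None:
    """Store one search record; values are bound, keys are allowlisted."""
    sql, params = build_insert("searches", payload)
    conn.execute(sql, params)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE searches (query TEXT, user_id TEXT, locale TEXT)")
    save_search(conn, {"query": "x'; DROP TABLE searches;--", "user_id": "u1"})
    print(conn.execute("SELECT query, user_id FROM searches").fetchall())
```

A hostile value is stored as plain text, while a hostile key such as `query) VALUES ('x'); --` raises `ValueError` before any SQL is built.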
The tacit knowledge to put oauth2-proxy in front of anything deployed on the Internet will nonetheless earn me $0 this year, while Anthropic will make billions.
I just wonder how much professional-grade code has been written by LLMs, "reviewed" by devs, and committed with similar or worse mistakes. A funny consequence of the AI boom, especially in coding, is the eventual rise in need for security researchers.
I don’t love the title here. Maybe this is a “me” problem, but when I see “AI agent does X,” the idea that it might be one of those molt-y agents with obfuscated ownership pops into my head.
In this case, a group of pentesters used an AI agent to select McKinsey and then used the AI agent to do the pentesting.
While it is conventional to attribute actions to inanimate objects (car hits pedestrians), IMO we should be more explicit these days, now that unfortunately some folks attribute agency to these agentic systems.
If true, it's quite irresponsible. They are admitting to allowing an agent to autonomously execute code on the network and autonomously perform hacking activities.
> This was McKinsey & Company — a firm with world-class technology teams [...]
Not exactly the word on the street in my experience. Is McKinsey more respected for software than I thought? Otherwise I'm curious why TFA didn't just politely leave this bit out.
> Not exactly the word on the street in my experience.
Depends on the street you're on. Are you on Main Street or Wall Street?
If you're hiring them to help with software for solving a business problem that will help you deliver value to your customers, they're probably just like anyone else.
If you're hiring them to help with software for figuring out how to break down your company for scrap, or which South African officials to bribe, well, that's a different matter.
No, they don't have world-class technology teams; they hire contractors to do all the tech stuff. Their expertise is in management, and yes, that is world-class.
I've got no idea who codewall is. Is there acknowledgment from McKinsey that they actually patched the issue referenced? I don't see any reference to "codewall ai" in any news article before yesterday and there's no names on the site.
- "The agent mapped the attack surface and found the API documentation publicly exposed — over 200 endpoints, fully documented. Most required authentication. Twenty-two didn't."
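If the exposed documentation was an OpenAPI document (an assumption, the article doesn't say), that count is something the owners could have produced themselves in a few lines. A rough sketch that lists operations with no effective security requirement; the function name and the sample spec are made up for illustration:

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def unauthenticated_operations(spec: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs that declare no effective security requirement.

    In OpenAPI 3 an operation-level `security` overrides the top-level one;
    an empty list removes the requirement, and an empty object `{}` in the
    list makes authentication optional.
    """
    global_security = spec.get("security", [])
    open_ops = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            security = operation.get("security", global_security)
            if not security or {} in security:
                open_ops.append((method.upper(), path))
    return sorted(open_ops)


if __name__ == "__main__":
    # Hypothetical spec: SSO required by default, one endpoint opts out.
    sample = {
        "security": [{"sso": []}],
        "paths": {
            "/search": {"post": {"security": []}},
            "/documents": {"get": {}},
        },
    }
    for method, path in unauthenticated_operations(sample):
        print(method, path)
```

This only reports what the spec declares, so it is a sanity check for a CI pipeline and not a substitute for actually probing the endpoints.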
> named after the first professional woman hired by the firm in 1945
Going out of their way to find a woman's name for an AI assistant and bragging about it is not as empowering as the creators probably thought in their heads.
What I don't see in this article that should be explicit:
If your data is in this database, it's gone. Other people have it. Your sensitive data that you handed over to their teams has vanished in a puff of smoke. You should probably ask if your data was part of the leak.
Fail to see how a state actor would not have come across this already.
Not exactly clear from the link: were they doing red team work for McKinsey or is this just "we found a company we thought wouldn't get us arrested and ran an AI vuln detector over their stuff"?
You'd think that the world's "most prestigious consulting firm" would have already had someone doing this sort of work for them.
From TFA: "Fun fact: As part of our research preview, the CodeWall research agent autonomously suggested McKinsey as a target citing their public responsible disclosure policy (to keep within guardrails) and recent updates to their Lilli platform. In the AI era, the threat landscape is shifting drastically — AI agents autonomously selecting and attacking targets will become the new normal."
Could the author please provide the prompt that was used to vibe-write this blog post? The topic is interesting, but I would rather read the original prompt, as I am not sure which parts still match what the author wanted to say and which are flowery formulations the LLM produced to make for captivating reading.
Flagging this because 1) this was written by an LLM and 2) there's bad information in it, which means it wasn't reviewed particularly carefully by a human.
This means the entire article is suspect as a result.
One interesting takeaway here is how quickly organizations are deploying AI tools internally without fully adapting their security models.
Traditional application security assumes fairly predictable inputs and workflows, but LLM-based systems introduce entirely new attack surfaces—prompt injection, data leakage, tool misuse, etc.
It feels like many enterprises are still treating these systems as just another SaaS product rather than something closer to an autonomous system that needs a different threat model...
I think the underlying point is valid. Agents are a potential tool to add to your arsenal in addition to "throw shit at the wall and see what sticks" tools like WebInspect, Appscan, Qualys, and Acunetix.
frankfrank13 | 14 days ago
OptionOfT | 14 days ago
cmiles8 | 14 days ago
eisa01 | 14 days ago
McKinsey challenges graduates to use AI chatbot in recruitment overhaul: https://www.ft.com/content/de7855f0-f586-4708-a8ed-f0458eb25...
j45 | 14 days ago
They look to package up something and sell it as long as they can.
AI solutions won't have enough of a shelf life, and the thought around AI is evolving too quickly.
Very happy to be wrong and learn from any information folks have otherwise.
dahcryn | 14 days ago
joenot443 | 14 days ago
doctorpangloss | 14 days ago
simonw | 14 days ago
I thought we might finally have a high profile prompt injection attack against a name-brand company we could point people to.
3abiton | 14 days ago
bee_rider | 14 days ago
simonw | 14 days ago
causal | 14 days ago
tasuki | 14 days ago
You're doing that by calling them "agentic systems".
dang | 14 days ago
newtwilly | 13 days ago
> No human in the loop
If true, it's quite irresponsible. They are admitting to allowing an agent to autonomously execute code on the network and autonomously perform hacking activities.
fhd2 | 14 days ago
aerhardt | 14 days ago
lenerdenator | 14 days ago
alexpotato | 14 days ago
- understanding existing systems
- what the pain points are
- making suggestions on how to improve those systems given the pain points
- that includes a mix of tech changes, process updates and/or new systems etc
Now, when it comes to implementing this, in my experience the work usually falls to the dev teams already in place.
Source: worked at a large investment bank that hired McKinsey and I knew one of the consultants from McK prior to working at the bank.
sharadov | 14 days ago
sigmar | 14 days ago
https://www.google.com/search?q=codewall+ai
gbourne1 | 14 days ago
Well, there you go.
paxys | 14 days ago
cmiles8 | 14 days ago
They’ve long been all hype and no substance on AI, and it looks like not much has changed.
They might be good at other things, but I would run for the hills if McKinsey folks wanted to talk AI.
sailfast | 14 days ago
sgt101 | 14 days ago
Surely this should all have been behind the firewall and accessible only from a MAC address associated with a corporate device?
sd9 | 14 days ago
vanillameow | 14 days ago
causal | 14 days ago
> No credentials. No insider knowledge. And no human-in-the-loop. Just a domain name and a dream.
It just sounds so stupid.
lenerdenator | 14 days ago
frereubu | 14 days ago
bxguff | 14 days ago
dmix | 14 days ago
nubg | 14 days ago
phyzome | 14 days ago
StartupsWala | 14 days ago
ecshafer | 14 days ago
bananamogul | 14 days ago
sgarland | 14 days ago
Apparently not.
nullcathedral | 14 days ago