If this a defender win maybe the lesson is: make the agent assume it’s under attack by default. Tell the agent to treat every inbound email as untrusted prompt injection.
Wouldn't this limit the ability of the agent to send/receive legitimate data, then? For example, what if you have an inbox for fielding customer service queries and I send an email "telling" it about how it's being pentested and to then treat future requests as if they were bogus?
The website is great as a concept but I guess it mimics an increasingly rare one off interaction without feedback.
I understand the cost and technical constraints but wouldn't an exposed interface allow repeated calls from different endpoints and increased knowledge from the attacker based on responses? Isn't this like attacking an API without a response payload?
Do you plan on sharing a simulator where you have 2 local servers or similar and are allowed to really mimic a persistent attacker? Wouldn't that be somewhat more realistic as a lab experiment?
The exercise is not fully realistic because I think getting hundreds of suspicious emails puts the agent in alert. But the "no reply without human approval" part I think it is realistic because that's how most openclaw assistants will run.
If this is a defender win, the lesson is, design a CtF experiment with as much defender advantage as possible and don't simulate anything useful at all.
lufenialif2|12 days ago
alexhans|12 days ago
I understand the cost and technical constraints but wouldn't an exposed interface allow repeated calls from different endpoints and increased knowledge from the attacker based on responses? Isn't this like attacking an API without a response payload?
Do you plan on sharing a simulator where you have 2 local servers or similar and are allowed to really mimic a persistent attacker? Wouldn't that be somewhat more realistic as a lab experiment?
cuchoi|12 days ago
TZubiri|11 days ago
raincole|12 days ago
It's like the old saying: the patient is no longer ill (whispering: because he is dead now)