avoutic | 27 days ago

WardGate also tackles "delete all meetings"-style attacks, at least if you choose to. In my setup, I allow reading the calendar and creating events, but updating or deleting events requires my approval.

You would configure it like this:

  endpoints:
    calendar:
      preset: google-calendar
      auth:
        credential_env: WARDGATE_CRED_GOOGLE_CALENDAR
      capabilities:
        read_data: allow
        create_events: allow
        update_events: ask
        delete_events: ask

So updating or deleting events requires human permission.
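
To make the allow/ask distinction concrete, here is a minimal Python sketch of how such a policy could be evaluated. It is not WardGate's actual implementation: the `CAPABILITIES` dict, the `decide` function, the `approve` callback, and the deny-by-default behavior for unlisted capabilities are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical mirror of the capabilities block above (not WardGate's data model).
CAPABILITIES = {
    "read_data": "allow",
    "create_events": "allow",
    "update_events": "ask",
    "delete_events": "ask",
}


def decide(capability: str, approve: Callable[[str], bool]) -> bool:
    """Return True if a request using `capability` may proceed.

    `approve` stands in for the human approval step; it receives the
    capability name and returns True if the human says yes.
    """
    # Assumption: anything not listed in the config is denied.
    policy = CAPABILITIES.get(capability, "deny")
    if policy == "allow":
        return True
    if policy == "ask":
        return approve(capability)
    return False


if __name__ == "__main__":
    # Simulate a human who rejects every approval request.
    always_reject = lambda capability: False
    print(decide("read_data", always_reject))      # allowed without asking
    print(decide("delete_events", always_reject))  # blocked, approval was refused
```

The point of the sketch is that an agent tricked into "delete all meetings" still hits the `ask` branch, so nothing is deleted unless a human confirms.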

Time controls and rate limiting are already included.

Also on the development list is an LLM adapter that could detect prompt injection, as well as handle identity masking and credential-triggered approvals. Anomaly detection is on the todo list too.

The threat model is agents leaking data, whether through gullibility, prompt injection, or dumb actions. The goal is to either detect that early or prevent it outright.
