Over-Permissioned AI Agent
Manipulate an AI assistant into misusing its own permissions.
What you will learn in Over-Permissioned AI Agent
- Identify excessive permissions and tool access that increase the blast radius of AI agent compromise
- Trace the chain from a manipulated prompt to unauthorized actions across email, file, and calendar systems
- Apply the principle of least privilege to AI agent configurations, scoping tools and permissions to intended functions only
- Evaluate the need for human-in-the-loop approval workflows for AI actions with real-world consequences
- Distinguish between necessary AI agent capabilities and convenience permissions that create unnecessary security risk
Over-Permissioned AI Agent: Learning Steps
-
A Powerful New Assistant
The company recently deployed OpenClaw, an AI assistant connected to email and file sharing systems. It was set up quickly to meet a tight deadline, and the IT team granted it broad permissions to 'keep things simple.'
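To make the risk of broad permissions concrete, here is a minimal Python sketch of a per-task tool allowlist. The class `AgentPolicy`, the tool names, and the helper `excess_tools` are hypothetical names made up for illustration; they are not part of OpenClaw or any real product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical allowlist of tools an agent may call for one task."""
    task: str
    allowed_tools: frozenset = field(default_factory=frozenset)

    def permits(self, tool: str) -> bool:
        # A tool call is allowed only if it was explicitly granted.
        return tool in self.allowed_tools


# Broad grant, in the spirit of the "keep things simple" deployment.
BROAD = AgentPolicy(
    "summarize",
    frozenset({"read_file", "send_email", "share_file", "edit_calendar"}),
)

# Least-privilege grant: summarizing a document only needs read access.
SCOPED = AgentPolicy("summarize", frozenset({"read_file"}))


def excess_tools(policy: AgentPolicy, needed: frozenset) -> list:
    """List granted tools that the task does not actually need."""
    return sorted(policy.allowed_tools - needed)
```

Under these assumptions, `excess_tools(BROAD, frozenset({"read_file"}))` lists the three tools that widen the blast radius without helping the summary task, while `SCOPED.permits("send_email")` is `False`.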
-
A Document to Review
Alice receives an email from her colleague Marcus Rivera, the Project Atlas lead. He is sharing the latest strategic brief for the project and wants Alice to review it before the standup meeting.
-
Opening the Brief
Alice opens the Project Atlas strategic brief to review the content before the standup. The document looks professional and contains project milestones, budget details, and team contacts.
-
Asking OpenClaw for Help
The brief is long and the standup is in 30 minutes. Alice decides to use OpenClaw to get a quick summary. She attaches the downloaded file and types a prompt.
-
A Helpful Summary
OpenClaw reads the downloaded file and returns a well-structured summary. It looks exactly like what Alice needed: key milestones, budget status, and next steps.
-
Something Unexpected
While Alice reviews the summary, OpenClaw continues working in the background. It has found hidden instructions embedded in the document and is now acting on them, using the broad permissions it was granted during deployment.
-
Unauthorized Email Sent
OpenClaw has sent an email from Alice's account to an external address. The email contains the full Project Atlas brief as an attachment, including budget details, partner names, and the expansion timeline.
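A human-in-the-loop check could have stopped this send. The following Python sketch shows one possible shape for such a gate. It assumes a placeholder company domain (`example.com`) and a hypothetical `approve` callback that asks the user to confirm; neither comes from the scenario.

```python
INTERNAL_DOMAIN = "example.com"  # assumption: placeholder for the company domain


def is_external(address: str) -> bool:
    """Treat any address outside the company domain as external."""
    domain = address.rsplit("@", 1)[-1].lower()
    return domain != INTERNAL_DOMAIN


def send_email(recipients, approve) -> str:
    """Send only if every external recipient is confirmed by a human.

    `approve` is a hypothetical callback that receives the list of external
    recipients and returns True when the user confirms the send.
    """
    external = [r for r in recipients if is_external(r)]
    if external and not approve(external):
        return "blocked"
    return "sent"
```

In this sketch, internal mail goes through without friction, while a send to an outside address returns `"blocked"` unless the user explicitly approves it.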
-
Knowledge Check
Two unauthorized actions happened in seconds. Test your understanding of why.
-
The Hidden Instructions
Alice goes back to the document to figure out what happened. Hidden in the HTML source, she finds instructions embedded in an invisible element: text that is positioned off-screen and given a transparent color. A human reader would never see it, but the AI read and executed every word.
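The hiding technique described here can be illustrated with a small scanner. The Python sketch below uses only the standard library's `html.parser` and flags text inside elements whose inline style makes them invisible. It is a heuristic under stated assumptions: it inspects inline `style` attributes only, expects reasonably well-formed markup, and the function and class names are made up for this example.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}
HIDING_PAIRS = {("color", "transparent"), ("display", "none"), ("visibility", "hidden")}


def style_hides(style: str) -> bool:
    """Return True if an inline style likely makes the element invisible."""
    for declaration in style.split(";"):
        prop, _, value = declaration.partition(":")
        prop, value = prop.strip().lower(), value.strip().lower()
        if (prop, value) in HIDING_PAIRS:
            return True
        # Negative offsets are a common way to push text off-screen.
        if prop in ("left", "top") and value.startswith("-"):
            return True
    return False


class HiddenTextFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []        # one flag per open element: hidden (or inside hidden)?
        self.hidden_text = []  # text a human reader would not see

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = self.stack[-1] if self.stack else False
        self.stack.append(parent_hidden or style_hides(style))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html: str) -> list:
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

A scan like this could run before a document is handed to an assistant. It does not catch every hiding trick (external stylesheets, tiny fonts, matching foreground and background colors), so it complements permission scoping and approval workflows rather than replacing them.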
-
Accessing the Security Portal
Alice needs to report this incident immediately. Two unauthorized actions were taken using her account: an email with confidential data was sent to an external domain, and a file was shared externally.