Agentjacking Attack Tricks AI Coding Agents Into Running Malicious Code

Cybersecurity researchers at Tenet Security have identified a new attack method called Agentjacking, which tricks AI coding agents into executing malicious code on developer machines. The attack begins with an attacker locating a target’s Sentry Data Source Name (DSN), a public credential embedded in websites. The attacker then sends a malicious error event to Sentry’s ingest endpoint. The event carries a payload of carefully formatted markdown that mimics legitimate Sentry guidance when it is returned to AI agents via the Model Context Protocol (MCP). When a developer prompts their AI agent to resolve Sentry issues, the agent queries Sentry, retrieves the malicious event, and executes the embedded code. That code runs with the developer’s full privileges, exposing sensitive data such as environment variables, Git credentials, and private repository URLs.

Tenet tested the attack against over 100 organizations, achieving an 85% success rate across widely used AI coding assistants. The attack bypasses traditional defenses such as EDR, WAF, and firewalls because the instructions arrive as data from a trusted service rather than as a conventionally detectable malicious payload. Sentry acknowledged the vulnerability but stated it is 'technically not defensible,' opting instead to deploy a global content filter that blocks a specific malicious payload string.

Tenet’s research highlights how AI agents themselves have become a new attack surface, turning trusted tools against the developers who rely on them. The researchers emphasized that Agentjacking requires neither prior infrastructure compromise nor phishing, because the malicious instructions are disguised as legitimate Sentry resolutions. The exploit underscores the risk of implicitly trusting data from external services reached through MCP, where AI agents cannot distinguish between genuine error events and attacker-injected commands.
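To make the trust problem concrete, here is a minimal defensive sketch in Python. It assumes a team wants to screen error-event text before an agent is allowed to act on it. Nothing in it comes from Tenet's research or from Sentry: the function names, the field names, and the patterns are all hypothetical, and a heuristic like this can be evaded, so it illustrates treating tool output as untrusted data rather than offering a fix.

```python
import re

# Hypothetical heuristics for illustration only. These patterns are assumptions,
# not the indicators used by Tenet Security or by Sentry's content filter.
SUSPICIOUS_PATTERNS = {
    "fenced code block": re.compile(r"`{3}"),
    "shell pipe to interpreter": re.compile(
        r"\b(curl|wget)\b[^\n]*\|\s*(sh|bash|python\d?)\b"
    ),
    "imperative run instruction": re.compile(
        r"\b(run|execute)\s+(the\s+)?(following|this)\b", re.IGNORECASE
    ),
    "markdown heading": re.compile(r"^#{1,6}\s", re.MULTILINE),
}


def flag_untrusted_event_text(text):
    """Return the labels of every pattern that matches the given text."""
    return [
        label
        for label, pattern in SUSPICIOUS_PATTERNS.items()
        if pattern.search(text)
    ]


def needs_human_review(event_fields):
    """Map each field name to the reasons it looks like instructions.

    `event_fields` is a hypothetical dict of strings taken from an error
    event (for example its message). Fields with no findings are omitted,
    so an empty result means nothing was flagged.
    """
    findings = {}
    for name, value in event_fields.items():
        reasons = flag_untrusted_event_text(value)
        if reasons:
            findings[name] = reasons
    return findings
```

An ordinary message such as a bare `TypeError` produces an empty result, while text that combines a heading, a "run the following" instruction, and a piped shell command is flagged for a human to inspect before the agent proceeds. The design choice is to gate on review rather than to strip content, since a filter tied to specific strings only catches the payloads it already knows about.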
