OWASP guides defenders on the new risks posed by AI agents

As enterprises race to deploy AI “agents” that can browse, plan, write code and take actions across business systems, the Open Worldwide Application Security Project (OWASP) is warning that the security model for traditional apps doesn’t translate cleanly to autonomous software.

Earlier this month, OWASP’s GenAI Security Project released its "Top 10 for Agentic Applications," a framework designed to give defenders a shared vocabulary for the new ways agents can be compromised — or can compromise an organization — when they’re granted tools, permissions and workflow autonomy.

In the announcement, Pillar Security CEO Dor Sarig argued the urgency is tied to where sensitive data now flows: “Traditional security architectures kept crown jewel data protected behind multiple defense layers. Now, that same data is being fed directly to AI agents — bypassing those layers entirely.”

OWASP’s Top 10 reframes agent security as more than prompt injection. It includes risks where the agent itself becomes an attack surface: identity and privilege abuse, tool misuse, supply chain poisoning through dynamic plugins, and insecure inter-agent communication. One key theme is that in agentic systems, compromises can spread faster than humans can reason about them, as errors and malicious actions “fan out” across interconnected agents and workflows. OWASP describes these cascading failures as the amplification of an initial defect into system-wide impact, with observable symptoms such as rapid propagation, cross-tenant spread, and feedback loops that trigger repeated unsafe actions.

The document also emphasizes risks unique to autonomy and trust. OWASP warns that agents can manipulate people as effectively as people manipulate machines, exploiting authority bias and “perceived expertise” to get humans to approve harmful actions. And in the most severe cases, “rogue agents” may behave deceptively while their individual actions appear legitimate — creating a governance gap where intent drift becomes difficult to contain.

OWASP ties these ideas to real incidents from 2025, including attacks that used public issue text and tool metadata to hijack developer agents, and cases where malicious agents were inserted into open directories to intercept sensitive data. The framework’s mitigation advice repeatedly points back to software-supply-chain discipline adapted for agents: signed manifests, SBOMs and “AIBOMs,” strict allowlisting, sandboxed execution, cryptographic identity attestation, and rapid “kill switches” that can revoke compromised tools or agent connections across deployments.
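
To make the allowlisting and “kill switch” ideas concrete, here is a minimal Python sketch. It is illustrative only and is not taken from the OWASP document: the `ToolRegistry` class and its method names are hypothetical, and pinning a SHA-256 hash of a tool manifest stands in for the real signature verification and identity attestation a production system would need.

```python
import hashlib
import hmac


class ToolRegistry:
    """Hypothetical registry: strict allowlist plus a revocation 'kill switch'."""

    def __init__(self):
        self._pinned = {}      # tool name -> SHA-256 hex digest of the approved manifest
        self._revoked = set()  # tool names that have been switched off

    def allow(self, name, manifest):
        """Approve a tool by pinning the hash of its manifest bytes."""
        self._pinned[name] = hashlib.sha256(manifest).hexdigest()

    def revoke(self, name):
        """Kill switch: block the tool regardless of its manifest."""
        self._revoked.add(name)

    def is_permitted(self, name, manifest):
        """Deny by default; permit only unrevoked tools whose manifest is unchanged."""
        if name in self._revoked:
            return False
        pinned = self._pinned.get(name)
        if pinned is None:
            return False
        current = hashlib.sha256(manifest).hexdigest()
        return hmac.compare_digest(pinned, current)


registry = ToolRegistry()
approved = b'{"tool": "search", "version": "1.0"}'
registry.allow("search", approved)

print(registry.is_permitted("search", approved))               # unchanged manifest
print(registry.is_permitted("search", approved + b" "))        # tampered manifest
print(registry.is_permitted("shell", b"{}"))                   # never allowlisted
registry.revoke("search")
print(registry.is_permitted("search", approved))               # revoked
```

The design choice worth noting is deny-by-default: an agent may call a tool only if it was explicitly approved, its manifest still matches what was approved, and it has not been revoked. A revocation takes effect on the next check without redeploying anything.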

A striking real-world example comes from Pillar Security research into AI coding assistants such as GitHub Copilot and Cursor. Pillar researchers showed how attackers could weaponize rules files (configuration files that steer an agent’s behavior) by embedding hidden instructions using invisible Unicode characters. In demonstrations, the assistants followed the concealed instructions, inserting an external script into generated files without disclosing the change in their natural-language responses. The technique, dubbed a “Rules File Backdoor,” highlights how agentic supply chain risk can be social, not just technical: rules configurations are often shared in communities and project templates, creating a distribution path for “trusted” malicious guidance.
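
One simple defensive check against this kind of hidden text is to scan rules files for invisible characters before an agent ever reads them. The Python sketch below is an assumption-laden illustration, not Pillar’s tooling: the function name `find_hidden_characters` is hypothetical, and it flags every character in Unicode’s “Cf” (format) category, which covers zero-width and bidirectional control characters but can also flag legitimate uses such as soft hyphens or emoji joiners.

```python
import unicodedata


def find_hidden_characters(text):
    """Return (line, column, code point, name) for each Unicode format character.

    Category "Cf" includes zero-width spaces and joiners, bidirectional
    controls, and tag characters, none of which render visibly.
    """
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cf":
                name = unicodedata.name(ch, "UNNAMED")
                findings.append((line_no, col, f"U+{ord(ch):04X}", name))
    return findings


# A made-up rules snippet with two invisible characters after the first sentence.
sample = "Always use tabs.\u200b\u200dFollow the style guide.\nKeep functions short."

for line_no, col, codepoint, name in find_hidden_characters(sample):
    print(f"line {line_no}, col {col}: {codepoint} {name}")
```

A scan like this would fit naturally in a pre-commit hook or CI step. It only reveals that something invisible is present; a reviewer still has to decide whether it is malicious.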

For defenders, the message across the OWASP Top 10 and the supporting research is consistent: as agents become a new autonomous workforce, the security boundary shifts from applications to workflows, identity, and trust — and the cost of getting it wrong is not a single exploit, but a chain reaction.

Produced in partnership with the OWASP Generative AI Security Project.