COMMENTARY: Imagine an employee who shows up to work every day and never asks for permission. No badge swipe, no ticket, no change window. They read every file cabinet in the building and leave no sign they were ever there. Now imagine hundreds of them, and the only thing deciding where they go is a tool description they read on their own. The requests are valid, the authentication checks out, and the intent is invisible. That's the problem.

This isn't hypothetical. This is how AI agents behave inside Model Context Protocol (MCP) environments today. As a senior infrastructure security engineer, I've watched this play out in real deployments: agents autonomously enumerate internal tools, probe capabilities, and explore what they can access, all through legitimate interfaces.

[SC Media Perspectives columns are written by a trusted community of SC Media cybersecurity subject matter experts.]

In MCP environments, an agent probing your infrastructure looks identical at the API layer to one running a sanctioned workflow. The only difference is intent, and that intent is invisible.
That's exactly where traditional detection breaks and where deception starts to work. It's the same idea behind honeypots: create a decoy tool no legitimate user should ever touch, and the moment an agent selects it, intent becomes visible.

The signal gets stronger when detection is split into two stages. An initial decoy interaction marks reconnaissance. A follow-up interaction with a staged credential artifact (such as a token or config) confirms intent. Each stage looks valid on its own, but together they reveal intent and move defenders from ambiguity to high-confidence detection. Crossing both stages has one explanation, and signed, time-limited artifacts make the escalation window directly observable.
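To make the idea of a signed, time-limited artifact concrete, here is a minimal Python sketch. It is not taken from any particular deployment: the function names `issue_artifact` and `check_artifact`, the token layout, and the `SECRET_KEY` value are all assumptions made for illustration. The token records which session it was served to and when it expires, so a later use can be tied back to the decoy interaction that handed it out.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical demo key; a real deployment would load this from a secret store.
SECRET_KEY = b"demo-only-secret"


def issue_artifact(session_id: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return a signed decoy token that names the session it was served to."""
    issued_at = time.time() if now is None else now
    payload = json.dumps(
        {"sid": session_id, "exp": issued_at + ttl_seconds}, sort_keys=True
    ).encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def check_artifact(token: str, now: Optional[float] = None) -> dict:
    """Classify a presented token as malformed, forged, expired, or a live escalation."""
    current = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return {"status": "malformed"}
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(signature, expected):
        return {"status": "forged"}
    claims = json.loads(payload)
    status = "escalation" if current <= claims["exp"] else "expired"
    return {"status": status, "session_id": claims["sid"]}
```

Because the signature covers both the session identifier and the expiry, a defender who sees the token presented anywhere can tell which decoy interaction it came from and whether the use fell inside the escalation window.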
The scale of the problem
MCP adoption has accelerated at an unusually rapid pace. Downloads grew from roughly 100,000 in late 2024 to over 8 million by April 2025, and surpassed 97 million monthly SDK downloads by early 2026.

The ecosystem now includes thousands of MCP servers, with deployments across enterprises like Block, Bloomberg, and Amazon, exposing internal tools, APIs, and credentials to agent-driven access at scale.

But MCP was designed for interoperability, not for exposing or detecting agent intent. As adoption grows, so does a new class of risk: autonomous agents exploring infrastructure in ways defenders can't see.

The problem is structural
The feature is also the flaw: agents are supposed to enumerate capabilities and invoke them without human direction. For example, two agents can make identical calls to the same internal tool, one executing a legitimate workflow, the other mapping what it can reach. From the outside, there is no difference. Security systems built on request inspection assume malicious behavior looks different, but in MCP environments it doesn't.

When observable behavior is identical, detection hits a classification ceiling: an agent can enumerate tools, test capabilities, and probe for credentials while every control in the stack reports normal activity.

Making intent observable
If intent can't be inferred, it has to be made visible by creating conditions where suspicious behavior is the only possible explanation.

Decoy MCP tools are a straightforward starting point. In early deployments, these decoys were invoked within minutes, exclusively by automated scanners and AI-driven clients. No legitimate workflow touched them, consistent with their design as non-production artifacts.

A two-stage detection model
A single decoy interaction could be a misconfigured client. The stronger signal comes from a sequence.

Agent behavior in MCP environments tends to follow a recognizable progression:
- tool enumeration,
- capability testing,
- parameter probing,
- and eventually interacting with credentials and execution paths.
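The two-stage model can be sketched as a small per-client state tracker. This is an illustrative Python sketch under stated assumptions, not a description of any production system: the class `TwoStageDetector`, the event names `tool_call` and `credential_use`, the decoy tool name, the staged token value, and the 15-minute window are all hypothetical.

```python
from dataclasses import dataclass, field

DECOY_TOOLS = {"internal_vault_export"}   # hypothetical decoy tool name
STAGED_ARTIFACTS = {"decoy-token-7f3a"}   # hypothetical staged credential value


@dataclass
class TwoStageDetector:
    # Maps client_id to the timestamp of that client's first decoy tool call.
    recon_seen: dict = field(default_factory=dict)
    # Assumed lifetime of the staged artifact, in seconds.
    window_seconds: float = 900.0

    def observe(self, client_id: str, event: str, value: str, ts: float) -> str:
        """Return 'normal', 'reconnaissance', or 'confirmed' for one event."""
        # Stage one: any call to a decoy tool marks reconnaissance.
        if event == "tool_call" and value in DECOY_TOOLS:
            self.recon_seen.setdefault(client_id, ts)
            return "reconnaissance"
        # Stage two: using the staged artifact inside the window confirms intent.
        if event == "credential_use" and value in STAGED_ARTIFACTS:
            first = self.recon_seen.get(client_id)
            if first is not None and ts - first <= self.window_seconds:
                return "confirmed"
            # Artifact touched without a completed sequence: still a decoy hit.
            return "reconnaissance"
        return "normal"
```

A short usage example: a client that calls the decoy tool and then presents the staged token a minute later crosses both stages, while ordinary tool calls stay unflagged.

```python
detector = TwoStageDetector()
detector.observe("agent-a", "tool_call", "list_invoices", ts=0.0)            # "normal"
detector.observe("agent-a", "tool_call", "internal_vault_export", ts=10.0)   # "reconnaissance"
detector.observe("agent-a", "credential_use", "decoy-token-7f3a", ts=70.0)   # "confirmed"
```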




