AI/ML

AI agents are already exploring your network. How do you detect their intent?

Green glowing MCP text on vintage computer monitor

COMMENTARY: Imagine an employee who shows up to work every day, never asks for permission, no badge swipe, no ticket, no change window, reads every file cabinet in the building, and leaves no sign they were ever there. Now imagine hundreds of them, and the only thing deciding where they go is a tool description they read on their own. The requests are valid, the authentication checks out, and the intent is invisible. That's the problem.

This isn’t hypothetical. This is how AI agents behave inside Model Context Protocol (MCP) environments today. As a senior infrastructure security engineer, I've watched this play out in real deployments: agents autonomously enumerate internal tools, probe capabilities, and explore what they can access, all through legitimate interfaces.

[SC Media Perspectives columns are written by a trusted community of SC Media cybersecurity subject matter experts. Read more Perspectives here.]

In MCP environments, an agent probing your infrastructure looks identical at the API layer to one running a sanctioned workflow. The only difference is intent and that intent is invisible. 

The scale of the problem

MCP adoption has accelerated at an unusually rapid pace. Downloads grew from roughly 100,000 in late 2024 to over 8 million by April 2025, and surpassed 97 million monthly SDK downloads by early 2026.

The ecosystem now includes thousands of MCP servers with deployments across enterprises like Block, Bloomberg, and Amazon, exposing internal tools, APIs, and credentials to agent-driven access at scale.

But MCP was designed for interoperability, not for exposing or detecting agent intent. As adoption grows, so does a new class of risk: autonomous agents exploring infrastructure in ways defenders can’t see.


Related reading:


That’s exactly where traditional detection breaks and where deception starts to work. 

It’s the same idea behind honeypots: create a decoy tool no legitimate user should ever touch and the moment an agent selects it, intent becomes visible.

The problem is structural

It's important to note that the feature and the flaw is that agents are supposed to enumerate capabilities and invoke them without human direction. 

For example, two agents can make identical calls to the same internal tool: one executing a legitimate workflow, the other mapping what it can reach. From the outside, there is no difference. Security systems built on request inspection assume malicious behavior looks different, but in MCP environments it doesn’t.

When observable behavior is identical, detection hits a classification ceiling: an agent can enumerate tools, test capabilities, and probe for credentials while every control in the stack reports normal activity.

Making intent observable

That is why if intent can’t be inferred, it has to be made visible by creating conditions where suspicious behavior is the only possible explanation.

Decoy MCP tools are a straightforward starting point. In early deployments, these decoys were invoked within minutes, exclusively by automated scanners and AI-driven clients. No legitimate workflow touched them, consistent with their design as non-production artifacts. 

A two-stage detection model

However, a single decoy interaction could be a misconfigured client, but the stronger signal comes from a sequence.

Agent behavior in MCP environments tends to follow a recognizable progression:

  • tool enumeration,
  • capability testing,
  • parameter probing,
  • and eventually interacting with credentials and execution paths.

Each stage looks valid on its own, but together they reveal intent.

By separating detection into stages: an initial decoy interaction marking reconnaissance, followed by interaction with a staged credential artifact (such as a token or config), confirming intent, defenders move from ambiguity to high-confidence detection. 

Crossing both stages has one explanation, with signed, time-limited artifacts making the escalation window directly observable.

What defenders need to rethink

There’s another signal hiding in plain sight: the agent itself. MCP metadata such as identity, version, and session context can track behavior across interactions, surviving across sessions even when no other identifier is consistent.

Combined with structured telemetry, this turns scattered events into a single timeline: which agent, which tools, in what order, and across which surfaces. 

But effectiveness comes down to design: decoy tools need to attract agent invocation without having any legitimate use. Placement across MCP registries and gateways determines coverage while rotating artifacts maintain continuity.

But the bigger shift is conceptual. Security teams are trained to ask: what did this request do? The question that matters now is: what did this agent choose? — shifting detection from the request layer to the decision layer.

Deception makes that surface possible today before AI agents in your environment decide to go exploring on their own. 

An In-Depth Guide to AI

Get essential knowledge and practical strategies to use AI to better your security program.
Harshad Sadashiv Kadam

Harshad Sadashiv Kadam is a Senior Infrastructure Security Engineer at Indeed, where he architects Zero Trust, SASE, and multi-cloud security infrastructure. A CISM-certified practitioner and member of the ISACA Emerging Trends Working Group, his current research focuses on deception-based detection for autonomous AI agents operating in MCP environments — work informed by emerging industry patterns in agentic AI adoption and enterprise risk. He has spoken on this topic at OWASP’s 25th Anniversary, Cloudflare Connect, and multiple BSides conferences.

Get daily email updates

SC Media's daily must-read of the most current and pressing daily news

By clicking the Subscribe button below, you agree to SC Media Terms of Use and Privacy Policy.

You can skip this ad in 5 seconds