Context is king: Why AI in security fails without it

Like nearly every other company that deals in digital information, cybersecurity firms are rushing to build AI into their products.

AI copilots and assistants, we're told, will help security teams detect threats, vulnerabilities and behavioral anomalies more quickly; recommend or even carry out mitigations and fixes; and make original code more secure.

Yet does AI in cybersecurity really work? AIs are only as good as the data they're trained on. If they're not given enough data or context, they might not make good recommendations or decisions. How much data, and what kind of data, does a cybersecurity AI need?

"The stark reality of AI when it comes to AppSec and coding is, right now, it's providing the malicious actor much more value than it is to AppSec and the developers," says Dave Lindner, CISO of Contrast Security. "When it comes to using it on the defensive side, I think the hard part is these AIs are not trained that way yet."

Contrast thinks it's hit upon the right way to use AI in cybersecurity. For each individual client, the company creates a cloud-based dynamic model of the client's systems called the Contrast Graph.

"The Graph is a real-time digital twin of an organization's application and API environment," explains a Contrast Security blog post, "mapping live attack paths; correlating runtime behavior; and exposing how vulnerabilities, threats and assets are connected."

The AI in the Contrast Platform — or a client's own AI, which can access the Contrast Graph through the Model Context Protocol (MCP) — uses that working duplicate of the client system to quickly pinpoint issues or determine whether a potential vulnerability or other weakness really is exploitable.

"We have that context. We have your vulnerabilities," says Naomi Buckwalter, Senior Director of Product Security at Contrast Security. "We know exactly what's going on in your software, and because we have in that entire world of understanding what your application, your environments are doing, we can tell you exactly where your priorities should be."

Lack of training, lack of context

Everyone who works online remembers when ChatGPT became available to the general public at the end of November 2022. One well-regarded cybersecurity leader called it an "Oh, sh*t" moment; developers began using it to check code for errors; and total amateurs experimented with using ChatGPT to write malware.

A brave new world, right? Mostly yes. Microsoft has integrated its Copilot AI program into GitHub and Visual Studio; Google search results are summarized by Google's own Gemini AI; and Twitter/X users use X's in-house Grok AI to win online arguments.

But are large language models (LLMs) like ChatGPT, Copilot, Gemini, Grok and others really set up for cybersecurity? Buckwalter isn't so sure.

"We're using AI incorrectly, [because] what we're using in our tools is LLMs," Buckwalter says. "LLMs go in and do a pattern match, or a best guess of that next thing. ... If a company says they're using AI, that's not actually true. It's just an LLM."

She's got a point. An LLM is like an eager child that's trying really hard to please. It will draw upon its enormous resources to tell you what you think you want to hear, even if it has to make things up. And, as Buckwalter observes, you won't get a lot of creative ideas from an LLM.

"One thing that AI fails at is critical thinking," she says. "I don't think there's any thinking going on behind the scenes. I think it's pattern matching. I think it's looking at things, like, 'Oh, I've seen this before. Let me just spit out the same thing that my human trainers have told me to spit out.'"

Pattern matching may work if you're using AI to craft ads featuring generically gorgeous machine-generated models sporting fashionable clothes. But, Buckwalter says, LLMs have problems coming up with anything new. They may not be the optimal way to get original insights into long-standing cybersecurity problems, or to quickly assess a brand-new threat.

"Vendor tools are trying to put in these AI things using patterns of behavior that applications have already seen," she says. "If there's something new, how's it ever going to come up with the best guidance, because it's never really seen that?"

Laying the groundwork for AI to work best in cybersecurity

So how does Contrast Security get its LLM of choice, the latest version of Anthropic's Claude, to aid in its cybersecurity efforts? By feeding it an enormous amount of data about each client's individual environment, which is replicated in the Contrast Graph.

"We're building a digital twin of the operation of [the client's] whole ecosystem, not just one app at a time, but the whole ecosystem of their application layer," says Jeff Williams, Co-Founder and CTO of Contrast Security.

"We map all the applications and APIs, how they connect with each other, where the defenses are, where the vulnerabilities are, like the attack surface of each one, and we allow modeling of the asset values and things like that."

Because of the Contrast Graph, Buckwalter, Lindner and Williams explain, the AI truly understands (ahem, groks) the client application environment and how all the various assets interact with and affect each other. That way, the AI can quickly determine what is a real threat and what isn't.
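
To make the idea concrete, here is a minimal Python sketch of how a graph of applications and APIs can separate an exploitable weakness from a merely theoretical one. It is not Contrast's implementation: the service names, the EDGES map and the attack_path and prioritize helpers are all hypothetical, and a real model would carry far more detail, such as runtime behavior and asset values. The sketch simply asks whether an attacker starting from the internet can reach a vulnerable service without passing through a defended one.

```python
from collections import deque

# Hypothetical toy environment: each key is an app or API,
# each value lists the services it can call.
EDGES = {
    "internet": ["web-frontend"],
    "web-frontend": ["orders-api"],
    "orders-api": ["billing-api"],
    "batch-reports": ["billing-api"],
    "billing-api": [],
    "legacy-admin": [],
}

# Hypothetical scan result: services with a known vulnerability.
VULNERABLE = {"billing-api", "legacy-admin"}


def attack_path(edges, start, target, blocked=frozenset()):
    """Breadth-first search for the shortest call path from start to target.

    Nodes in `blocked` stand in for services whose defenses stop an attack.
    Returns the path as a list of node names, or None if no path exists.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in edges.get(node, []):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def prioritize(edges, vulnerable, entry="internet", blocked=frozenset()):
    """Map each vulnerable service to its attack path from `entry`, or None."""
    return {v: attack_path(edges, entry, v, blocked) for v in sorted(vulnerable)}


if __name__ == "__main__":
    for service, path in prioritize(EDGES, VULNERABLE).items():
        status = " -> ".join(path) if path else "not reachable from the internet"
        print(f"{service}: {status}")
```

In this toy environment, both services have a flaw, but only billing-api sits on a path from the internet, so it would be the one to fix first. Marking orders-api as defended (passing it in `blocked`) removes that path as well.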

The Graph and the AI capabilities are part of what Contrast calls Northstar, a recent major update to its application detection and response (ADR) and application security testing (AST) platforms, which customers can implement together as the Contrast Platform.

"One of the things that Northstar is trying to do, and I think it does a really good job of it, is really get it to the point where it's correlating all these things and really looking at them from a risk perspective," says Lindner.

"We haven't had a good way to do that" until now, he adds. "With a WAF [web application firewall], you just had a bunch of data and no way to really correlate it with anything."

The latest update also includes SmartFix, an AI-powered feature that uses the Contrast Graph to analyze misconfigurations and software flaws. If a client connects SmartFix to its code repository through a GitHub Action, SmartFix can generate code fixes and submit them on its own.

"It's a GitHub Action that runs right in your code repo. It pulls all the info, and now it's connected to the code, and it can submit a pull request," says Lindner. "[Developers] don't actually have to write the code. We're doing it for them with AI."

But Contrast clients don't need to use Claude with the Contrast Graph or with SmartFix. If they're already using a different LLM, they can "roll their own" by connecting the AI to Contrast's MCP server.
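
For readers unfamiliar with MCP, the sketch below shows roughly what such a connection carries on the wire. MCP messages are JSON-RPC 2.0, and a client asks a server to run a tool with a "tools/call" request. Everything specific here is an assumption for illustration: the build_tool_call helper, the tool name "list_vulnerabilities" and its "application" argument are hypothetical and are not taken from Contrast's MCP server, whose actual tools a client would discover by asking the server for its tool list.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP "tools/call" request as a JSON-RPC 2.0 message (a dict)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument, used only to show the message shape.
request = build_tool_call(1, "list_vulnerabilities", {"application": "billing-api"})

if __name__ == "__main__":
    print(json.dumps(request, indent=2))
```

In practice, an MCP client library or the LLM application itself builds and sends these messages; the point is that any model able to speak the protocol can query the same context.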

"We allow our customers to do their own thinking [with the Contrast MCP server], using their own LLM, their own critical thinking if they need to, their own parsing of the information afterwards," says Buckwalter. "We don't rely on any one AI model because we know they're imperfect."

To Williams, who co-founded Contrast Security more than 10 years ago, the Northstar update and the associated AI content are the culmination of a long effort toward streamlined, efficient, rapid application detection and response.

"The Northstar release is, I think, maybe the completion of the [ADR] vision. It's all the wrapping, all the new features," he says. "We've really completed the ADR ecosystem with the Northstar release."