COMMENTARY: The emergence of agentic AI — autonomous systems capable of reasoning, decision-making, and tool usage — has created new demands for infrastructure.

These agents, driven by LLMs, are becoming increasingly powerful: they engage with tools, retrieve data, and perform tasks, all of which require rich, structured context.

To make this possible, the Model Context Protocol (MCP) was created as a common protocol through which applications communicate with LLMs and manage digital work with them more effectively.

However, as agentic AI moves from experimental environments into enterprise-level deployments, the limitations of traditional MCP implementations, particularly local configurations, have grown increasingly apparent. Although these configurations get the job done, they fall short of the robustness, security, and scalability that real production environments need.

Our engineering team has taken a fresh architectural approach we call MCP Server-as-a-Service (MCPSaaS). It aims to reshape the MCP infrastructure to deliver a secure, scalable, and easily manageable foundation for the next generation of AI agents.

This isn’t a product. Rather, it’s a reference architecture we are offering to the industry, free of charge, that engineering teams can use to develop fault-tolerant, secure, and scalable MCP implementations.

The role and relevance of MCP

MCP, an open standard that lets AI applications access external tools and data, helps developers build agents and workflows on top of LLMs, creating seamless integrations between the models and other systems that hold important context.

MCP offers a growing ecosystem of prebuilt integrations for widely used tools, a standardized way to build custom integrations, and the ability to switch between applications while preserving context.

The protocol encodes its messages as JSON-RPC 2.0 and carries them over transports such as standard input/output (stdio) for local processes and Streamable HTTP for remote ones; custom transports are also permitted. These messages connect hosts (LLM applications that initiate connections), clients (connectors inside the host), and servers (which provide tools, data, and prompts).
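
To illustrate the message format, here is a minimal sketch of a JSON-RPC 2.0 request and its matching response, as a client might send to call a tool on a server. The `tools/call` method name follows the MCP specification; the `get_weather` tool, its arguments, and the helper function names are hypothetical and exist only for this example.

```python
import json
from itertools import count

# Monotonic request IDs; JSON-RPC uses the ID to pair a response with its request.
_ids = count(1)


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request object (a plain dict)."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def make_result(request, result):
    """Build the successful JSON-RPC 2.0 response for a given request."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


# Hypothetical tool name and arguments, for illustration only.
request = make_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Berlin"}},
)

# What would travel over the transport (stdio or HTTP) is simply the JSON text.
wire = json.dumps(request)

# A server-side reply echoes the request ID so the client can match them up.
response = make_result(
    request, {"content": [{"type": "text", "text": "Sunny"}]}
)
```

Because the envelope is the same regardless of transport, the same request could be written to a local process's standard input or posted to a remote HTTP endpoint without changing its structure.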

This framework lets AI agents access live data, perform tasks, and generate more relevant responses. But as deployments scale, the cracks in conventional MCP setups begin to surface.

The limitations of early MCP architectures

Early implementations of MCP, especially those running locally, were adequate for experimentation, but they fell short in production environments and could not withstand the demands of modern AI systems. These setups introduce a number of operational and security issues:

First, updates and bug fixes must be distributed manually, leading to inconsistent environments and delayed patching. Second, basic authentication mechanisms are inadequate for enterprise security. Teams may end up storing credentials in plaintext or other unencrypted formats, leaving systems exposed to attack. Third, communication takes place through inter-process channels, so it is often difficult to monitor, audit, or control agent activity. Finally, idle MCP server processes may continue to consume memory and CPU, degrading performance over time.

These constraints hinder scalability and threaten the reliability and security of agentic AI systems.

How MCPSaaS fixes these issues

MCPSaaS introduces a cloud-native architecture for MCP infrastructure. Its main innovations are the following:

Streamable HTTP transport: Replaces the older Server-Sent Events (SSE) based transport with a modern streaming transport that supports bidirectional communication. This allows real-time communication between agents and tools with better reliability and flexibility.

Containerized runtime: MCP server components run in containers orchestrated by platforms like Kubernetes. This allows dynamic scaling, high availability, and session-level isolation.

OAuth 2.1-based authorization: Aligns with MCP standards for secure identity control. Each user session begins with a secure token exchange, ensuring least-privilege access and robust authentication.

High-efficiency session caching: Uses encrypted, in-memory stores (such as Redis with in-VM encryption) to cache sessions, and retrieves tokens only at runtime from a secure vault. This reduces latency, improves responsiveness under load, and maintains critical security controls.

Strong user isolation: Redis keys are generated per user from a hash of the MCP session ID and an internal bearer token scoped exclusively to that session. Tokens grant access only to the MCP layer and never to underlying resources. The resource server token is stored securely and never exposed, ensuring strict separation between sessions and preventing cross-user data leakage.

Secure and resilient stateful session storage: Containers operate in stateful mode, replicating session state across peers using encrypted, in-memory stores. This ensures seamless session recovery in case of failover, maintaining continuity, consistency, and high resiliency under load.

MCPSaaS promises centralized control over security policies and updates, granular visibility into agent-tool interactions, strong isolation between sessions and users, and seamless scalability for multi-user deployments.
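
To make the user-isolation idea concrete, here is a minimal sketch of deriving a per-session cache key from the MCP session ID and a session-scoped bearer token. It is an illustration under stated assumptions, not the MCPSaaS implementation: HMAC-SHA256 is one possible choice of hash, the `mcp:session:` key prefix and the 15-minute lifetime are invented for the example, and a plain dictionary stands in for Redis. Encryption of the cached values is not shown.

```python
import hashlib
import hmac
import secrets
import time


def session_cache_key(session_id: str, bearer_token: str) -> str:
    """Derive a cache key from the session ID and its bearer token.

    Assumption: HMAC-SHA256 keyed by the token. Any keyed or salted hash
    that binds both values together would serve the same purpose.
    """
    digest = hmac.new(
        bearer_token.encode(), session_id.encode(), hashlib.sha256
    ).hexdigest()
    return f"mcp:session:{digest}"  # hypothetical key prefix


class SessionCache:
    """In-memory stand-in for an encrypted Redis store, with expiry."""

    def __init__(self, ttl_seconds: float = 900.0):
        self._ttl = ttl_seconds
        self._store = {}  # key -> (expiry time, session state)

    def put(self, session_id, bearer_token, state):
        key = session_cache_key(session_id, bearer_token)
        self._store[key] = (time.monotonic() + self._ttl, state)

    def get(self, session_id, bearer_token):
        key = session_cache_key(session_id, bearer_token)
        entry = self._store.get(key)
        if entry is None:
            return None  # unknown session, or wrong token for this session
        expires_at, state = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries are dropped on access
            return None
        return state


cache = SessionCache()
alice_token = secrets.token_urlsafe(32)  # token scoped to Alice's session
other_token = secrets.token_urlsafe(32)  # token from a different session

cache.put("session-a", alice_token, {"user": "alice"})
own_state = cache.get("session-a", alice_token)      # finds Alice's state
foreign_state = cache.get("session-a", other_token)  # finds nothing
```

Because the key depends on both values, knowing a session ID alone is not enough to locate another user's cached state; a lookup with the wrong token simply misses.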

Together, these capabilities turn MCP from a basic connector into a robust, enterprise-grade platform for agentic AI. In addition, MCPSaaS takes advantage of modern development practices, such as monorepo architectures, which let teams manage multiple MCP clients, servers, and integrations from a single, consolidated codebase.

As agentic AI advances, its supporting infrastructure must evolve with it. MCP set the path for standardized AI-tool interaction, and MCPSaaS builds on it by providing a secure, scalable platform for intelligent systems. By addressing the constraints of legacy MCP architectures and offering a resilient model for next-generation deployments, MCPSaaS aims to establish best practices for the evolving agentic AI infrastructure.

