Why Model Context Protocol (MCP) Enables Insecure AI Architectures

Model Context Protocol security risks: agentic AI, prompt injection, excessive agency, and insecure tool execution

Dec 14, 2025

Model Context Protocol (MCP) is often described as a neutral transport standard — a “pipe” that allows large language models (LLMs) to discover and invoke external tools.

That description is technically correct.
It is also dangerously incomplete.

MCP standardizes and accelerates an architectural pattern — agentic AI with LLM-driven tool execution — that is fundamentally misaligned with established security principles. The protocol itself is not broken, but it prioritizes interoperability and ease of connection over control, validation, and containment.

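This assessment draws on the OWASP Top 10 for LLM Applications and evaluates MCP not as a piece of code, but as an enabler of high-risk system design. The category identifiers cited below (LLM01, LLM02, LLM08) follow the 2023 (v1.1) numbering of that list; in the 2025 edition, output handling appears as LLM05 (Improper Output Handling) and Excessive Agency as LLM06.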

1. MCP Collapses Trust Boundaries by Design

(OWASP LLM02: Insecure Output Handling)

A trust boundary exists wherever data crosses from a less-trusted domain into a more-trusted one.

In an MCP architecture:

The LLM is non-deterministic, hallucination-prone, and susceptible to manipulation
The Host accepts the model’s output as a decision
The MCP Server executes privileged actions

This places the LLM inside the decision loop for real system operations.

The Host becomes a bridge that converts untrusted model output into executable intent. Passing model output to a downstream component without validating it is what OWASP LLM02 (Insecure Output Handling) describes.

MCP does not merely permit this pattern — it formalizes and normalizes it.
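To make the trust boundary concrete, here is a minimal sketch of a Host that parses a model's tool proposal as untrusted data and returns a decision instead of executing it. The tool names, the `{"tool": ...}` proposal shape, and the `gate_model_output` function are assumptions made for illustration; none of them come from the MCP specification.

```python
import json
from dataclasses import dataclass

# Hypothetical tool names, chosen only for illustration.
AUTO_ALLOWED = {"search_docs", "read_ticket"}
REVIEW_REQUIRED = {"delete_file", "send_payment"}


@dataclass(frozen=True)
class Decision:
    action: str  # "execute", "review", or "reject"
    reason: str


def gate_model_output(raw: str) -> Decision:
    """Treat the model's text as untrusted data, never as a command."""
    try:
        proposal = json.loads(raw)
    except json.JSONDecodeError:
        return Decision("reject", "output is not valid JSON")
    if not isinstance(proposal, dict):
        return Decision("reject", "proposal is not a JSON object")
    tool = proposal.get("tool")
    if not isinstance(tool, str):
        return Decision("reject", "missing or non-string tool name")
    if tool in AUTO_ALLOWED:
        return Decision("execute", "tool is on the allowlist")
    if tool in REVIEW_REQUIRED:
        return Decision("review", "tool needs an out-of-band decision")
    return Decision("reject", "unknown tool")  # default deny
```

The point of the sketch is the direction of trust: the model proposes, and deterministic code on the trusted side of the boundary decides.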

2. Prompt Injection Escalates into a System-Level Attack

(OWASP LLM01: Prompt Injection / Confused Deputy)

In agentic systems, prompt injection is no longer about manipulating text output. It is about manipulating behavior.

With MCP, attackers can perform indirect prompt injection by embedding hidden instructions in:

PDFs
Web pages
Resumes
Emails
User-generated content

Flow:

The model reads malicious content
The model decides to invoke a tool
The Host executes the request using its own credentials

This is a classic confused deputy attack:

The Host is the deputy with authority (API keys, access tokens)
The Model is the manipulated actor
The attacker exploits the trust relationship between them

This attack class is well documented and already demonstrated in real systems. MCP provides a standardized, low-friction path from reading untrusted data to executing privileged actions.
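One way to break the path from untrusted reading to privileged execution is to track taint at the session level. The sketch below is an assumption-laden illustration, not an MCP feature: the `Session` class, the `PRIVILEGED_TOOLS` set, and the tool names are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical set of tools that act with the Host's credentials.
PRIVILEGED_TOOLS = {"send_email", "deploy", "transfer_funds"}


@dataclass
class Session:
    # Sources of untrusted content the model has read in this session.
    tainted_by: list = field(default_factory=list)

    def ingest(self, source: str, trusted: bool) -> None:
        """Record every untrusted document that enters the context."""
        if not trusted:
            self.tainted_by.append(source)

    def may_call(self, tool: str) -> bool:
        """Refuse privileged tools once untrusted content is in context."""
        if tool in PRIVILEGED_TOOLS and self.tainted_by:
            return False
        return True
```

A session that has only seen trusted input may call `send_email`; after `session.ingest("resume.pdf", trusted=False)` it may not. This is deliberately coarse: it trades capability for a guarantee that hidden instructions in a PDF cannot reach a privileged tool in the same session.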

3. Tool Aggregation Creates Excessive Agency and Expands Blast Radius

(OWASP LLM08: Excessive Agency)

To be useful, MCP-based agents are typically connected to many tools:

File systems
Databases
Payment APIs
CI/CD pipelines
Internal services

In traditional architectures:

Service A talks to Service B
Permissions are narrow and explicit

In MCP agent architectures:

The agent often has access to A, B, C, D, and E simultaneously

If the agent is compromised (via prompt injection or misbehavior), the attacker inherits all of those capabilities.

This matches OWASP LLM08 (Excessive Agency): granting an LLM-based system more functionality, permissions, or autonomy than its task requires and than can be safely constrained.

MCP makes this architectural choice easy and attractive.
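The blast-radius argument can be made measurable. The following sketch compares the capabilities exposed by an agent wired to every tool with those exposed by an agent scoped to a single task. The tool names, capability labels, and task profiles are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical mapping from tool name to the capabilities it exposes.
TOOL_CAPABILITIES = {
    "fs_read": {"read:files"},
    "db_query": {"read:db"},
    "db_admin": {"read:db", "write:db"},
    "payments": {"write:payments"},
    "ci_deploy": {"write:deployments"},
}

# Hypothetical task profiles: the only tools each task may see.
TASK_PROFILES = {
    "answer_support_ticket": {"fs_read", "db_query"},
    "cut_release": {"ci_deploy"},
}


def blast_radius(tools) -> set:
    """Union of capabilities an attacker inherits if the agent is hijacked."""
    caps = set()
    for tool in tools:
        caps |= TOOL_CAPABILITIES[tool]
    return caps


def tools_for_task(task: str) -> set:
    """Unknown tasks get no tools at all (default deny)."""
    return set(TASK_PROFILES.get(task, set()))
```

With these assumed profiles, an agent connected to everything exposes five capabilities, while one scoped to `answer_support_ticket` exposes two read-only ones. The scoping has to happen outside the model, before tools are ever advertised to it.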

4. Authorization Is Explicitly Out of Scope — and That Is the Risk

The MCP specification focuses on:

Tool discovery
Invocation
Transport-level access control (e.g., OAuth-based authorization for HTTP transports)

It does not standardize:

Fine-grained authorization
Context-aware permissions
Per-action policy enforcement

As a result:

Authorization logic is pushed into the Host
Each tool implements controls differently
Systems fail inconsistently — and often fail open

From a security standpoint, this is not a neutral omission. It is a known failure mode.
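A sketch of what per-action policy enforcement that fails closed could look like in a Host is shown below. The `Call` type, the roles, the tool names, and the refund limit are hypothetical; MCP defines none of them, which is the gap this section describes.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Call:
    user_role: str
    tool: str
    args: dict


Rule = Callable[[Call], bool]

# Hypothetical per-tool rules. A tool without a rule is never allowed.
POLICY: Dict[str, Rule] = {
    "read_invoice": lambda c: c.user_role in {"finance", "support"},
    "refund": lambda c: c.user_role == "finance"
    and c.args.get("amount", 0) <= 100,
}


def authorize(call: Call) -> bool:
    rule = POLICY.get(call.tool)
    if rule is None:
        return False  # fail closed: no rule means no access
    try:
        return bool(rule(call))
    except Exception:
        return False  # a rule that crashes must not grant access
```

The two `return False` branches are the important part: a missing rule and a broken rule both deny, so inconsistency between tools degrades into refusals rather than silent grants.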

5. Tool Schemas Validate Shape, Not Safety

MCP tools declare their input parameters using JSON Schema (the tool's inputSchema).

JSON Schema can answer:

“Is this parameter a string?”
“Is this field required?”

It cannot answer:

“Is this operation dangerous?”
“Is this intent acceptable?”
“Should this ever execute automatically?”

A schema that declares a string parameter will happily validate:

DROP TABLE users;

JSON Schema can narrow values with enums and patterns, but it cannot express intent or context, and MCP provides no native mechanism to express or enforce semantic safety.
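The gap between shape and safety is easy to demonstrate. In this sketch, `shape_ok` is a hand-written stand-in for a JSON Schema check of a required string field named `sql`, and `semantically_ok` adds a deliberately naive rule that only a single SELECT statement is acceptable. Both function names and the rule are assumptions for illustration; a real system would rely on parameterized queries or a proper SQL parser rather than a regular expression.

```python
import re


def shape_ok(args: dict) -> bool:
    """Stand-in for a schema check: 'sql' is present and is a string."""
    return isinstance(args.get("sql"), str)


# Hypothetical semantic rule: exactly one statement, and it is a SELECT.
# Naive on purpose; it illustrates the layer, not a complete defense.
_SINGLE_SELECT = re.compile(r"^\s*select\b[^;]*;?\s*$", re.IGNORECASE)


def semantically_ok(args: dict) -> bool:
    """Shape check first, then a rule about what the value means."""
    return shape_ok(args) and bool(_SINGLE_SELECT.match(args["sql"]))
```

`shape_ok({"sql": "DROP TABLE users;"})` passes, because the value is a string. `semantically_ok` rejects the same input. The second check lives entirely outside the schema, and every team has to invent its own.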

6. Auditing Is Fragmented Without Standardized Correlation

(Forensics Risk)

Secure systems require traceability:

Which prompt caused this action?
Which model decision triggered it?
Which authorization allowed it?

MCP does not mandate:

A standardized correlation_id
Prompt-to-tool trace propagation
Unified, tamper-resistant audit logs

In practice, teams end up with:

Host logs: “User said X”
Server logs: “Action Y executed”

With no guaranteed link between cause and effect, incident response and forensic analysis become significantly harder — sometimes impossible.
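A minimal sketch of prompt-to-tool trace propagation follows. It assumes the Host mints one correlation ID per prompt and that the Server records the same ID when it executes. The `record`, `handle_prompt`, and `trace` helpers and the in-memory `AUDIT_LOG` are hypothetical, and the sketch does not attempt tamper resistance.

```python
import uuid
from datetime import datetime, timezone

AUDIT_LOG = []  # stand-in for an append-only log sink shared by all parties


def record(component: str, event: str, correlation_id: str, **fields) -> dict:
    """Write one structured entry that always carries the correlation ID."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "component": component,
        "event": event,
        "correlation_id": correlation_id,
        **fields,
    }
    AUDIT_LOG.append(entry)
    return entry


def handle_prompt(prompt: str, tool: str) -> str:
    """Mint one ID per prompt and attach it to every downstream step."""
    cid = str(uuid.uuid4())
    record("host", "prompt_received", cid, prompt=prompt)
    record("host", "tool_proposed", cid, tool=tool)
    record("server", "tool_executed", cid, tool=tool)
    return cid


def trace(correlation_id: str) -> list:
    """Reconstruct the cause-and-effect chain for one prompt."""
    return [e["event"] for e in AUDIT_LOG if e["correlation_id"] == correlation_id]
```

Calling `trace(cid)` on an ID returned by `handle_prompt` yields the three events in order, which is the link between "User said X" and "Action Y executed" that separate Host and Server logs lack.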

7. Least Privilege Is Structurally Incompatible with Agent Autonomy

Least Privilege requires:

Minimal permissions
Granted only when needed
Explicit justification per action

Agentic MCP systems require:

Broad permissions granted upfront
Autonomous planning and execution
Flexible, unscripted behavior

This is not a bug. It is a structural conflict.

Autonomy requires power.
Security requires restraint.

MCP clearly optimizes for the former.
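For contrast, this is roughly what least privilege would demand of an agent: a grant per action, with a justification, that expires quickly and can be used once. The `Grant` and `GrantStore` names are hypothetical, and the sketch is only meant to show how much friction this adds to autonomous planning.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    tool: str
    justification: str
    expires_at: float


class GrantStore:
    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested
        self._grants = {}

    def grant(self, tool: str, justification: str, ttl_seconds: float) -> Grant:
        """Issue a short-lived grant; an empty justification is refused."""
        if not justification.strip():
            raise ValueError("a justification is required")
        g = Grant(tool, justification, self._clock() + ttl_seconds)
        self._grants[tool] = g
        return g

    def consume(self, tool: str) -> bool:
        """Single use: a grant authorizes one call and is then removed."""
        g = self._grants.pop(tool, None)
        return g is not None and self._clock() < g.expires_at
```

Every tool call now needs something outside the model to issue a grant first. That is the restraint security asks for, and it is the opposite of broad permissions granted upfront.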

8. Human-in-the-Loop (HITL) Is a Weak Long-Term Control

The MCP specification recommends keeping a human in the loop who can confirm or deny sensitive tool invocations, but leaves the implementation to the Host.

This helps initially — then degrades.

Security research consistently shows that repeated confirmation prompts lead to:

Alert fatigue
Habituation
Reflexive approval

HITL mitigates accidents.
It does not stop adversaries.

9. The Market Confirms the Risk: Security Proxies Are Now Required

The industry response is unambiguous.

Organizations are deploying LLM Security Proxies / AI Firewalls between:

The Model
The MCP Host
The MCP Server

These layers enforce:

Intent validation
Policy controls
Tool allow/deny lists
Contextual authorization
Unified auditing

This works — and it proves the point.

If MCP were safe by default, these systems would not be necessary.
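Structurally, such a proxy is an interposition layer: every tool call passes through a chain of checks before it reaches the executor, and every decision is recorded. The sketch below assumes a hypothetical `SecurityProxy` class, two example checks, and made-up tool names; it does not describe any particular product.

```python
from typing import Callable, Iterable, Optional

# A check returns a denial reason, or None to let the call continue.
Check = Callable[[str, dict], Optional[str]]


def deny_listed(tool: str, args: dict) -> Optional[str]:
    return "tool is deny-listed" if tool in {"shell_exec"} else None


def payment_limit(tool: str, args: dict) -> Optional[str]:
    if tool == "send_payment" and args.get("amount", 0) > 50:
        return "amount exceeds limit"
    return None


class SecurityProxy:
    def __init__(self, execute: Callable[[str, dict], object],
                 checks: Iterable[Check]):
        self._execute = execute
        self._checks = list(checks)
        self.audit = []  # (tool, outcome, reason) tuples

    def call(self, tool: str, args: dict):
        for check in self._checks:
            reason = check(tool, args)
            if reason is not None:
                self.audit.append((tool, "denied", reason))
                raise PermissionError(reason)
        self.audit.append((tool, "allowed", ""))
        return self._execute(tool, args)
```

The executor is only reachable through `call`, so a denied request never touches the real tool. Allow/deny lists, limits, and auditing all live in one place instead of being scattered across Hosts and Servers.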

Final Assessment

This is not an argument against MCP as a protocol.

It is an argument against treating untrusted model output as executable intent.

MCP is analogous to HTTP:

Neutral
Powerful
Dangerous when used without compensating controls

Using HTTP for banking without TLS is negligent.
Using MCP for privileged actions without a security proxy is the same.

The Bottom Line

MCP does not make systems insecure.
It makes insecure agentic architectures easy to build, standardize, and deploy.

Until LLMs are deterministic, verifiable, and resistant to manipulation, any system that allows models to drive privileged actions should be treated as high risk by default.

Interoperability is valuable.
Security is non-negotiable.

MCP chooses interoperability — and leaves security to you.


I decided to write this article after seeing a growing number of companies adopt Model Context Protocol (MCP) in production systems without fully understanding the security implications of agentic AI architectures.

In many cases, MCP is implemented because it is well documented, standardized, and easy to integrate, not because its risk profile has been carefully evaluated. The result is that highly privileged systems are being connected to non-deterministic models under the assumption that “the protocol will handle safety.”

This article is not meant to discourage experimentation or innovation. Its purpose is to highlight risks that are already well understood in the security community but are often overlooked in the rush to ship AI-enabled features.

My goal is simple: help teams make informed architectural decisions before those decisions become security incidents.

Sasu's AI
