Core Concepts

Published on 30 June 2025 • Updated on 3 July 2025

📌 TL;DR:

Understand the foundational ideas behind MCP architecture, such as models, context injection, prompt routing, and chaining logic across AI tools.

What is a “model” in the MCP Server context?

In the context of MCP Servers, a model isn’t just an LLM; it’s any computational component that receives structured input and returns structured output. This could be:

An LLM like GPT or Claude
A scanning engine like Nmap
A lookup tool like VirusTotal or Shodan
Even a custom script or regex-based processor

Each model is wrapped as a “tool” that speaks MCP: it accepts JSON-based tasks and returns results in a predictable structure. The agent doesn’t need to know the tool’s internals, only its capabilities.
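
As a minimal sketch of this abstraction, the Python below wraps a regex-based processor as a tool behind a small capability registry. The envelope fields (input, output, status), the capability name extract_iocs, and the run_tool helper are hypothetical names chosen for illustration, not part of any MCP schema.

```python
import json
import re
from typing import Any, Callable, Dict

# Hypothetical envelope: every tool takes a JSON-style dict and returns one.
Task = Dict[str, Any]
Result = Dict[str, Any]


def regex_ioc_extractor(task: Task) -> Result:
    """Regex-based processor wrapped as a tool: collects IPv4-looking strings."""
    text = task["input"]["text"]
    ips = re.findall(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", text)
    return {"tool": "regex_ioc_extractor", "status": "ok", "output": {"ipv4": ips}}


# The registry maps capabilities to tools; callers never see the internals.
TOOLS: Dict[str, Callable[[Task], Result]] = {
    "extract_iocs": regex_ioc_extractor,
}


def run_tool(capability: str, task: Task) -> Result:
    """Dispatch a task to whichever tool advertises the requested capability."""
    if capability not in TOOLS:
        return {
            "tool": None,
            "status": "error",
            "output": {"reason": f"unknown capability: {capability}"},
        }
    return TOOLS[capability](task)


if __name__ == "__main__":
    task = {"input": {"text": "Beacon to 203.0.113.7 and 198.51.100.23"}}
    print(json.dumps(run_tool("extract_iocs", task), indent=2))
```

An LLM or a scanner could sit behind the same run_tool call, as long as it accepts and returns the same dictionary shape.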

How does context injection work in practice?

Context injection allows an agent to preload relevant information before executing tasks. Think of it as priming the system with background knowledge.

Example:

{
  "context": {
    "organization": "SOCRadar",
    "industry": "threat_intelligence",
    "compliance_framework": "ISO 27001"
  },
  "task": "vulnerability_assessment",
  "target": "socradar.io"
}

An MCP Server receiving this request can adjust its behavior accordingly, for example by applying stricter filters if the organization is bound by ISO standards. This improves task relevance and response quality.
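
One way a server might act on the injected context is sketched below in Python. The STRICT_FRAMEWORKS set and the plan fields min_reported_severity and include_compliance_mapping are assumptions made up for this example; a real server would define its own policy knobs.

```python
from typing import Any, Dict

# Hypothetical policy table: frameworks that trigger stricter handling.
STRICT_FRAMEWORKS = {"ISO 27001"}


def plan_assessment(request: Dict[str, Any]) -> Dict[str, Any]:
    """Turn a context-carrying request into an execution plan."""
    context = request.get("context", {})
    strict = context.get("compliance_framework") in STRICT_FRAMEWORKS
    return {
        "task": request["task"],
        "target": request["target"],
        # Assumed knobs: strict mode reports even low-severity findings
        # and maps each finding to the compliance framework.
        "min_reported_severity": "low" if strict else "medium",
        "include_compliance_mapping": strict,
    }


if __name__ == "__main__":
    request = {
        "context": {
            "organization": "SOCRadar",
            "industry": "threat_intelligence",
            "compliance_framework": "ISO 27001",
        },
        "task": "vulnerability_assessment",
        "target": "socradar.io",
    }
    print(plan_assessment(request))
```

A request without a context block falls back to the default plan, so context stays optional.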

Which protocols and input formats are supported?

MCP Servers primarily use:

JSON (default): Official schema format for tasks and responses.
YAML: Sometimes supported via translation layers for human-readable configs.
JSON-RPC or gRPC: For real-time interaction across distributed agents. The MCP specification itself defines its messages as JSON-RPC 2.0.
HTTP/HTTPS: As the transport layer for most REST-like MCP Servers.

You can also add support for other structured formats like XML via adapters, but JSON remains the native format.
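
A format adapter can be quite small. The Python sketch below converts a flat XML task into the JSON task shape shown earlier; the XML layout (a task element with a name attribute, a target child, and an optional context child) is a hypothetical input format invented for this example.

```python
import json
import xml.etree.ElementTree as ET
from typing import Any, Dict


def xml_task_to_json(xml_text: str) -> Dict[str, Any]:
    """Convert a flat, hypothetical <task> document into a JSON-style task."""
    # Note: for untrusted XML, a hardened parser is advisable.
    root = ET.fromstring(xml_text)
    task: Dict[str, Any] = {
        "task": root.attrib["name"],
        "target": root.findtext("target"),
    }
    context = root.find("context")
    if context is not None:
        # Each child element of <context> becomes one key/value pair.
        task["context"] = {child.tag: child.text for child in context}
    return task


if __name__ == "__main__":
    xml_text = (
        '<task name="vulnerability_assessment">'
        "<target>socradar.io</target>"
        "<context><organization>SOCRadar</organization></context>"
        "</task>"
    )
    print(json.dumps(xml_task_to_json(xml_text), indent=2))
```

Keeping the adapter at the edge means every tool behind it still sees only JSON.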

How does prompt routing or prompt rewriting happen inside the server?

Prompt routing is how MCP Servers decide which model, tool, or API should handle each step of the task.

Example workflow:

Agent says: “Scan ports and generate a report”
MCP Server:
- Parses “scan ports” → selects Nmap
- Parses “generate report” → selects Claude or GPT
- Rewrites these as sub-prompts internally

This internal routing is based on:

Task keywords
Available tools
Cost/performance preferences
Agent history or prior outcomes
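
The first three criteria can be sketched as a keyword router in Python. The ROUTES table, the tool names as plain strings, and the relative cost numbers are all hypothetical; agent history is left out to keep the example short.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical keyword table: phrase -> [(tool, relative cost)].
# The costs are made-up numbers used only to show the preference logic.
ROUTES: Dict[str, List[Tuple[str, int]]] = {
    "scan ports": [("nmap", 1)],
    "generate a report": [("claude", 5), ("gpt", 4)],
}


def route(
    request: str, available: Set[str], prefer_cheapest: bool = True
) -> List[Dict[str, str]]:
    """Split a request into sub-prompts and pick one available tool for each."""
    lowered = request.lower()
    # Keep the steps in the order the phrases appear in the request.
    matched = sorted((lowered.find(p), p) for p in ROUTES if p in lowered)
    steps: List[Dict[str, str]] = []
    for _, phrase in matched:
        usable = [c for c in ROUTES[phrase] if c[0] in available]
        if not usable:
            raise LookupError(f"no available tool for: {phrase}")
        ranked = sorted(usable, key=lambda c: c[1], reverse=not prefer_cheapest)
        steps.append({"tool": ranked[0][0], "sub_prompt": phrase})
    return steps


if __name__ == "__main__":
    for step in route("Scan ports and generate a report", {"nmap", "claude", "gpt"}):
        print(step)
```

Production routers are likely to use intent classification rather than substring matching, but the selection step (filter by availability, rank by preference) has the same shape.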

Some advanced servers support prompt rewriting, where the task is restructured or expanded before execution, e.g., turning “scan this domain” into “run Nmap + perform WHOIS + check against blacklist.”
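
Prompt rewriting of this kind can be approximated with a simple expansion table. In the Python sketch below, the EXPANSIONS mapping and the step names (run_nmap, whois_lookup, blocklist_check) are hypothetical labels that mirror the example above.

```python
from typing import Dict, List

# Hypothetical expansion rules: one terse task -> several explicit steps.
EXPANSIONS: Dict[str, List[str]] = {
    "scan this domain": ["run_nmap", "whois_lookup", "blocklist_check"],
}


def rewrite(task: str, target: str) -> List[Dict[str, str]]:
    """Expand a terse task into explicit steps; unknown tasks pass through."""
    steps = EXPANSIONS.get(task.strip().lower(), [task])
    return [{"step": step, "target": target} for step in steps]


if __name__ == "__main__":
    for step in rewrite("Scan this domain", "example.com"):
        print(step)
```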

Can multiple models be chained or orchestrated together?

Absolutely. Chaining and multi-model orchestration are key MCP strengths. A task like analyze_suspicious_email might involve:

Attachment scan → via VirusTotal MCP
Domain lookup → via Shodan MCP
Summary writing → via Claude MCP

MCP Servers can execute these in sequence or in parallel and pass intermediate results between steps. If used with orchestrators like LangGraph or CrewAI, these flows become even more dynamic and stateful.
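
The analyze_suspicious_email flow could be sketched with Python's asyncio as follows. The three coroutines are stubs standing in for the VirusTotal, Shodan, and Claude MCP servers: no real API is called, and the values they return are invented for illustration.

```python
import asyncio
from typing import Any, Dict


# Stub tools: placeholders for the real MCP servers, returning made-up data.
async def scan_attachment(email: Dict[str, Any]) -> Dict[str, Any]:
    name = email["attachment"]
    return {"attachment": name, "malicious": name.endswith(".exe")}


async def lookup_domain(email: Dict[str, Any]) -> Dict[str, Any]:
    return {"domain": email["sender"].split("@")[1], "open_ports": [80, 443]}


async def write_summary(scan: Dict[str, Any], lookup: Dict[str, Any]) -> str:
    verdict = "suspicious" if scan["malicious"] else "clean"
    return (
        f"Attachment {scan['attachment']} looks {verdict}; "
        f"sender domain is {lookup['domain']}."
    )


async def analyze_suspicious_email(email: Dict[str, Any]) -> Dict[str, Any]:
    # The scan and the lookup are independent, so they run in parallel.
    scan, lookup = await asyncio.gather(scan_attachment(email), lookup_domain(email))
    # The summary depends on both results, so it runs afterwards in sequence.
    summary = await write_summary(scan, lookup)
    return {"scan": scan, "lookup": lookup, "summary": summary}


if __name__ == "__main__":
    email = {"sender": "billing@example.com", "attachment": "invoice.exe"}
    print(asyncio.run(analyze_suspicious_email(email)))
```

The intermediate results are passed as plain dictionaries, which is what lets a later step consume output from any earlier tool.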

How are context files stored, cached, and retrieved?

Context files (e.g., org data, compliance configs, past tasks) can be stored:

Locally (in ./contexts/ folders)
In an in-memory cache (e.g., Redis)
In object storage (e.g., AWS S3, Google Cloud Storage)

These contexts can be versioned, hashed for integrity, and reused across tasks. Some advanced agents can even fetch context dynamically based on task type or user role.
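
A minimal sketch of local storage with a cache in front of it, in Python. The ContextStore class is hypothetical, and a plain dictionary stands in for an external cache such as Redis; the same read-through pattern would apply with object storage as the backing layer.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


class ContextStore:
    """Read-through cache: check memory first, then fall back to JSON on disk."""

    def __init__(self, root: Path) -> None:
        self.root = root
        # A dict stands in for an external in-memory cache here.
        self._cache: Dict[str, Dict[str, Any]] = {}

    def save(self, name: str, context: Dict[str, Any]) -> Path:
        self.root.mkdir(parents=True, exist_ok=True)
        path = self.root / f"{name}.json"
        path.write_text(json.dumps(context, sort_keys=True), encoding="utf-8")
        self._cache[name] = context
        return path

    def load(self, name: str) -> Dict[str, Any]:
        if name not in self._cache:
            # Cache miss: read from disk and remember the result.
            path = self.root / f"{name}.json"
            self._cache[name] = json.loads(path.read_text(encoding="utf-8"))
        return self._cache[name]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = ContextStore(Path(tmp) / "contexts")
        store.save("org", {"organization": "SOCRadar"})
        print(store.load("org"))
```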

Pro Tip: Always hash and log context files used in execution. This helps in auditing and rollback scenarios.
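
The tip above can be sketched in a few lines of Python using the standard hashlib and logging modules. The logger name context_audit and the run_with_audit helper are hypothetical; the canonical JSON encoding is one reasonable choice for keeping the hash stable, not a required one.

```python
import hashlib
import json
import logging
from typing import Any, Dict

logger = logging.getLogger("context_audit")  # hypothetical logger name


def context_digest(context: Dict[str, Any]) -> str:
    """SHA-256 over canonical JSON, so key order does not change the hash."""
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_with_audit(task_id: str, context: Dict[str, Any]) -> str:
    """Record which context a task ran with, then return the digest."""
    digest = context_digest(context)
    logger.info("task=%s context_sha256=%s", task_id, digest)
    return digest


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    run_with_audit("task-001", {"organization": "SOCRadar"})
```

Storing the digest alongside each task result makes it possible to tell later whether two runs used the same context.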
