MCP servers designed for agents, accuracy, and token-efficiency

Production-ready MCP servers optimized for agentic coding and context limits

Stainless generates code-mode MCP servers from your OpenAPI spec, letting agents use your SDKs and documentation instead of dozens of endpoint schemas.

94–97%

task accuracy

3x

fewer tool calls

100k+

tokens saved on complex tasks

Supercharge agentic coding

AI is already writing your users’ code. The question is whether it can integrate with your API without breaking.

Stainless generates MCP servers that make your API legible to coding agents. In tools like Claude Code and Cursor, agents can interpret your endpoints and generate working integrations with your SDK.

Build integrations when inspiration strikes, without compromising on precision.

“If you're building for agents that need to do real work, Stainless Code Mode is the architecture to adopt.”

Ayush Agarwal

Founder/CPTO

The problem: traditional MCP doesn't scale

Many MCP servers expose one tool per endpoint (or rely on dynamic discovery). This floods the context window with hundreds of static definitions or forces slow, multi-step discovery loops.
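The scale of the problem is easy to ballpark. Both figures below are illustrative assumptions, not measurements of any particular API, but they show how per-endpoint tools eat the context window before the agent does any real work:

```typescript
// Illustrative numbers only: both constants are assumptions for the sketch,
// not measurements of a real API or MCP server.
const endpoints = 200;            // a mid-sized API surface
const tokensPerToolSchema = 500;  // a typical JSON Schema tool definition

// Tokens spent just advertising tools, before the first real request:
console.log(endpoints * tokensPerToolSchema); // 100000
```

At that point roughly 100k tokens are gone to static definitions alone, which is why the per-endpoint approach stops scaling well before the API does.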

The solution: SDK code mode

Instead of forcing agents to guess which tool to use from a massive list, we let them do what they do best: write code.

With SDK Code Mode, your Stainless MCP server exposes two powerful tools to the agent:

1. search_docs: To read and understand your API's documentation.

2. execute: To run TypeScript code using your Stainless-generated SDK in a secure sandbox.

Higher accuracy. Fewer tokens. Faster results.

Stainless Code Mode outperforms traditional MCP servers in task accuracy, uses a fraction of the tokens by eliminating unnecessary data dumps, and drastically reduces the "thinking time" required for complex tasks.

Accuracy

Agents write code using the idiomatic Stainless-generated TypeScript SDKs. They get built-in auto-pagination, typed errors, and intuitive parameters: less trial-and-error and faster task completion.
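As a rough sketch of what that buys the agent — using a hypothetical `listIncidents` stand-in rather than a real generated client — auto-paginating list methods let the agent iterate results with no cursor or page-token bookkeeping:

```typescript
// Hypothetical stand-in for a Stainless-style auto-paginating list method.
// A real generated SDK would issue one HTTP request per page; here the
// pages are faked in memory to keep the sketch self-contained.
interface Incident {
  id: string;
  severity: "low" | "high";
  created_at: string;
}

async function* listIncidents(pageSize = 2): AsyncGenerator<Incident> {
  const all: Incident[] = [
    { id: "inc_1", severity: "high", created_at: "2024-01-01T00:00:00Z" },
    { id: "inc_2", severity: "low", created_at: "2024-01-02T00:00:00Z" },
    { id: "inc_3", severity: "high", created_at: "2024-01-03T00:00:00Z" },
  ];
  for (let offset = 0; offset < all.length; offset += pageSize) {
    yield* all.slice(offset, offset + pageSize); // one "page" per iteration
  }
}

async function main(): Promise<string[]> {
  const highSeverity: string[] = [];
  // No cursor handling in sight: the iterator pages transparently.
  for await (const incident of listIncidents()) {
    if (incident.severity === "high") highSeverity.push(incident.id);
  }
  return highSeverity;
}

main().then(ids => console.log(ids)); // [ 'inc_1', 'inc_3' ]
```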

Token use

We don't just give agents code tools; we give them the manual. The search_docs tool gives agents access to a comprehensive, up-to-date API reference tailored specifically to your SDK.

Duration

SDK code mode reduces the time agents need for complex API tasks. By allowing models to generate and execute SDK code with type hints and error feedback, agents complete workflows in fewer turns.

How it works

Connect and initialize

The client connects to the MCP server, which advertises exactly two capabilities: search_docs and execute, plus any brief startup instructions (auth, SDK setup, common patterns).

tools: [
  { "name": "search_docs", "description": "Search SDK documentation" },
  { "name": "execute", "description": "Run TypeScript code with the SDK" }
]

Understand the user’s goal

The agent reads the user request and decides what information it needs (entities, resources, filters, desired output) before touching any tools.

User:
"Summarize the incidents from last night"

Agent reasoning

Search the docs

The agent calls search_docs with a targeted query to retrieve the most relevant SDK methods, parameter shapes, and minimal examples.

-> tool_call 
search_docs("incidents last 24 hours")

<- tool_response
client.incidents.list()
Returns a list of incidents.

Parameters:
• status
• severity
• created_after

Example

Draft typed SDK code

Using the doc results, the agent writes a small TypeScript program that uses the generated SDK to perform the task, including pagination/looping, filtering, and aggregation as needed.

async function run(client) {
  // Fetch active incidents via the generated SDK
  const incidents = await client.incidents.list({
    status: "active",
  });

  // Keep only incidents created in the last 24 hours (86,400,000 ms)
  return incidents.filter(i => {
    return Date.parse(i.created_at) > Date.now() - 86400000;
  });
}

Typecheck, execute, and iterate

The agent sends the code to execute; the server typechecks first, runs in a sandbox, and returns results or actionable errors. If there’s an error, the agent fixes the code and retries.
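A minimal sketch of that typecheck-first loop, with a toy stand-in for the compiler (the real execute tool runs the actual TypeScript checker; the error string and result shapes below are assumptions for illustration):

```typescript
type ExecuteResult =
  | { ok: true; value: unknown }
  | { ok: false; errors: string[] };

// Toy stand-in: a real server would invoke the TypeScript compiler here.
function typecheck(code: string): string[] {
  return code.includes("createdAt")
    ? ["Property 'createdAt' does not exist on type 'Incident'. Did you mean 'created_at'?"]
    : [];
}

async function execute(code: string): Promise<ExecuteResult> {
  const errors = typecheck(code);
  // Typecheck failures come back as actionable errors; nothing runs.
  if (errors.length > 0) return { ok: false, errors };
  // A real server would now run the code in an isolated sandbox.
  return { ok: true, value: `ran: ${code}` };
}

// First attempt fails the typecheck; the agent fixes the field name and retries.
execute("incidents.filter(i => i.createdAt)").then(r => console.log(r.ok)); // false
execute("incidents.filter(i => i.created_at)").then(r => console.log(r.ok)); // true
```

Because the error arrives before execution, a bad field name costs the agent one cheap retry instead of a failed API call.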


Return the final answer

The agent summarizes the executed result into a user-friendly response, typically returning only the distilled outputs (counts, IDs, totals, links), not raw payloads.
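For instance, rather than echoing every incident object back into the conversation, the agent might compute a compact summary like this (hypothetical shapes, sketched for illustration):

```typescript
// Hypothetical incident shape; a real run would use the SDK's typed results.
interface Incident {
  id: string;
  severity: string;
}

// Distill executed results into compact fields instead of raw payloads.
function summarize(incidents: Incident[]) {
  const bySeverity: Record<string, number> = {};
  for (const i of incidents) {
    bySeverity[i.severity] = (bySeverity[i.severity] ?? 0) + 1;
  }
  return {
    total: incidents.length,
    bySeverity,
    ids: incidents.map(i => i.id),
  };
}

console.log(summarize([
  { id: "inc_1", severity: "high" },
  { id: "inc_2", severity: "high" },
  { id: "inc_3", severity: "low" },
]));
// { total: 3, bySeverity: { high: 2, low: 1 }, ids: [ 'inc_1', 'inc_2', 'inc_3' ] }
```

Only these distilled fields re-enter the context window, which is where most of the token savings on complex tasks come from.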


“The automatic MCP server is an excellent feature and it works very well! I've deleted my hand-rolled MCP repo.”

Matthew Blode

Co-founder/CTO

FAQ

How does this differ from other MCP implementations?

Most MCP servers expose one tool per endpoint or use dynamic discovery. Code Mode exposes just two tools: search_docs and execute. Agents write code that calls your SDK directly. This prevents context overflow and eliminates the performance penalty of multi-step discovery loops.

Is code execution secure?

Yes. Code runs in isolated Cloudflare Workers per request. TypeScript analysis happens before execution. No code persists between requests.

What happens when my API changes?

Stainless provides automated GitHub workflows. When your OpenAPI spec changes, your MCP server regenerates automatically with deterministic, merge-conflict-free updates.

How do I deploy the MCP server?

Multiple options: publish as NPM package for local use, deploy to Cloudflare Workers with one-click deployment, or use Docker containers for self-hosting.

Does this work with my existing Stainless SDKs?

Yes. If you have Stainless-generated SDKs, add the MCP target to your config. Your SDK becomes immediately accessible to agents via Code Mode.

How does this handle large APIs with hundreds of endpoints?

The agent discovers only the methods it needs via doc search, rather than loading hundreds of schemas into context upfront. This keeps token usage and decision complexity constant, even as the API surface grows. In practice, performance scales with the size of the task, not the number of endpoints in the API.

Create a code mode MCP server for your API

Generate MCP servers where agents use your SDK directly. No tool explosion, no context bloat.