# MCP Python SDK

Source: https://py.sdk.modelcontextprotocol.io/v2/

!!! info "You are viewing the in-development v2 documentation"
    For the current stable release, see the [v1.x documentation](https://py.sdk.modelcontextprotocol.io/).

The **Model Context Protocol (MCP)** lets applications provide context to LLMs in a standardized way, separating the concern of *providing* context from the LLM interaction itself.

This is the official Python SDK for it. With it you can:

* **Build MCP servers** that expose tools, resources, and prompts to any MCP host.
* **Build MCP clients** that connect to any MCP server.
* Speak every standard transport: stdio, Streamable HTTP, and SSE.

## Requirements

Python 3.10+.

## Installation

=== "uv"

    ```bash
    uv add "mcp[cli]==2.0.0a3"
    ```

=== "pip"

    ```bash
    pip install "mcp[cli]==2.0.0a3"
    ```

The `[cli]` extra gives you the `mcp` command; you'll want it for development.

!!! warning "Pin the version while v2 is in alpha"
    Installers never select a pre-release unless you name one, so an unpinned `uv add "mcp[cli]"` gives you the latest **v1.x** release, which this documentation does not describe. Check [PyPI](https://pypi.org/project/mcp/#history) for the newest alpha before you copy the line above.

    See [Installation](https://py.sdk.modelcontextprotocol.io/v2/installation/index.md) for the details.

## Example

### Create it

Create a file `server.py`:

```python title="server.py"
# docs_src/index/tutorial001.py
from mcp.server import MCPServer

mcp = MCPServer("Demo")


@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


@mcp.resource("greeting://{name}")
def greeting(name: str) -> str:
    """Greet someone by name."""
    return f"Hello, {name}!"
```

That's a complete MCP server. It exposes one **tool**, `add`, and one templated **resource**, `greeting://{name}`.
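To give a rough feel for how type hints such as `a: int, b: int` can stand in for a hand-written schema, here is a minimal standard-library sketch. It is not the SDK's implementation (the SDK relies on `pydantic` for schema generation); `sketch_input_schema` and `_JSON_TYPES` are hypothetical names introduced only for this illustration, and only a few scalar types are handled.

```python
import inspect
from typing import get_type_hints

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def sketch_input_schema(func):
    """Build a JSON-Schema-like dict from a function's parameter type hints."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": _JSON_TYPES[hints[name]]}
        # A parameter without a default value is treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


schema = sketch_input_schema(add)
print(schema)
```

The resulting dictionary describes two required integer fields, which is the kind of information a host needs to render an input form for the tool.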
### Run it

```console
uv run mcp dev server.py
```

This starts your server and opens the [MCP Inspector](https://github.com/modelcontextprotocol/inspector), an interactive UI for poking at it. Open the URL it prints.

!!! note
    The Inspector is a Node.js app, so `mcp dev` needs `npx` on your `PATH`.

### Try it

In the Inspector, go to **Tools** and call `add` with `a=1`, `b=2`. You get `3` back. ✨

The Inspector built that form (a required integer field for `a`, another for `b`) from your type hints. So will Claude, and every other MCP host.

Now go to **Resources** and read `greeting://World`:

```text
Hello, World!
```

### Recap

Look again at what you did **not** write:

* No JSON Schema. `a: int, b: int` *is* the schema.
* No request parsing, no serialization, no validation code.
* No protocol handling at all.

You wrote two Python functions with type hints and a docstring. The SDK does the rest.

## Where to go next

* The **[Tutorial](https://py.sdk.modelcontextprotocol.io/v2/tutorial/index.md)** walks through everything a server can do, one small step at a time.
* Migrating from v1? Start with the **[Migration Guide](https://py.sdk.modelcontextprotocol.io/v2/migration/index.md)**.
* Hunting for an exact signature? The **[API Reference](https://py.sdk.modelcontextprotocol.io/v2/api/mcp/)** is generated from the source.
* Reading with an LLM? This documentation is also published in the [llms.txt](https://llmstxt.org/) format: [llms.txt](https://py.sdk.modelcontextprotocol.io/v2/llms.txt) is an index of the pages, and [llms-full.txt](https://py.sdk.modelcontextprotocol.io/v2/llms-full.txt) contains every page in a single file.

# Installation

Source: https://py.sdk.modelcontextprotocol.io/v2/installation/

The Python SDK is on PyPI as [`mcp`](https://pypi.org/project/mcp/). It requires **Python 3.10+**.
These docs describe **v2**, which is in alpha, so the version pin is not optional yet:

=== "uv"

    ```bash
    uv add "mcp[cli]==2.0.0a3"
    ```

=== "pip"

    ```bash
    pip install "mcp[cli]==2.0.0a3"
    ```

!!! warning "Why the pin"
    Installers never select a pre-release unless you name one, so an unpinned `uv add "mcp[cli]"` gives you the latest **v1.x** release, which these docs do not describe. Check the [release history](https://pypi.org/project/mcp/#history) for the newest alpha before you copy the line above.

The same applies to one-off commands: `uv run --with "mcp==2.0.0a3" ...`, not `uv run --with mcp ...`.

If your *package* depends on `mcp`, add a `<2` upper bound (for example `mcp>=1.27,<2`) before the stable v2 lands so the major version bump doesn't surprise you.

## What gets installed

You don't need to know any of this to use the SDK, but if you're wondering what each dependency is for:

* `mcp-types`: every protocol type (requests, results, content blocks) as its own package, versioned in lockstep with the SDK. Every `from mcp_types import ...` in these docs is this package.
* [`anyio`](https://anyio.readthedocs.io/): the async runtime. The whole SDK is written against anyio, so it runs on either `asyncio` or `trio`.
* [`pydantic`](https://docs.pydantic.dev/): what every `mcp_types` model is built on, plus all schema generation and validation.
* [`pydantic-settings`](https://docs.pydantic.dev/latest/concepts/pydantic_settings/): server configuration via `MCP_*` environment variables and `.env` files.
* [`httpx`](https://www.python-httpx.org/) and [`httpx-sse`](https://pypi.org/project/httpx-sse/): the HTTP client behind the Streamable HTTP and SSE *client* transports.
* [`starlette`](https://www.starlette.io/), [`uvicorn`](https://www.uvicorn.org/), [`sse-starlette`](https://pypi.org/project/sse-starlette/), and [`python-multipart`](https://pypi.org/project/python-multipart/): the HTTP *server* transports.
* [`jsonschema`](https://pypi.org/project/jsonschema/): validates a tool's structured output against its declared output schema.
* [`pyjwt[crypto]`](https://pyjwt.readthedocs.io/): OAuth token handling for authorization.
* [`opentelemetry-api`](https://opentelemetry-python.readthedocs.io/): just the lightweight API, so the SDK's tracing middleware costs nothing unless you install an OpenTelemetry SDK and exporter yourself.
* [`typing-extensions`](https://typing-extensions.readthedocs.io/) and [`typing-inspection`](https://pypi.org/project/typing-inspection/): modern typing features on Python 3.10.
* [`pywin32`](https://pypi.org/project/pywin32/): Windows only, used for `stdio` subprocess management.

## Optional extras

* `mcp[cli]` adds [`typer`](https://typer.tiangolo.com/) and [`python-dotenv`](https://pypi.org/project/python-dotenv/) for the `mcp` command-line tool (`mcp dev`, `mcp run`, `mcp install`). You'll want this during development; you may not need it in a deployed server.
* `mcp[rich]` adds [`rich`](https://rich.readthedocs.io/) for nicer server logs.

# Migration Guide

Source: https://py.sdk.modelcontextprotocol.io/v2/migration/

This guide covers the breaking changes introduced in v2 of the MCP Python SDK and how to update your code.

## Overview

Version 2 of the MCP Python SDK introduces several breaking changes to improve the API, align with the MCP specification, and provide better type safety.

## Breaking Changes

### `MCPServer.call_tool()` returns `CallToolResult`

`MCPServer.call_tool()` now returns a `CallToolResult` (or an `InputRequiredResult` when a multi-round tool requests further input). It previously advertised `Sequence[ContentBlock] | dict[str, Any]` and leaked the internal conversion shapes (a bare content sequence or a `(content, structured_content)` tuple), forcing callers to re-assemble a `CallToolResult` themselves.

If you call `MCPServer.call_tool()` directly, read `.content` and `.structured_content` off the returned `CallToolResult` instead of branching on the result type.

### `MCPServer.get_prompt()` and `read_resource()` may return `InputRequiredResult`

Like `call_tool()` above, `MCPServer.get_prompt()` now returns `GetPromptResult | InputRequiredResult` and `MCPServer.read_resource()` returns `Iterable[ReadResourceContents] | InputRequiredResult`: under protocol 2026-07-28 an `@mcp.prompt()` function or an `@mcp.resource()` template function may answer with an `InputRequiredResult` to request client input first (see [Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)). If you call these methods directly, narrow with `isinstance` (or `assert not isinstance(result, InputRequiredResult)` when your prompt and resource functions never return one). `Prompt.render()` and `ResourceTemplate.create_resource()` carry the same union.

`ctx.read_resource()` inside a handler is unchanged: it still returns content, and raises `RuntimeError` if the resource requests input. A handler that wants to receive the `InputRequiredResult` and forward it as its own result calls `MCPServer.read_resource(uri, context)` directly, but not from a tool whose dependencies elicit via `Resolve(...)`: the resolver owns that tool's `request_state` channel, and a forwarded result's state would clobber it.

### `MCPError` raised from an `@mcp.tool()` handler now surfaces as a JSON-RPC error

Raising `MCPError` (or a subclass such as `UrlElicitationRequiredError`) inside an `@mcp.tool()` handler now produces a top-level JSON-RPC error response with the raised `code`, `message`, and `data` intact. Previously the tool wrapper caught it like any other exception and returned `CallToolResult(isError=True)`, which discarded the error code and structured `data`.

`MCPError` carries `ErrorData` and is the SDK's protocol-error type; raise it when the request itself should be rejected (missing client capability, elicitation required, invalid parameters). For tool *execution* failures the calling LLM should see and react to, raise any other exception or return `CallToolResult(is_error=True, ...)` directly; that path is unchanged.

### `streamablehttp_client` removed

The deprecated `streamablehttp_client` function has been removed. Use `streamable_http_client` instead.

**Before (v1):**

```python
from mcp.client.streamable_http import streamablehttp_client

async with streamablehttp_client(
    url="http://localhost:8000/mcp",
    headers={"Authorization": "Bearer token"},
    timeout=30,
    sse_read_timeout=300,
    auth=my_auth,
) as (read_stream, write_stream, get_session_id):
    ...
```

**After (v2):**

```python
import httpx
from mcp.client.streamable_http import streamable_http_client

# Configure headers, timeout, and auth on the httpx.AsyncClient
http_client = httpx.AsyncClient(
    headers={"Authorization": "Bearer token"},
    timeout=httpx.Timeout(30, read=300),
    auth=my_auth,
    follow_redirects=True,
)

async with http_client:
    async with streamable_http_client(
        url="http://localhost:8000/mcp",
        http_client=http_client,
    ) as (read_stream, write_stream):
        ...
```

v1's internal client set `follow_redirects=True`; set it explicitly when supplying your own `httpx.AsyncClient` to preserve that behavior.

### OAuth `callback_handler` returns `AuthorizationCodeResult`

The `callback_handler` passed to `OAuthClientProvider` now returns an `AuthorizationCodeResult` instead of a `tuple[str, str | None]` of `(code, state)`.
The new object adds an `iss` field so the client can validate the [RFC 9207](https://datatracker.ietf.org/doc/html/rfc9207) authorization-response issuer ([SEP-2468](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2468)): when the redirect carries an `iss` query parameter it must match the authorization server's issuer, and a missing `iss` is rejected when the server advertised `authorization_response_iss_parameter_supported`.

**Before (v1):**

```python
from urllib.parse import parse_qs, urlparse


# wait_for_redirect() stands in for your own code that returns the redirect URL.
async def callback_handler() -> tuple[str, str | None]:
    params = parse_qs(urlparse(await wait_for_redirect()).query)
    return params["code"][0], params.get("state", [None])[0]
```

**After (v2):**

```python
from urllib.parse import parse_qs, urlparse

from mcp.client.auth import AuthorizationCodeResult


# wait_for_redirect() stands in for your own code that returns the redirect URL.
async def callback_handler() -> AuthorizationCodeResult:
    params = parse_qs(urlparse(await wait_for_redirect()).query)
    return AuthorizationCodeResult(
        code=params["code"][0],
        state=params.get("state", [None])[0],
        iss=params.get("iss", [None])[0],
    )
```

Forward the `iss` query parameter from the redirect so the validation can run: omitting it makes the flow fail with `OAuthFlowError` against servers that advertise `authorization_response_iss_parameter_supported`, and silently skips the check for servers that send `iss` without advertising it.

### `get_session_id` callback removed from `streamable_http_client`

The `get_session_id` callback (third element of the returned tuple) has been removed from `streamable_http_client`. The function now returns a 2-tuple `(read_stream, write_stream)` instead of a 3-tuple.

If you need to capture the session ID (e.g., for session resumption testing), you can use httpx event hooks to capture it from the response headers:

**Before (v1):**

```python
from mcp import ClientSession
from mcp.client.streamable_http import streamable_http_client

async with streamable_http_client(url) as (read_stream, write_stream, get_session_id):
    async with ClientSession(read_stream, write_stream) as session:
        await session.initialize()
        session_id = get_session_id()  # Get session ID via callback
```

**After (v2):**

```python
import httpx
from mcp import ClientSession
from mcp.client.streamable_http import streamable_http_client

# Option 1: Simply ignore if you don't need the session ID
async with streamable_http_client(url) as (read_stream, write_stream):
    async with ClientSession(read_stream, write_stream) as session:
        await session.initialize()

# Option 2: Capture session ID via httpx event hooks if needed
captured_session_ids: list[str] = []


async def capture_session_id(response: httpx.Response) -> None:
    session_id = response.headers.get("mcp-session-id")
    if session_id:
        captured_session_ids.append(session_id)


http_client = httpx.AsyncClient(
    event_hooks={"response": [capture_session_id]},
    follow_redirects=True,
)

async with http_client:
    async with streamable_http_client(url, http_client=http_client) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            session_id = captured_session_ids[0] if captured_session_ids else None
```

### `StreamableHTTPTransport` parameters removed

The `headers`, `timeout`, `sse_read_timeout`, and `auth` parameters have been removed from `StreamableHTTPTransport`. Configure these on the `httpx.AsyncClient` instead (see example above).

Note: `sse_client` retains its `headers`, `timeout`, `sse_read_timeout`, and `auth` parameters; only the streamable HTTP transport changed.

### `StreamableHTTPTransport.protocol_version` attribute removed

The transport no longer holds per-connection protocol state; era-dependent headers (e.g. `MCP-Protocol-Version`) are now supplied per-message by the session. If you were reading `transport.protocol_version` to learn the negotiated version, read `session.protocol_version` (or `client.protocol_version` on the high-level `Client`) instead.

The `MCP_PROTOCOL_VERSION` header-name constant has moved: import `MCP_PROTOCOL_VERSION_HEADER` from `mcp.shared.inbound` instead of `MCP_PROTOCOL_VERSION` from `mcp.client.streamable_http`.

### `terminate_windows_process` removed

The deprecated `mcp.os.win32.utilities.terminate_windows_process` function has been removed. Process termination is handled internally by the `stdio_client` context manager; there is no replacement API.

The Windows tree-termination helper `terminate_windows_process_tree` no longer accepts a `timeout_seconds` argument; the value was never used (Job Object termination is immediate).

### `stdio_client` no longer kills children of a gracefully-exited server on POSIX

When a server exits on its own after `stdio_client` closes its stdin, background child processes the server leaves behind are no longer killed on POSIX: their lifetime is the server's business. The old behavior was a side effect of a shutdown wait gated on the stdio pipes closing rather than on process exit: a child holding an inherited pipe made a well-behaved server look hung, so its whole process tree was killed. (That gating is an asyncio behavior specific to Python 3.11+; on Python 3.10 and the trio backend the old wait already resolved on process exit, so the spurious kill never fired there.) A server that does not exit within the grace period is still terminated along with its entire process group. On Windows, children stay in the server's Job Object and are still killed at shutdown, now deterministically when the job handle is closed, rather than whenever the handle happened to be garbage-collected.

If you relied on `stdio_client` killing everything the server spawned, make the server terminate its own children on shutdown (its stdin reaching EOF is the shutdown signal), or clean up the process tree from the host application after `stdio_client` exits.

Two related shutdown refinements: `stdio_client` now closes its end of the pipes deterministically at shutdown, so a surviving child that keeps writing to an inherited stdout receives `EPIPE`/`SIGPIPE` once the client is gone (previously the pipe lingered until garbage collection); and a failed write to a server that is still running now surfaces as a closed connection (`CONNECTION_CLOSED`) on the read side instead of leaving requests waiting indefinitely.

`terminate_posix_process_tree` now requires the process to lead its own process group (spawned with `start_new_session=True`); the `getpgid()` lookup and the per-process terminate/kill fallback are gone.

The win32 utilities logger is now named `mcp.os.win32.utilities` (was `client.stdio.win32`).

### WebSocket transport removed

The WebSocket transport has been removed: `mcp.client.websocket.websocket_client`, `mcp.server.websocket.websocket_server`, and the `ws` optional dependency extra (`mcp[ws]`) no longer exist. WebSocket was never part of the MCP specification. Use the streamable HTTP transport instead (`mcp.client.streamable_http.streamable_http_client` on the client, `streamable_http_app()` on the server), which supports bidirectional communication with server-to-client streaming over standard HTTP.

### `mcp.types` moved to the `mcp-types` package

The protocol wire types now live in a standalone distribution, `mcp-types`, imported as `mcp_types`. Its only runtime dependencies are `pydantic` and `typing-extensions`, so code that just needs to (de)serialize MCP traffic can install it without the full SDK. The `mcp` package depends on `mcp-types` and continues to re-export the type names at the top level, so `from mcp import Tool` is unchanged.
Only the `mcp.types` submodule and `mcp.shared.version` were removed. The package's API reference is at [`mcp_types`](https://py.sdk.modelcontextprotocol.io/v2/api/mcp_types/).

**Why:** keeping the wire types in their own package lets tooling and lightweight clients depend on the protocol schema without pulling in `httpx`, `starlette`, `uvicorn`, and the rest of the server/transport stack.

**Before (v1):**

```python
from mcp.types import Tool, Resource
from mcp.shared.version import LATEST_PROTOCOL_VERSION
```

**After (v2):**

```python
from mcp_types import Tool, Resource
from mcp_types.version import LATEST_PROTOCOL_VERSION

# Names `mcp` already re-exported at the top level are unchanged:
from mcp import Tool, Resource
```

### Removed type aliases and classes

The following deprecated type aliases and classes have been removed from `mcp_types`:

| Removed | Replacement |
|---------|-------------|
| `Content` | `ContentBlock` |
| `ResourceReference` | `ResourceTemplateReference` |
| `Cursor` | Use `str` directly |
| `MethodT` | Internal TypeVar, not intended for public use |
| `RequestParamsT` | Internal TypeVar, not intended for public use |
| `NotificationParamsT` | Internal TypeVar, not intended for public use |

**Before (v1):**

```python
from mcp.types import Content, ResourceReference, Cursor
```

**After (v2):**

```python
from mcp_types import ContentBlock, ResourceTemplateReference

# Use `str` instead of `Cursor` for pagination cursors
```

### Field names changed from camelCase to snake_case

All Pydantic model fields in `mcp_types` now use snake_case names for Python attribute access. The JSON wire format is unchanged: serialization still uses camelCase via Pydantic aliases.

**Before (v1):**

```python
result = await session.call_tool("my_tool", {"x": 1})
if result.isError:
    ...

tools = await session.list_tools()
cursor = tools.nextCursor
schema = tools.tools[0].inputSchema
```

**After (v2):**

```python
result = await session.call_tool("my_tool", {"x": 1})
if result.is_error:
    ...

tools = await session.list_tools()
cursor = tools.next_cursor
schema = tools.tools[0].input_schema
```

Common renames:

| v1 (camelCase) | v2 (snake_case) |
|----------------|-----------------|
| `inputSchema` | `input_schema` |
| `outputSchema` | `output_schema` |
| `isError` | `is_error` |
| `nextCursor` | `next_cursor` |
| `mimeType` | `mime_type` |
| `structuredContent` | `structured_content` |
| `serverInfo` | `server_info` |
| `protocolVersion` | `protocol_version` |
| `uriTemplate` | `uri_template` |
| `listChanged` | `list_changed` |
| `progressToken` | `progress_token` |

Because `populate_by_name=True` is set, the old camelCase names still work as constructor kwargs (e.g., `Tool(inputSchema={...})` is accepted), but attribute access must use snake_case (`tool.input_schema`).

### Server handler results are validated against the protocol schema

Results returned from server handlers are now validated against the negotiated protocol version's schema before being sent. A result that does not conform raises on the server side and the client receives an `INTERNAL_ERROR` response. The case most existing code will hit is `Tool.input_schema` (`inputSchema` on the wire): the spec requires it to contain `"type": "object"`, so an empty `{}` is now rejected.

### Client validates inbound traffic against the protocol schema

`ClientSession` now validates server requests, notifications, and results against the negotiated protocol version's schema before parsing them into `mcp_types` models. Spec-invalid server output that the previous monolithic parse tolerated may now raise `pydantic.ValidationError` from `list_tools()`, `call_tool()`, and similar calls. `_meta` remains the sanctioned place for result extras (and `experimental` for capability extras).
### `args` parameter removed from `ClientSessionGroup.call_tool()`

The deprecated `args` parameter has been removed from `ClientSessionGroup.call_tool()`. Use `arguments` instead.

**Before (v1):**

```python
result = await session_group.call_tool("my_tool", args={"key": "value"})
```

**After (v2):**

```python
result = await session_group.call_tool("my_tool", arguments={"key": "value"})
```

### `cursor` parameter removed from `ClientSession` list methods

The deprecated `cursor` parameter has been removed from the following `ClientSession` methods:

- `list_resources()`
- `list_resource_templates()`
- `list_prompts()`
- `list_tools()`

Use `params=PaginatedRequestParams(cursor=...)` instead.

**Before (v1):**

```python
result = await session.list_resources(cursor="next_page_token")
result = await session.list_tools(cursor="next_page_token")
```

**After (v2):**

```python
from mcp_types import PaginatedRequestParams

result = await session.list_resources(params=PaginatedRequestParams(cursor="next_page_token"))
result = await session.list_tools(params=PaginatedRequestParams(cursor="next_page_token"))
```

### `ClientSession.get_server_capabilities()` replaced by era-neutral accessors

`ClientSession` now exposes the negotiated server metadata as properties: `server_capabilities`, `server_info`, `instructions`, and `protocol_version`. These are populated by whichever connection step ran (`initialize()` for ≤2025-11-25 servers, `discover()` for 2026-07-28+), and are `None` if none has, matching v1's `get_server_capabilities()`. The `get_server_capabilities()` method has been removed.

**Before (v1):**

```python
capabilities = session.get_server_capabilities()
# server_info, instructions, protocol_version were not stored;
# you had to capture the initialize() return value
```

**After (v2):**

```python
capabilities = session.server_capabilities
server_info = session.server_info
instructions = session.instructions
version = session.protocol_version
```

The raw handshake result is also retained: `session.initialize_result` is set after `initialize()` (≤2025-11-25 servers, including `stateless_http=True` servers, which still answer `initialize`); `session.discover_result` is set after `discover()` (2026-07-28+ servers). At most one is non-`None`.

On the high-level `Client`, `client.server_capabilities`, `client.server_info`, and `client.protocol_version` are non-nullable inside the context manager. `client.instructions` remains `str | None` since the server may omit it. (The lowlevel `ClientSession` still lets you call methods before any handshake, as in v1; `Client` always connects on enter: by default it probes `server/discover` and falls back to the initialize handshake.)

### `Client` defaults to `mode='auto'`

In v1, connecting to a server always performed the `initialize` handshake. In v2, `Client` defaults to `mode='auto'`: on enter it probes `server/discover` and, if the server doesn't support it, falls back to the `initialize` handshake. Pass `mode='legacy'` to force the initialize handshake and reproduce v1's pre-2026 behavior byte for byte, or pass a modern protocol-version string (e.g. `mode='2026-07-28'`) to pin a version without probing.

For an in-process `Client(server)` (where `server` is a `Server` or `MCPServer` instance), `mode='auto'` dispatches calls directly through `DirectDispatcher` with no JSON-RPC framing. Pass `mode='legacy'` if you need the in-memory JSON-RPC transport that v1 used.

`Client.send_ping()` is deprecated (ping is removed in 2026-07-28); pin `mode='legacy'` if you need it.
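The three `mode` values described above can be summarized as a small decision table. The function below is a toy model written for this guide, not SDK code: `plan_connection` and its return shape are hypothetical, and it only records which handshake steps the text says are attempted.

```python
def plan_connection(mode: str, server_supports_discover: bool) -> dict:
    """Toy model of the handshake steps a client attempts for a given mode."""
    if mode == "legacy":
        # Force the initialize handshake, never probing.
        return {"attempted": ["initialize"], "pinned_version": None}
    if mode == "auto":
        # Probe server/discover first; fall back to initialize if unsupported.
        attempted = ["server/discover"]
        if not server_supports_discover:
            attempted.append("initialize")
        return {"attempted": attempted, "pinned_version": None}
    # Any other value is treated here as a pinned protocol-version string.
    # No probe is modeled; what is sent in this mode is outside this sketch.
    return {"attempted": [], "pinned_version": mode}


print(plan_connection("auto", server_supports_discover=False))
print(plan_connection("legacy", server_supports_discover=True))
print(plan_connection("2026-07-28", server_supports_discover=True))
```

The practical consequence is that only `mode='auto'` against an older server costs an extra round trip before the `initialize` handshake begins.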
### `InputRequiredResult` handling differs between `Client` and `ClientSession`

For protocol 2026-07-28, `tools/call`, `prompts/get`, and `resources/read` may return an `InputRequiredResult` asking the client to supply additional input (sampling, elicitation, roots) and retry.

On the high-level `Client`, `call_tool`, `get_prompt`, and `read_resource` resolve this automatically: they dispatch each requested input to the matching callback (`sampling_callback`, `elicitation_callback`, `list_roots_callback`) and retry until a final result is returned, so the call still returns the bare `CallToolResult` / `GetPromptResult` / `ReadResourceResult`. The round limit is `Client(input_required_max_rounds=...)` (default 10).

Earlier v2 prereleases exposed an `allow_input_required` parameter on these `Client` methods; that parameter has been removed. For manual control use `client.session.call_tool(..., allow_input_required=True)`. Note that `read_timeout_seconds` now bounds each underlying round, not the whole loop; wrap the call in `anyio.fail_after(...)` for a whole-loop bound.

On `ClientSession`, `call_tool` / `get_prompt` / `read_resource` still return the bare result and raise `RuntimeError` if the server requests input. Pass `allow_input_required=True` to receive the `InputRequiredResult` instead, then drive the loop yourself with `input_responses=` / `request_state=`. `ClientSessionGroup.call_tool` accepts the same flag.

### `call_tool` mirrors `x-mcp-header` arguments into `Mcp-Param-*` headers ([SEP-2243](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2243))

For protocol 2026-07-28 over Streamable HTTP, a tool's input-schema property may carry an `x-mcp-header` annotation.
When a tool the client has listed is called, each annotated argument is mirrored into an `Mcp-Param-*` request header (string verbatim, integer as decimal, boolean as `true`/`false`, base64-sentinel-wrapped when not header-safe; `null`/absent arguments, and values with no scalar rendering such as objects or arrays, are omitted). The argument is also left in the request body. `list_tools` caches a tool's annotations, so list a tool before calling it to enable mirroring; a tool the client never listed emits no `Mcp-Param-*` headers. Other transports ignore the annotation.

### Servers validate `Mcp-Param-*` headers against the request body ([SEP-2243](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2243))

The server half of the same contract: on the 2026-07-28 Streamable HTTP path, a `tools/call` whose tool declares `x-mcp-header` annotations is validated before dispatch: each annotated argument and its mirroring `Mcp-Param-*` header must be present together and agree (after base64-sentinel decoding; integers compare numerically), or absent together. A violation is rejected with HTTP 400 and JSON-RPC error `-32020` (`HeaderMismatch`), as the spec requires. A client that sends an annotated argument *without* its header (for example, one that never listed the tool) is therefore rejected instead of silently served; the spec's recovery is to re-list and retry.

There is nothing to configure. The server resolves the called tool's schema through its own registered `tools/list` handler (for `MCPServer`, the built-in one), so the validated catalog is exactly what that caller would be shown.
Two consequences worth knowing: the listing runs internally on validated calls, so middleware and an expensive or paginated `tools/list` handler see extra invocations; and validation is skipped (never failing the call) when no `tools/list` handler is registered, the tool isn't in the listing, the handler raises (logged as an error), or the call has no arguments and no `Mcp-Param-*` headers. Headers with no matching annotation are ignored; a recognized header supplied more than once is rejected, as is a duplicated `MCP-Protocol-Version`, `Mcp-Method`, or `Mcp-Name` line. The codec and validator are public in `mcp.shared.inbound` (`decode_header_value`, `validate_mcp_param_headers`) for low-level servers hosting their own HTTP entry.

Base64-sentinel decoding is strict everywhere it applies, including the `Mcp-Name` header: a `=?base64?...?=` value whose payload is not canonical base64 (wrong padding, stray characters, non-zero trailing bits) or not valid UTF-8 is rejected as malformed rather than leniently decoded.

### `Client` verbs may serve cached responses ([SEP-2549](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2549))

On protocol 2026-07-28, servers attach caching hints (`ttlMs`, `cacheScope`) to the cacheable results, and `Client` now honors them: `list_tools`, `list_prompts`, `list_resources`, `list_resource_templates`, and `read_resource` may serve a cached response instead of making a round trip, for as long as the server's `ttlMs` says the result is fresh. With the default configuration, servers that send no hints, including every pre-2026 server, see identical call-for-call behavior, because hint-less results are not cached (a `CacheConfig.default_ttl_ms` above zero caches them too). Pass `Client(..., cache=False)` to disable the cache and restore v1 behavior exactly; per-call control (`cache_mode`) and configuration (`CacheConfig`) are described in [Caching hints](https://py.sdk.modelcontextprotocol.io/v2/advanced/caching/index.md).
### Server extensions API ([SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2133))

`MCPServer` now accepts opt-in extensions that bundle MCP behaviour behind a reverse-DNS identifier and advertise it under `ServerCapabilities.extensions` (the 2026-07-28 capability map). An extension subclasses `mcp.server.extension.Extension` and overrides only the contribution methods it needs: `tools()`/`resources()`/`methods()` (additive) and `intercept_tool_call()` (wraps `tools/call`). The `identifier` must be a `vendor-prefix/name` string following the spec's `_meta` key grammar; a class-level `identifier` is validated when the subclass is defined, one assigned in `__init__` when the extension is registered. Pass instances at construction:

```python
from mcp.server.mcpserver import MCPServer
from mcp.server.apps import Apps

mcp = MCPServer("demo", extensions=[Apps()])
```

The reference extension is `mcp.server.apps.Apps` (`io.modelcontextprotocol/ui`): it binds a tool to a `ui://` UI resource via `_meta.ui.resourceUri`, and `client_supports_apps(ctx)` gates the SEP-2133 text-only fallback: `True` only when the client's ui-extension settings list the `text/html;profile=mcp-app` MIME type, per the Apps spec's required `mimeTypes` field. Every `@apps.tool(resource_uri=...)` must have a matching resource registered on the same `Apps` instance (`add_html_resource` for inline HTML, `add_resource` for a pre-built `Resource`); a tool bound to an unregistered URI raises at `MCPServer(...)` construction rather than 404ing on `resources/read` at runtime.

Extension methods are strictly additive: a `MethodBinding` cannot name a spec-defined request method, and registering one whose method collides with another handler raises at construction. A `MethodBinding` may set `protocol_versions` to scope an extension method to specific wire versions (`frozenset()` is rejected; use `None` to admit every version); a request at any other version is `METHOD_NOT_FOUND`.
An extension handler can call `mcp.server.mcpserver.require_client_extension(ctx, identifier)` to reject a request with the `-32021` (missing required client capability) error when the client did not declare the extension.

On the client, `Client(extensions=...)` takes a sequence of `mcp.client.ClientExtension` instances. A client extension contributes its capability ad (mirrored into `ClientCapabilities.extensions`), its result claims (extra `tools/call` result shapes that `Client.call_tool` resolves transparently through the claim's resolver), and its notification bindings (handlers for vendor server notifications). The capability map rides `server/discover` and every modern request's `_meta` envelope; a legacy `initialize` handshake carries only the claim-less identifiers, since claimed result shapes cannot be delivered on a legacy wire. Extensions are off by default and never alter behaviour unless registered. (The low-level `ClientSession(extensions=...)` keeps the raw identifier-to-settings dict.)

Changed in the v2 pre-releases: earlier alphas took `Client(extensions={identifier: settings})`, an advertisement-only dict. Extensions now contribute behaviour (claims and notification handlers), not just an ad, so the argument is a sequence of declaration objects. An ad-only entry becomes an `advertise()` call:

**Before (v2 alphas):**

```python
client = Client(server, extensions={"com.example/ui": {"mimeTypes": [...]}})
```

**After:**

```python
from mcp.client import advertise

client = Client(server, extensions=[advertise("com.example/ui", {"mimeTypes": [...]})])
```

`advertise()` is only for identifiers with no client-side behaviour. For a behavioural extension (e.g. tasks, once its extension ships), construct that extension's object instead; advertising an identifier you do not implement asserts wire support you don't have.

### `McpError` renamed to `MCPError`

The `McpError` exception class has been renamed to `MCPError` for consistent naming with the MCP acronym style used throughout the SDK.

**Before (v1):**

```python
from mcp.shared.exceptions import McpError

try:
    result = await session.call_tool("my_tool")
except McpError as e:
    print(f"Error: {e.error.message}")
```

**After (v2):**

```python
from mcp.shared.exceptions import MCPError

try:
    result = await session.call_tool("my_tool")
except MCPError as e:
    print(f"Error: {e.message}")
```

`MCPError` is also exported from the top-level `mcp` package:

```python
from mcp import MCPError
```

The constructor signature also changed: it now takes `code`, `message`, and optional `data` directly instead of wrapping an `ErrorData`:

**Before (v1):**

```python
from mcp.shared.exceptions import McpError
from mcp.types import ErrorData, INVALID_REQUEST

raise McpError(ErrorData(code=INVALID_REQUEST, message="bad input"))
```

**After (v2):**

```python
from mcp.shared.exceptions import MCPError
from mcp_types import INVALID_REQUEST

raise MCPError(INVALID_REQUEST, "bad input")

# or, if you already have an ErrorData:
raise MCPError.from_error_data(error_data)
```

### `FastMCP` renamed to `MCPServer`

The `FastMCP` class has been renamed to `MCPServer` to better reflect its role as the main server class in the SDK. This is a simple rename with no functional changes to the class itself.

**Before (v1):**

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Demo")
```

**After (v2):**

```python
from mcp.server.mcpserver import MCPServer, Context

mcp = MCPServer("Demo")
```

`Context` is the type annotation for the `ctx` parameter injected into tools, resources, and prompts (see [`get_context()` removed](#mcpserverget_context-removed) below).

All submodules under `mcp.server.fastmcp.*` are now under `mcp.server.mcpserver.*` with the same structure.

Common imports:

- `Image`, `Audio`: from `mcp.server.mcpserver` (or `.utilities.types`)
- `UserMessage`, `AssistantMessage`: from `mcp.server.mcpserver.prompts.base`
- `ToolError`, `ResourceError`: from `mcp.server.mcpserver.exceptions`

### `mount_path` parameter removed from MCPServer

The `mount_path` parameter has been removed from `MCPServer.__init__()`, `MCPServer.run()`, `MCPServer.run_sse_async()`, and `MCPServer.sse_app()`. It was also removed from the `Settings` class.

This parameter was redundant because the SSE transport already handles sub-path mounting via ASGI's standard `root_path` mechanism. When using Starlette's `Mount("/path", app=mcp.sse_app())`, Starlette automatically sets `root_path` in the ASGI scope, and the `SseServerTransport` uses this to construct the correct message endpoint path.

### Transport-specific parameters moved from MCPServer constructor to run()/app methods

Transport-specific parameters have been moved from the `MCPServer` constructor to the `run()`, `sse_app()`, and `streamable_http_app()` methods. This provides better separation of concerns: the constructor now only handles server identity and authentication, while transport configuration is passed when starting the server.

**Parameters moved:**

- `host`, `port` - HTTP server binding
- `sse_path`, `message_path` - SSE transport paths
- `streamable_http_path` - StreamableHTTP endpoint path
- `json_response`, `stateless_http` - StreamableHTTP behavior
- `event_store`, `retry_interval` - StreamableHTTP event handling
- `transport_security` - DNS rebinding protection

**Before (v1):**

```python
from mcp.server.fastmcp import FastMCP

# Transport params in constructor
mcp = FastMCP("Demo", json_response=True, stateless_http=True)
mcp.run(transport="streamable-http")

# Or for SSE
mcp = FastMCP("Server", host="0.0.0.0", port=9000, sse_path="/events")
mcp.run(transport="sse")
```

**After (v2):**

```python
from mcp.server.mcpserver import MCPServer

# Transport params passed to run()
mcp = MCPServer("Demo")
mcp.run(transport="streamable-http", json_response=True, stateless_http=True)

# Or for SSE
mcp = MCPServer("Server")
mcp.run(transport="sse", host="0.0.0.0", port=9000, sse_path="/events")
```

**For mounted apps:**

When mounting in a Starlette app, pass transport params to the app methods:

```python
from starlette.applications import Starlette
from starlette.routing import Mount

# Before (v1)
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("App", json_response=True)
app = Starlette(routes=[Mount("/", app=mcp.streamable_http_app())])

# After (v2)
from mcp.server.mcpserver import MCPServer

mcp = MCPServer("App")
app = Starlette(routes=[Mount("/", app=mcp.streamable_http_app(json_response=True))])
```

**Note:** DNS rebinding protection is automatically enabled when `host` is `127.0.0.1`, `localhost`, or `::1`. This now happens in `sse_app()` and `streamable_http_app()` instead of the constructor.

If you were mutating these via `mcp.settings` after construction (e.g., `mcp.settings.port = 9000`), pass them to `run()` / `sse_app()` / `streamable_http_app()` instead; these fields no longer exist on `Settings`. The `debug` and `log_level` parameters remain on the constructor.

### Streamable HTTP: lifespan now entered once at manager startup

When serving streamable HTTP (stateful or `stateless_http=True`), the server's `lifespan` context manager is now entered once when `StreamableHTTPSessionManager.run()` starts, and the resulting state is shared across all sessions and requests. Previously each session (stateful) or each request (stateless) entered and exited `lifespan` independently.

Lifespans that set up process-wide state (connection pools, caches, background tasks) are unaffected: they now run once instead of per session/request. If your lifespan was acquiring per-connection resources, move that acquisition into the handler body; per-connection cleanup belongs on the connection's `exit_stack` (the public surface for reaching it from high-level `@mcp.tool()` handlers is being finalised as part of the public-surface review).

### `Server.run()` no longer takes a `stateless` flag; `StatelessModeNotSupported` removed

The `stateless: bool` parameter on the lowlevel `Server.run()` has been removed. Stateless serving is now a property of how the connection is constructed (the streamable-HTTP manager builds a born-ready `Connection` per request), not a flag the loop driver inspects.

`StatelessModeNotSupported` has been removed. Server-initiated requests that have no channel to travel on now raise `NoBackChannelError` (an `MCPError` subclass), the same exception regardless of why the channel is absent. If you were catching `StatelessModeNotSupported`, catch `NoBackChannelError` instead.

### `MCPServer.get_context()` removed

`MCPServer.get_context()` has been removed. Context is now injected by the framework and passed explicitly; there is no ambient `ContextVar` to read from.

**If you were calling `get_context()` from inside a tool/resource/prompt:** use the `ctx: Context` parameter injection instead.

**Before (v1):**

```python
@mcp.tool()
async def my_tool(x: int) -> str:
    ctx = mcp.get_context()
    await ctx.info("Processing...")
    return str(x)
```

**After (v2):**

```python
from mcp.server.mcpserver import Context


@mcp.tool()
async def my_tool(x: int, ctx: Context) -> str:
    await ctx.info("Processing...")
    return str(x)
```

### `MCPServer.call_tool()`, `read_resource()`, `get_prompt()` now accept a `context` parameter

`MCPServer.call_tool()`, `MCPServer.read_resource()`, and `MCPServer.get_prompt()` now accept an optional `context: Context | None = None` parameter. The framework passes this automatically during normal request handling. If you call these methods directly and omit `context`, a Context with no active request is constructed for you: tools that don't use `ctx` work normally, but any attempt to use `ctx.session`, `ctx.request_id`, etc. will raise.

The internal layers (`ToolManager.call_tool`, `Tool.run`, `Prompt.render`, `ResourceTemplate.create_resource`, etc.) now require `context` as a positional argument.

### Resource not found returns `-32602` and resource lookups raise typed exceptions (SEP-2164)

Reading a missing resource now returns JSON-RPC error code `-32602` (invalid params) with the requested URI in `error.data` (`{"uri": ...}`), per [SEP-2164](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2164). Previously the server returned code `0` with no `data`. Clients can now reliably distinguish not-found from other errors; a template handler that raises `ResourceNotFoundError` (from `mcp.server.mcpserver.exceptions`) produces this same response.

The underlying lookups now raise typed exceptions instead of `ValueError`. `ResourceManager.get_resource()` raises `ResourceNotFoundError` when no resource or template matches the URI, and `ResourceTemplate.create_resource()` raises `ResourceError` when the template function fails.
Neither subclasses `ValueError`, so callers catching `ValueError` should switch to `ResourceNotFoundError` / `ResourceError` (both importable from `mcp.server.mcpserver.exceptions`; `ResourceNotFoundError` subclasses `ResourceError`).

### Resource templates: matching behavior changes

Resource template matching has been rewritten with [RFC 6570](https://datatracker.ietf.org/doc/html/rfc6570) support. Several behaviors have changed:

**Path-safety checks applied by default.** Extracted parameter values containing `..` as a path component, a null byte, or looking like an absolute path (`/etc/passwd`, `C:\Windows`) now cause the read to fail: the client receives an "Unknown resource" error and template iteration stops, so a strict template's rejection does not fall through to a later permissive template. This is checked on the decoded value, so `..%2Fetc`, `%2E%2E`, and `%00` are caught too.

Note that `..` is only flagged as a standalone path component, so values like `v1.0..v2.0` or `HEAD~3..HEAD` are unaffected.

If a parameter legitimately needs to receive absolute paths or traversal sequences, exempt it:

```python
from mcp.server.mcpserver import ResourceSecurity


@mcp.resource(
    "inspect://file/{+target}",
    security=ResourceSecurity(exempt_params={"target"}),
)
def inspect_file(target: str) -> str: ...
```

**Template literals and structural delimiters match exactly.** The previous matcher built a regex without escaping, so `.` matched any character and simple `{var}` swallowed `?`, `#`, `&`, and `,`. Now `data://v1.0/{id}` no longer matches `data://v1X0/42`, and `api://{id}` no longer matches `api://foo?x=1`; use `api://{id}{?x}` to capture the query parameter.

**`{var}` now matches an empty value.** A simple expression captures zero or more characters, so `tickets://{ticket_id}` now matches `tickets://` with `ticket_id=""` (v1.x's `[^/]+` regex required at least one).
This makes `match` round-trip `expand` for empty values (RFC 6570 expands an empty string to nothing), but handlers that assumed a non-empty value should validate it explicitly.

**Template syntax errors surface at decoration time.** Unclosed braces, duplicate variable names, and unsupported syntax raise `InvalidUriTemplate` when the decorator runs rather than `re.error` on first match. Two variables with no literal between them are also rejected, because matching cannot tell where one ends and the next begins, so `{name}{+path}` raises. Write `{name}/{+path}`, or use an operator that emits its own delimiter: `{+path}{.ext}` is fine because the `.` operator contributes a literal `.` between the two.

A handler parameter bound to a query variable in the template's trailing `{?...}`/`{&...}` run (the variables `match()` treats as optional, listed by `UriTemplate.query_variable_names`) must declare a Python default: a client may omit those, so a handler that requires one now raises `ValueError` when the decorator runs instead of failing on the first request that leaves it out. (A `{&...}` expression with no preceding `{?...}` is not in that run: it is matched strictly, may not be omitted, and needs no default.)

**Static URIs with Context-only handlers now error.** A non-template URI paired with a handler that takes only a `Context` parameter previously registered but was silently unreachable (the resource could never be read). This now raises `ValueError` at decoration time. Context injection for static resources is not supported; use a template with at least one variable or access context through other means.

See [URI templates](https://py.sdk.modelcontextprotocol.io/v2/advanced/uri-templates/index.md) for the full template syntax, security configuration, and filesystem safety utilities.

### Registering lowlevel handlers from `MCPServer`

`MCPServer` does not expose public APIs for `subscribe_resource`, `unsubscribe_resource`, or `set_logging_level` handlers.
In v1, the workaround was to reach into the private lowlevel server and use its decorator methods:

**Before (v1):**

```python
@mcp._mcp_server.set_logging_level()  # pyright: ignore[reportPrivateUsage]
async def handle_set_logging_level(level: str) -> None: ...


mcp._mcp_server.subscribe_resource()(handle_subscribe)  # pyright: ignore[reportPrivateUsage]
```

In v2, the lowlevel `Server` supports arbitrary request handlers directly via `add_request_handler` (the decorator methods are gone; handlers are otherwise constructor-only). From `MCPServer`, access it via `_lowlevel_server`:

**After (v2):**

```python
from mcp.server import ServerRequestContext
from mcp_types import EmptyResult, SetLevelRequestParams, SubscribeRequestParams


async def handle_set_logging_level(ctx: ServerRequestContext, params: SetLevelRequestParams) -> EmptyResult:
    ...
    return EmptyResult()


async def handle_subscribe(ctx: ServerRequestContext, params: SubscribeRequestParams) -> EmptyResult:
    ...
    return EmptyResult()


mcp._lowlevel_server.add_request_handler("logging/setLevel", SetLevelRequestParams, handle_set_logging_level)  # pyright: ignore[reportPrivateUsage]
mcp._lowlevel_server.add_request_handler("resources/subscribe", SubscribeRequestParams, handle_subscribe)  # pyright: ignore[reportPrivateUsage]
```

`_lowlevel_server` is private and may change. A public way to register these handlers on `MCPServer` is planned; until then, use this workaround or use the lowlevel `Server` directly.

### `MCPServer`'s `Context` logging: `message` renamed to `data`, `extra` removed

On the high-level `Context` object (`mcp.server.mcpserver.Context`), `log()`, `.debug()`, `.info()`, `.warning()`, and `.error()` now take `data: Any` instead of `message: str`, matching the MCP spec's `LoggingMessageNotificationParams.data` field which allows any JSON-serializable value. The `extra` parameter has been removed; pass structured data directly as `data`.
The lowlevel `ServerSession.send_log_message(data: Any)` already accepted arbitrary data and is unchanged.

`Context.log()` also now accepts all eight [RFC-5424](https://datatracker.ietf.org/doc/html/rfc5424) log levels (`debug`, `info`, `notice`, `warning`, `error`, `critical`, `alert`, `emergency`) via the `LoggingLevel` type, not just the four it previously allowed.

```python
# Before
await ctx.info("Connection failed", extra={"host": "localhost", "port": 5432})
await ctx.log(level="info", message="hello")

# After
await ctx.info({"message": "Connection failed", "host": "localhost", "port": 5432})
await ctx.log(level="info", data="hello")
```

Positional calls (`await ctx.info("hello")`) are unaffected.

### `Context.elicit()` schema gate validates the rendered schema

`Context.elicit()` (and `elicit_with_validation()`) now render the schema first and validate each property against the spec's `PrimitiveSchemaDefinition`, raising `TypeError` at the call site for anything outside it. `Optional[T]` fields render as `{"type": ...}` with the field omitted from `required` (previously the non-spec `anyOf` shape). A bare `list[str]` field is rejected because it renders without the required enum items; use `list[Literal[...]]` or `list[str]` with `json_schema_extra` supplying the items. Unions of multiple primitives (e.g. `int | str`) and nested models are rejected.

A schema-mismatched *accepted* answer also fails differently: the call now raises `ValueError` with a stable message ("Received an accepted elicitation whose content does not match the requested schema") instead of letting pydantic's `ValidationError` escape with its internals. Code that caught `ValidationError` around `ctx.elicit()` should catch `ValueError` (or rely on the tool's error result).

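
As a rough picture of what such a gate checks, the sketch below walks a rendered JSON Schema and raises `TypeError` for any property that is not a flat primitive. It is a deliberately simplified illustration, not the SDK's validator: `check_flat_schema` is a hypothetical name, and the accepted shapes (four primitive types plus arrays whose items carry an `enum`) are an assumption; the spec's `PrimitiveSchemaDefinition` remains the authority.

```python
PRIMITIVE_TYPES = {"string", "number", "integer", "boolean"}


def check_flat_schema(schema: dict) -> None:
    """Hypothetical gate: raise TypeError for non-primitive properties."""
    for name, prop in schema.get("properties", {}).items():
        # Unions (anyOf) and references to nested models are outside the flat shape.
        if "anyOf" in prop or "$ref" in prop:
            raise TypeError(f"{name}: unions and nested models are not allowed")
        if prop.get("type") == "array":
            # Assumed rule: a list is only acceptable when its items are an enum.
            if "enum" not in prop.get("items", {}):
                raise TypeError(f"{name}: array items must declare an enum")
            continue
        if prop.get("type") not in PRIMITIVE_TYPES:
            raise TypeError(f"{name}: unsupported type {prop.get('type')!r}")
```

Failing at the call site, before anything is sent, is the design point: the author of the tool sees the problem instead of the client receiving a schema it cannot render.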
### Replace `RootModel` by union types with `TypeAdapter` validation

The following union types are no longer `RootModel` subclasses:

- `ClientRequest`
- `ServerRequest`
- `ClientNotification`
- `ServerNotification`
- `ClientResult`
- `ServerResult`
- `JSONRPCMessage`

This means you can no longer access `.root` on these types or use `model_validate()` directly on them. Instead, use the provided `TypeAdapter` instances for validation.

**Before (v1):**

```python
from mcp.types import ClientRequest, ServerNotification

# Using RootModel.model_validate()
request = ClientRequest.model_validate(data)
actual_request = request.root  # Accessing the wrapped value

notification = ServerNotification.model_validate(data)
actual_notification = notification.root
```

**After (v2):**

```python
from mcp_types import client_request_adapter, server_notification_adapter

# Using TypeAdapter.validate_python()
request = client_request_adapter.validate_python(data)
# No .root access needed - request is the actual type

notification = server_notification_adapter.validate_python(data)
# No .root access needed - notification is the actual type
```

The same applies when constructing values: the wrapper call is no longer needed.

**Before (v1):**

```python
await session.send_notification(ClientNotification(InitializedNotification()))
await session.send_request(ClientRequest(PingRequest()), EmptyResult)
```

**After (v2):**

```python
await session.send_notification(InitializedNotification())
await session.send_request(PingRequest(), EmptyResult)
```

**Available adapters:**

| Union Type | Adapter |
|------------|---------|
| `ClientRequest` | `client_request_adapter` |
| `ServerRequest` | `server_request_adapter` |
| `ClientNotification` | `client_notification_adapter` |
| `ServerNotification` | `server_notification_adapter` |
| `ClientResult` | `client_result_adapter` |
| `ServerResult` | `server_result_adapter` |
| `JSONRPCMessage` | `jsonrpc_message_adapter` |

All adapters are exported from
`mcp_types`.

### `RequestParams.Meta` replaced with `RequestParamsMeta` TypedDict

The nested `RequestParams.Meta` Pydantic model class has been replaced with a top-level `RequestParamsMeta` TypedDict. This affects the `ctx.meta` field in request handlers and any code that imports or references this type.

**Key changes:**

- `RequestParams.Meta` (Pydantic model) → `RequestParamsMeta` (TypedDict)
- Attribute access (`meta.progress_token`) → Dictionary access (`meta.get("progress_token")`)
- `progress_token` field changed from `ProgressToken | None = None` to `NotRequired[ProgressToken]`

**In request context handlers:**

```python
# Before (v1)
@server.call_tool()
async def handle_tool(name: str, arguments: dict) -> list[TextContent]:
    ctx = server.request_context
    if ctx.meta and ctx.meta.progress_token:
        await ctx.session.send_progress_notification(ctx.meta.progress_token, 0.5, 100)


# After (v2)
async def handle_call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult:
    if ctx.meta and "progress_token" in ctx.meta:
        await ctx.session.send_progress_notification(ctx.meta["progress_token"], 0.5, 100)
    ...


server = Server("my-server", on_call_tool=handle_call_tool)
```

### `RequestContext` type parameters simplified

The `mcp.shared.context` module has been removed. `RequestContext` is now split into `ClientRequestContext` (in `mcp.client.context`) and `ServerRequestContext` (in `mcp.server.context`).

**`RequestContext` changes:**

- The `RequestContext[SessionT, LifespanContextT, RequestT]` generic no longer exists; use `ClientRequestContext` or `ServerRequestContext[LifespanContextT, RequestT]`
- Server-specific fields (`lifespan_context`, `request`, `close_sse_stream`, `close_standalone_sse_stream`) moved to new `ServerRequestContext` class in `mcp.server.context`

**Before (v1):**

```python
from mcp.client.session import ClientSession
from mcp.shared.context import RequestContext, LifespanContextT, RequestT

# RequestContext with 3 type parameters
ctx: RequestContext[ClientSession, LifespanContextT, RequestT]
```

**After (v2):**

```python
from mcp.client.context import ClientRequestContext
from mcp.server.context import ServerRequestContext, LifespanContextT, RequestT

# For client-side context (sampling, elicitation, list_roots callbacks)
ctx: ClientRequestContext

# For server-specific context with lifespan and request types
server_ctx: ServerRequestContext[LifespanContextT, RequestT]
```

`ServerRequestContext` is now a standalone dataclass: it no longer subclasses `RequestContext[ServerSession]`. It carries the same fields (`session`, `request_id`, `meta`, `lifespan_context`, `request`, `close_sse_stream`, `close_standalone_sse_stream`) plus new `protocol_version: str`, `method: str`, and raw `params: Mapping[str, Any] | None` fields (the last two let middleware read and rewrite the inbound message), so handler code is unaffected, but `isinstance(ctx, RequestContext)` checks and `RequestContext[ServerSession]` annotations need updating to `ServerRequestContext`.

The high-level `Context` class (injected into `@mcp.tool()` etc.) similarly dropped its `ServerSessionT` parameter: `Context[ServerSessionT, LifespanContextT, RequestT]` → `Context[LifespanContextT, RequestT]`. Both remaining parameters have defaults, so bare `Context` is usually sufficient:

**Before (v1):**

```python
async def my_tool(ctx: Context[ServerSession, None]) -> str: ...
```

**After (v2):**

```python
async def my_tool(ctx: Context) -> str: ...


# or, with an explicit lifespan type:
async def my_tool(ctx: Context[MyLifespanState]) -> str: ...
```

### Version constants

`SUPPORTED_PROTOCOL_VERSIONS` is deprecated: it's now the union of `HANDSHAKE_PROTOCOL_VERSIONS` (initialize-handshake versions) and `MODERN_PROTOCOL_VERSIONS` (per-request-envelope versions). If you were using it to mean "versions the initialize handshake accepts", switch to `HANDSHAKE_PROTOCOL_VERSIONS`.

Named scalars derived from these tuples are now exported alongside them (`LATEST_HANDSHAKE_VERSION`, `LATEST_MODERN_VERSION`, `OLDEST_SUPPORTED_VERSION`), so prefer those over indexing the tuples directly.

All of these live in `mcp_types.version` (previously `mcp.shared.version`): `from mcp_types.version import HANDSHAKE_PROTOCOL_VERSIONS`.

### `ProgressContext` and `progress()` context manager removed

The `mcp.shared.progress` module (`ProgressContext`, `Progress`, and the `progress()` context manager) has been removed. This module had no real-world adoption: all users send progress notifications via `Context.report_progress()` or `session.send_progress_notification()` directly.

**Before (v1):**

```python
from mcp.shared.progress import progress

with progress(ctx, total=100) as p:
    await p.progress(25)
```

**After (recommended), using `Context.report_progress()`:**

```python
@mcp.tool()
async def my_tool(x: int, ctx: Context) -> str:
    await ctx.report_progress(25, 100)
    return "done"
```

**After (low-level), using `session.send_progress_notification()`:**

```python
await session.send_progress_notification(
    progress_token=progress_token,
    progress=25,
    total=100,
)
```

### Handler progress reporting: prefer `ctx.report_progress()` over manual `progress_token`

Reading `ctx.meta["progress_token"]` and calling `session.send_progress_notification(token, ...)` is specific to the JSON-RPC transport path.
On the in-process modern path (`DirectDispatcher` / `Client(server)`), there is no wire token in `_meta`, so handlers that gate progress on the token's presence go silent. `ctx.report_progress(progress, total, message)` works on every dispatcher: it sends a progress notification when a token is present and routes the update through the dispatcher's progress channel otherwise, no-opping only when the caller did not request progress at all. `session.send_progress_notification(progress_token, ...)` is unchanged and still works on JSON-RPC transports for code that already holds a token.

### `create_connected_server_and_client_session` removed

The `create_connected_server_and_client_session` helper in `mcp.shared.memory` has been removed. Use `mcp.client.Client` instead: it accepts a `Server` or `MCPServer` instance directly and handles the in-memory transport and session setup for you.

**Before (v1):**

```python
from mcp.shared.memory import create_connected_server_and_client_session

async with create_connected_server_and_client_session(server) as session:
    result = await session.call_tool("my_tool", {"x": 1})
```

**After (v2):**

```python
from mcp.client import Client

async with Client(server) as client:
    result = await client.call_tool("my_tool", {"x": 1})
```

`Client` accepts the same callback parameters the old helper did (`sampling_callback`, `list_roots_callback`, `logging_callback`, `message_handler`, `elicitation_callback`, `client_info`) plus `raise_exceptions` to surface server-side errors and `mode` to control version negotiation (`'auto'` by default; `'legacy'` reproduces v1's initialize-only handshake).

If you need direct access to the underlying `ClientSession` and memory streams (e.g., for low-level transport testing), `create_client_server_memory_streams` is still available in `mcp.shared.memory`:

```python
import anyio

from mcp.client.session import ClientSession
from mcp.shared.memory import create_client_server_memory_streams

async with create_client_server_memory_streams() as (client_streams, server_streams):
    async with anyio.create_task_group() as tg:
        tg.start_soon(lambda: server.run(*server_streams, server.create_initialization_options()))
        async with ClientSession(*client_streams) as session:
            await session.initialize()
            ...
        tg.cancel_scope.cancel()
```

### Resource URI type changed from `AnyUrl` to `str`

The `uri` field on resource-related types now uses `str` instead of Pydantic's `AnyUrl`. This aligns with the [MCP specification schema](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/schema/draft/schema.ts) which defines URIs as plain strings (`uri: string`) without strict URL validation. This change allows relative paths like `users/me` that were previously rejected.

**Before (v1):**

```python
from pydantic import AnyUrl
from mcp.types import Resource

# Required wrapping in AnyUrl
resource = Resource(name="test", uri=AnyUrl("users/me"))  # Would fail validation
```

**After (v2):**

```python
from mcp_types import Resource

# Plain strings accepted
resource = Resource(name="test", uri="users/me")  # Works
resource = Resource(name="test", uri="custom://scheme")  # Works
resource = Resource(name="test", uri="https://example.com")  # Works
```

If your code passes `AnyUrl` objects to URI fields, convert them to strings:

```python
# If you have an AnyUrl from elsewhere
uri = str(my_any_url)  # Convert to string
```

Affected types:

- `Resource.uri`
- `ReadResourceRequestParams.uri`
- `ResourceContents.uri` (and subclasses `TextResourceContents`, `BlobResourceContents`)
- `SubscribeRequestParams.uri`
- `UnsubscribeRequestParams.uri`
- `ResourceUpdatedNotificationParams.uri`

The `Client` and `ClientSession` methods `read_resource()`, `subscribe_resource()`, and `unsubscribe_resource()` now only accept `str` for the `uri` parameter. If you were passing `AnyUrl` objects, convert them to strings:

```python
# Before (v1)
from pydantic import AnyUrl

await client.read_resource(AnyUrl("test://resource"))

# After (v2)
await client.read_resource("test://resource")
# Or if you have an AnyUrl from elsewhere:
await client.read_resource(str(my_any_url))
```

### Lowlevel `Server`: constructor parameters are now keyword-only

All parameters after `name` are now keyword-only. If you were passing `version` or other parameters positionally, use keyword arguments instead:

```python
# Before (v1)
server = Server("my-server", "1.0")

# After (v2)
server = Server("my-server", version="1.0")
```

### Lowlevel `Server`: type parameter reduced from 2 to 1

The `Server` class previously had two type parameters: `Server[LifespanResultT, RequestT]`. The `RequestT` parameter has been removed; handlers now receive typed params directly rather than a generic request type.

```python
# Before (v1)
from typing import Any

from mcp.server.lowlevel.server import Server

server: Server[dict[str, Any], Any] = Server(...)

# After (v2)
from typing import Any

from mcp.server import Server

server: Server[dict[str, Any]] = Server(...)
```

### Lowlevel `Server`: `request_handlers` and `notification_handlers` attributes removed

The public `server.request_handlers` and `server.notification_handlers` dictionaries have been removed. Handler registration is now done exclusively through constructor `on_*` keyword arguments. There is no public API to register handlers after construction.

```python
# Before (v1): direct dict access
from mcp.types import ListToolsRequest

if ListToolsRequest in server.request_handlers:
    ...

# After (v2): no public access to handler dicts
# Use the on_* constructor params to register handlers
server = Server("my-server", on_list_tools=handle_list_tools)
```

If you need to check whether a handler is registered, track this yourself; there is currently no public introspection API.

### Lowlevel `Server`: `add_request_handler` is now public and takes `params_type`

The private `_add_request_handler(method, handler)` escape hatch is now the public `add_request_handler(method, params_type, handler)`, alongside a matching `add_notification_handler`. Each takes a `params_type` model that incoming params are validated against before the handler runs. A message with no `params` member validates `{}` against the model, so handlers never receive `None`: all-optional models arrive with their defaults, and models with required fields reject the message as `INVALID_PARAMS` before the handler runs (matching the Go SDK).

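
The missing-`params` rule can be illustrated without the SDK. In this sketch, standard-library dataclasses stand in for the SDK's params models, and `PingParams`, `EchoParams`, and `validate_params` are hypothetical names; only the JSON-RPC `-32602` invalid-params code is standard.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

INVALID_PARAMS = -32602  # standard JSON-RPC error code


@dataclass
class PingParams:
    """Hypothetical all-optional params model."""

    note: str = "none"


@dataclass
class EchoParams:
    """Hypothetical params model with a required field."""

    text: str


def validate_params(params_type, raw: Optional[Mapping[str, Any]]):
    """Return (params, None) on success or (None, error_code) on failure."""
    try:
        # A message without a params member is validated as an empty mapping.
        return params_type(**(raw if raw is not None else {})), None
    except TypeError:
        # Missing required fields (or unknown keys) fail before any handler runs.
        return None, INVALID_PARAMS
```

With this shape, a handler for `PingParams` always receives an instance populated with defaults, while a bare message for `EchoParams` is rejected before the handler is reached.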
```python # Before (v1 / earlier v2 prereleases) server._add_request_handler("custom/method", my_handler) # After (v2) server.add_request_handler("custom/method", MyParams, my_handler) server.add_notification_handler("notifications/custom", MyNotifyParams, my_notify_handler) ``` ### Lowlevel `Server`: private `_handle_*` dispatch methods removed `Server._handle_message`, `_handle_request`, and `_handle_notification` have been removed. The receive loop and per-message dispatch now live in `JSONRPCDispatcher` and `ServerRunner`, which `Server.run()` drives internally. These were private, but some users subclassed `Server` and overrode them to intercept requests. Use middleware instead: ```python from typing import Any from mcp.server import Server, ServerRequestContext from mcp.server.context import CallNext, HandlerResult async def logging_middleware(ctx: ServerRequestContext[Any, Any], call_next: CallNext) -> HandlerResult: print(f"handling {ctx.method}") result = await call_next(ctx) print(f"done {ctx.method}") return result server = Server("my-server", on_call_tool=...) server.middleware.append(logging_middleware) ``` The method and the raw inbound params are `ctx.method` and `ctx.params` (`params` is `None` when the message carries none). Middleware runs before params validation and also wraps unknown methods. To rewrite the method or params before the handler runs, pass an adjusted context through: `await call_next(replace(ctx, params=...))`. ### Lowlevel `Server.run(raise_exceptions=True)`: transport errors no longer re-raised `raise_exceptions=True` now only governs handler exceptions: an exception raised by an `on_*` handler propagates out of `run()`. The JSON-RPC error response is still written to the client first, regardless of the flag. Previously it also re-raised exceptions yielded by the transport onto the read stream (e.g. JSON parse errors). Those are now debug-logged and dropped regardless of `raise_exceptions`. 
If you relied on `run()` exiting on a transport-level parse error, that no longer happens.

### Lowlevel `Server`: decorator-based handlers replaced with constructor `on_*` params

The lowlevel `Server` class no longer uses decorator methods for handler registration. Instead, handlers are passed as `on_*` keyword arguments to the constructor.

**Before (v1):**

```python
import mcp.types as types
from mcp.server.lowlevel.server import Server

server = Server("my-server")


@server.list_tools()
async def handle_list_tools():
    return [types.Tool(name="my_tool", description="A tool", inputSchema={})]


@server.call_tool()
async def handle_call_tool(name: str, arguments: dict):
    return [types.TextContent(type="text", text=f"Called {name}")]
```

**After (v2):**

```python
from mcp.server import Server, ServerRequestContext
from mcp_types import (
    CallToolRequestParams,
    CallToolResult,
    ListToolsResult,
    PaginatedRequestParams,
    TextContent,
    Tool,
)


async def handle_list_tools(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult:
    return ListToolsResult(tools=[Tool(name="my_tool", description="A tool", input_schema={"type": "object"})])


async def handle_call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult:
    return CallToolResult(
        content=[TextContent(type="text", text=f"Called {params.name}")],
        is_error=False,
    )


server = Server("my-server", on_list_tools=handle_list_tools, on_call_tool=handle_call_tool)
```

**Key differences:**

- Handlers receive `(ctx, params)` instead of the full request object or unpacked arguments. `ctx` is a `ServerRequestContext` with `session` and `lifespan_context` fields (plus `request_id`, `meta`, etc. for request handlers). `params` is the typed request params object.
- Handlers return the full result type (e.g. `ListToolsResult`) rather than unwrapped values (e.g. `list[Tool]`).
- The automatic `jsonschema` input/output validation that the old `call_tool()` decorator performed has been removed.
  There is no built-in replacement: if you relied on schema validation in the lowlevel server, you will need to validate inputs yourself in your handler.

**Complete handler reference:**

All handlers receive `ctx: ServerRequestContext` as the first argument. The second argument and return type are:

| v1 decorator | v2 constructor kwarg | `params` type | return type |
|---|---|---|---|
| `@server.list_tools()` | `on_list_tools` | `PaginatedRequestParams \| None` | `ListToolsResult` |
| `@server.call_tool()` | `on_call_tool` | `CallToolRequestParams` | `CallToolResult` |
| `@server.list_resources()` | `on_list_resources` | `PaginatedRequestParams \| None` | `ListResourcesResult` |
| `@server.list_resource_templates()` | `on_list_resource_templates` | `PaginatedRequestParams \| None` | `ListResourceTemplatesResult` |
| `@server.read_resource()` | `on_read_resource` | `ReadResourceRequestParams` | `ReadResourceResult` |
| `@server.subscribe_resource()` | `on_subscribe_resource` | `SubscribeRequestParams` | `EmptyResult` |
| `@server.unsubscribe_resource()` | `on_unsubscribe_resource` | `UnsubscribeRequestParams` | `EmptyResult` |
| `@server.list_prompts()` | `on_list_prompts` | `PaginatedRequestParams \| None` | `ListPromptsResult` |
| `@server.get_prompt()` | `on_get_prompt` | `GetPromptRequestParams` | `GetPromptResult` |
| `@server.completion()` | `on_completion` | `CompleteRequestParams` | `CompleteResult` |
| `@server.set_logging_level()` | `on_set_logging_level` | `SetLevelRequestParams` | `EmptyResult` |
| (none) | `on_ping` | `RequestParams \| None` | `EmptyResult` |
| `@server.progress_notification()` | `on_progress` | `ProgressNotificationParams` | `None` |
| (none) | `on_roots_list_changed` | `NotificationParams \| None` | `None` |

All `params` and return types are importable from `mcp_types`.
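The key differences above note that the lowlevel server no longer validates tool arguments against the input schema, so that check moves into your handler. The following is a minimal, hand-rolled sketch of such a check using only the standard library. It is not a JSON Schema validator, and `ADD_SCHEMA` and `check_arguments` are hypothetical names invented for illustration, not SDK APIs.

```python
from typing import Any, Optional

# Hypothetical schema for an "add" tool (illustration only). In a real server
# this would be the same dict you advertise as the tool's input schema.
ADD_SCHEMA: dict = {
    "type": "object",
    "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
    "required": ["a", "b"],
}

# Only a handful of JSON Schema type names are covered by this sketch.
_JSON_TYPES = {"integer": int, "string": str, "boolean": bool, "object": dict, "array": list}


def check_arguments(arguments: Optional[dict], schema: dict) -> list:
    """Return a list of problems; an empty list means this basic check passed."""
    args: dict = arguments or {}  # arguments may be None, as noted for v2 handlers
    problems = []
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    for key, value in args.items():
        spec: Any = schema.get("properties", {}).get(key)
        if spec is None:
            continue  # unknown keys are tolerated in this sketch
        expected = _JSON_TYPES.get(spec.get("type"))
        if expected is None:
            continue  # type names this sketch does not know are skipped
        # bool is a subclass of int in Python, so reject it explicitly for "integer".
        is_bool_as_int = expected is int and isinstance(value, bool)
        if is_bool_as_int or not isinstance(value, expected):
            problems.append(f"argument {key!r} should be of type {spec['type']}")
    return problems
```

In an `on_call_tool` handler you could call something like `check_arguments(params.arguments, ADD_SCHEMA)` first and return a `CallToolResult` with `is_error=True` when the list is non-empty. For full JSON Schema semantics, a dedicated validation library is the better choice.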
**Notification handlers:**

```python
from mcp.server import Server, ServerRequestContext
from mcp_types import ProgressNotificationParams


async def handle_progress(ctx: ServerRequestContext, params: ProgressNotificationParams) -> None:
    print(f"Progress: {params.progress}/{params.total}")


server = Server("my-server", on_progress=handle_progress)
```

### Lowlevel `Server`: automatic return value wrapping removed

The old decorator-based handlers performed significant automatic wrapping of return values. This magic has been removed: handlers now return fully constructed result types. If you want these conveniences, use `MCPServer` (previously `FastMCP`) instead of the lowlevel `Server`.

**`call_tool()`: structured output wrapping removed**

The old decorator accepted several return types and auto-wrapped them into `CallToolResult`:

```python
# Before (v1): returning a dict auto-wrapped into structured_content + JSON TextContent
@server.call_tool()
async def handle(name: str, arguments: dict) -> dict:
    return {"temperature": 22.5, "city": "London"}


# Before (v1): returning a list auto-wrapped into CallToolResult.content
@server.call_tool()
async def handle(name: str, arguments: dict) -> list[TextContent]:
    return [TextContent(type="text", text="Done")]
```

```python
# After (v2): construct the full result yourself
import json

from mcp.server import ServerRequestContext
from mcp_types import CallToolRequestParams, CallToolResult, TextContent


async def handle(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult:
    data = {"temperature": 22.5, "city": "London"}
    return CallToolResult(
        content=[TextContent(type="text", text=json.dumps(data, indent=2))],
        structured_content=data,
    )
```

Note: `params.arguments` can be `None` (the old decorator defaulted it to `{}`). Use `params.arguments or {}` to preserve the old behavior.
**`read_resource()`: content type wrapping removed**

The old decorator auto-wrapped `Iterable[ReadResourceContents]` (and the deprecated `str`/`bytes` shorthand) into `TextResourceContents`/`BlobResourceContents`, handling base64 encoding and mime-type defaulting:

```python
# Before (v1): Iterable[ReadResourceContents] auto-wrapped
from collections.abc import Iterable

from pydantic import AnyUrl

from mcp.server.lowlevel.helper_types import ReadResourceContents


@server.read_resource()
async def handle(uri: AnyUrl) -> Iterable[ReadResourceContents]:
    return [ReadResourceContents(content="file contents", mime_type="text/plain")]


# Before (v1): str/bytes shorthand (already deprecated in v1)
@server.read_resource()
async def handle(uri: str) -> str:
    return "file contents"


@server.read_resource()
async def handle(uri: str) -> bytes:
    return b"\x89PNG..."
```

```python
# After (v2): construct TextResourceContents or BlobResourceContents yourself
import base64

from mcp.server import ServerRequestContext
from mcp_types import (
    BlobResourceContents,
    ReadResourceRequestParams,
    ReadResourceResult,
    TextResourceContents,
)


async def handle_read_text(ctx: ServerRequestContext, params: ReadResourceRequestParams) -> ReadResourceResult:
    # Text content
    return ReadResourceResult(
        contents=[TextResourceContents(uri=str(params.uri), text="file contents", mime_type="text/plain")]
    )


async def handle_read_binary(ctx: ServerRequestContext, params: ReadResourceRequestParams) -> ReadResourceResult:
    # Binary content: you must base64-encode it yourself
    return ReadResourceResult(
        contents=[BlobResourceContents(
            uri=str(params.uri),
            blob=base64.b64encode(b"\x89PNG...").decode("utf-8"),
            mime_type="image/png",
        )]
    )
```

**`list_tools()`, `list_resources()`, `list_prompts()`: list wrapping removed**

The old decorators accepted bare lists and wrapped them into the result type:

```python
# Before (v1)
@server.list_tools()
async def handle() -> list[Tool]:
    return [Tool(name="my_tool", ...)]


# After (v2)
async def handle(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult:
    return ListToolsResult(tools=[Tool(name="my_tool", ...)])
```

**Using `MCPServer` instead:**

If you prefer the convenience of
automatic wrapping, use `MCPServer`, which still provides these features through its `@mcp.tool()`, `@mcp.resource()`, and `@mcp.prompt()` decorators. The lowlevel `Server` is intentionally minimal: it provides no magic and gives you full control over the MCP protocol types.

### Lowlevel `Server`: `request_context` property removed

The `server.request_context` property has been removed. Request context is now passed directly to handlers as the first argument (`ctx`). The `request_ctx` module-level contextvar has been removed entirely.

**Before (v1):**

```python
import mcp.types as types
from mcp.server.lowlevel.server import request_ctx


@server.call_tool()
async def handle_call_tool(name: str, arguments: dict):
    ctx = server.request_context  # or request_ctx.get()
    await ctx.session.send_log_message(level="info", data="Processing...")
    return [types.TextContent(type="text", text="Done")]
```

**After (v2):**

```python
from mcp.server import ServerRequestContext
from mcp_types import CallToolRequestParams, CallToolResult, TextContent


async def handle_call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult:
    await ctx.session.send_log_message(level="info", data="Processing...")
    return CallToolResult(
        content=[TextContent(type="text", text="Done")],
        is_error=False,
    )
```

### `ServerRequestContext`: request-specific fields are now optional

`ServerRequestContext` now uses optional fields for request-specific data (`request_id`, `meta`, etc.) so it can be used for both request and notification handlers. In notification handlers, these fields are `None`.

```python
from mcp.server import ServerRequestContext

# request_id, meta, etc. are available in request handlers
# but None in notification handlers
```

### `ServerSession` is now a thin proxy (no longer a `BaseSession`)

`ServerSession` no longer subclasses `BaseSession`.
It is now a small connection-scoped proxy that exposes `send_request`, `send_notification`, the typed convenience helpers (`create_message`, `elicit_form`, `send_log_message`, `send_tool_list_changed`, ...), `client_params`, `protocol_version`, and `check_client_capability`. The receive loop, `initialize` handling, and per-request task isolation that previously lived in `ServerSession` have moved to `JSONRPCDispatcher` and `ServerRunner`. `ServerSession` is normally constructed for you by `Server.run()` and reached via `ctx.session` in handlers, so most servers are unaffected. If you were constructing or subclassing it directly: **Constructor change:** ```python # Before (v1) session = ServerSession(read_stream, write_stream, init_options, stateless=False) # After (v2) session = ServerSession(request_outbound, connection) # where `request_outbound` is an Outbound and `connection` is a Connection ``` In practice, replace direct `ServerSession` use with `Server.run(read_stream, write_stream, init_options)` and let the framework wire it up. **Removed from `mcp.server.session`:** - `InitializationState` enum and `ServerSession._initialization_state` — initialization tracking is now on `Connection` (`connection.initialized` is an `anyio.Event`, `connection.client_params` holds the init params). - `ServerRequestResponder` type alias. - `ServerSession.incoming_messages` stream — there is no longer a public stream of inbound messages to iterate. Register handlers via the `on_*` constructor params (or `add_request_handler`) and use `Server.middleware` to observe every inbound request and notification (`initialize`, unknown methods, validation failures, and `notifications/initialized` included). - `ServerSession.__aenter__` / `__aexit__` — `ServerSession` is no longer an async context manager. - The private `_receive_loop`, `_received_request`, `_received_notification`, and `_handle_incoming` overrides — there is nothing to override on `ServerSession` anymore. 
To intercept inbound messages, use `Server.middleware` (see the `_handle_*` removal section above). ### `BaseSession` / `RequestResponder`: server-side cancellation tracking removed `BaseSession._in_flight` and the `RequestResponder` members that supported it (`cancel()`, the `cancelled` and `in_flight` properties, the `on_complete` constructor argument, and the internal `CancelScope`) have been removed. These existed to let `ServerSession` cancel a handler when a `CancelledNotification` arrived; `ServerSession` no longer drives a receive loop, so they were dead code. Inbound-cancellation handling for the server now lives in `JSONRPCDispatcher`. `BaseSession` itself has since been removed entirely; see the next section. ### `ClientSession` now runs on `JSONRPCDispatcher`; `BaseSession` removed `ClientSession`'s public surface is unchanged — same constructor, typed methods, manual `initialize()`, and async context-manager lifecycle — but `BaseSession`, the v1 receive loop underneath it, is removed with no shim. The engine now lives in `JSONRPCDispatcher` (`mcp.shared.jsonrpc_dispatcher`). To customize client behavior, use the `ClientSession` constructor callbacks, or pass a pre-built dispatcher via the new keyword-only `dispatcher=` constructor argument (e.g. a `DirectDispatcher` for in-process embedding). Behavior changes: - **Callbacks and notifications now run concurrently.** In v1 the receive loop processed one inbound message at a time, so callbacks ran inline and in order. Now each delivery starts in arrival order but runs as its own task. Server-initiated request callbacks (`sampling`, `elicitation`, `roots`) no longer block other traffic, may themselves send requests without deadlocking, and are interrupted if the server sends `notifications/cancelled` (the request is then answered with an error). 
Notification callbacks (`logging_callback`, `progress_callback`, `message_handler`) may interleave, and a `progress_callback` may run after the request it reports on has returned; there is no built-in bound on concurrent deliveries. Transport-level errors reach `message_handler` the same way, and a `message_handler` that raises is logged rather than fatal to the session. Callbacks that need strict sequencing must coordinate themselves. - **Timeouts**: a timed-out or abandoned request is now followed by `notifications/cancelled`, so the server stops the handler instead of leaving it running. - **A raising request callback** is answered with `code=0` and the exception text; v1 flattened every callback exception to `INVALID_PARAMS`. For a specific error response, return `ErrorData` (unchanged) or raise `MCPError`. One carve-out: pydantic's `ValidationError` is still answered with `INVALID_PARAMS`, as in v1. - **`send_request` before entering the context manager** raises `RuntimeError` immediately; v1 wrote to the transport and hung until the timeout. After the connection has closed it raises `MCPError` (`CONNECTION_CLOSED`) instead. `send_notification` before entry still works. - **`send_notification` no longer takes `related_request_id`, and `send_request` no longer accepts `ServerMessageMetadata`.** No client transport ever serialized these hints; progress and response correlation via `progressToken` and the request id is unaffected. - **Client callbacks now receive `mcp.client.ClientRequestContext`** (its `request_id` is always populated); the private `mcp.shared._context.RequestContext` generic is deleted. Annotations spelled `RequestContext[ClientSession]` become `ClientRequestContext`. `mcp.shared.session` is now a compatibility module: `ProgressFnT` is re-exported (its home is `mcp.shared.dispatcher`), and `RequestResponder` remains as a typing-only stub so `MessageHandlerFnT` annotations keep importing. `RequestResponder.respond()` no longer exists. 
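The behavior changes above say that callbacks needing strict sequencing must coordinate themselves. One possible approach, sketched here with plain `asyncio` and no SDK calls, is to have the callback only enqueue its argument and let a single worker task process the queue. `OrderedCallback` and `slow_callback` are hypothetical names for illustration, and the sketch assumes the callback is invoked in arrival order (as described above) and takes a single argument.

```python
import asyncio


class OrderedCallback:
    """Wrap an async callback so invocations are processed strictly one at a time."""

    def __init__(self, inner):
        self._inner = inner
        self._queue = asyncio.Queue()
        self._worker = None

    async def __call__(self, item) -> None:
        # Enqueue before the first await, so the order in which deliveries
        # start is the order in which they are processed.
        self._queue.put_nowait(item)
        if self._worker is None:
            self._worker = asyncio.create_task(self._drain())

    async def _drain(self) -> None:
        # Single consumer: the next item starts only after the previous finished.
        while True:
            item = await self._queue.get()
            try:
                await self._inner(item)
            finally:
                self._queue.task_done()

    async def aclose(self) -> None:
        # Wait for queued items to finish, then stop the worker.
        await self._queue.join()
        if self._worker is not None:
            self._worker.cancel()


async def demo() -> list:
    seen = []

    async def slow_callback(n: int) -> None:
        # Earlier items sleep longer, so unordered tasks would finish in reverse.
        await asyncio.sleep(0.01 * (3 - n))
        seen.append(n)

    ordered = OrderedCallback(slow_callback)
    # Simulate concurrent deliveries: each call runs as its own task.
    await asyncio.gather(*(ordered(n) for n in range(3)))
    await ordered.aclose()
    return seen
```

The trade-off is that one slow invocation delays every later one, which is the v1 behavior this pattern deliberately restores. Since the queue is unbounded, a callback that cannot keep up will accumulate a backlog.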
### Experimental Tasks support removed Tasks ([SEP-1686](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1686)) have been removed from the MCP specification and are no longer part of this SDK. The `mcp.client.experimental`, `mcp.server.experimental`, `mcp.shared.experimental`, and `mcp.server.lowlevel.experimental` modules have been removed, along with the `experimental` properties on `ClientSession`, `ServerSession`, `Server`, and `ServerRequestContext`. The corresponding `Task*` types remain in `mcp_types` as types-only definitions. Tasks are expected to return as a separate MCP extension in a future release. ## Deprecations ### Roots, Sampling, and Logging methods deprecated (SEP-2577) [SEP-2577](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2577) deprecates the Roots, Sampling, and Logging features as of the 2026-07-28 spec. The deprecation is advisory only: there are no wire-level changes, capability negotiation is unchanged, and every method keeps working for sessions negotiating 2025-11-25 and earlier. The user-facing methods for these features now carry `typing_extensions.deprecated`, so type checkers, IDEs, and the runtime surface a deprecation warning where they are called: - Sampling: `ServerSession.create_message()`, `ClientPeer.sample()` - Roots: `ServerSession.list_roots()`, `ClientPeer.list_roots()`, `ClientSession.send_roots_list_changed()`, `Client.send_roots_list_changed()` - Logging: `ServerSession.send_log_message()`, `Connection.log()`, `ClientSession.set_logging_level()`, `Client.set_logging_level()`, `mcp.server.context.Context.log()` (the lowlevel `Context`), and the `MCPServer` `Context` helpers `log()`, `debug()`, `info()`, `warning()`, `error()` Registering a handler for a deprecated capability is deprecated too. 
The `Server.__init__` parameters `on_set_logging_level` (Logging) and `on_roots_list_changed` (Roots) are now split out into a `typing_extensions.deprecated` overload, so passing either is flagged by type checkers and emits `mcp.MCPDeprecationWarning` at construction time. `on_progress` follows the same pattern (see below). The non-deprecated overload omits these parameters, so the common case stays warning-free. The runtime warning is emitted as `mcp.MCPDeprecationWarning`, which subclasses `UserWarning` (not `DeprecationWarning`) so it is visible by default. To silence it, filter that category: ```python import warnings from mcp import MCPDeprecationWarning warnings.filterwarnings("ignore", category=MCPDeprecationWarning) ``` No migration is required during the deprecation window. New code should avoid building on these features, since they may be removed in a future spec version. ### Client-to-server progress deprecated (2026-07-28) The 2026-07-28 spec restricts `notifications/progress` to the server-to-client direction only — `ProgressNotification` is no longer in `ClientNotification`. `Client.send_progress_notification()` and `ClientSession.send_progress_notification()` now carry `typing_extensions.deprecated` and emit `mcp.MCPDeprecationWarning` at runtime. They continue to work against servers negotiating 2025-11-25 or earlier. On the server side, prefer the new dispatcher-agnostic `ServerSession.report_progress(progress, total, message)` (and `Context.report_progress()` on `MCPServer`) over the raw `ServerSession.send_progress_notification(progress_token, …)`. `report_progress` encapsulates the "no-op when the caller did not request progress" rule and works on every dispatcher; the raw token-taking form remains for handlers that read `_meta.progressToken` directly. 
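Because `MCPDeprecationWarning` is described above as a `UserWarning` subclass, the standard `warnings` filters apply to it like any other category. The sketch below uses a stand-in class, `StandInDeprecationWarning` (hypothetical; real code would import `MCPDeprecationWarning` from `mcp`), to show two scoped alternatives to a global filter: ignoring the category inside one block, and escalating it to an error, for example in a test suite that should stay off the deprecated methods.

```python
import warnings


class StandInDeprecationWarning(UserWarning):
    """Hypothetical stand-in for mcp.MCPDeprecationWarning (a UserWarning subclass)."""


def deprecated_call() -> str:
    # Stand-in for any SDK method that emits the deprecation warning.
    warnings.warn("this feature is deprecated", StandInDeprecationWarning, stacklevel=2)
    return "ok"


def call_quietly() -> str:
    # Suppress the category only inside this block; filters are restored on exit.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", StandInDeprecationWarning)
        return deprecated_call()


def call_strictly() -> str:
    # Escalate the category to an exception so deprecated use fails loudly.
    with warnings.catch_warnings():
        warnings.simplefilter("error", StandInDeprecationWarning)
        return deprecated_call()
```

Scoping the filter with `warnings.catch_warnings()` keeps the warning visible everywhere else, which may be preferable to a process-wide `filterwarnings("ignore", ...)` while you are still migrating.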
## Bug Fixes ### OAuth metadata URLs no longer gain a trailing slash `OAuthMetadata`, `ProtectedResourceMetadata`, and `OAuthClientMetadata` now set `url_preserve_empty_path=True` (Pydantic 2.12+). A path-less URL parsed from the wire keeps its empty path instead of acquiring a trailing slash, so e.g. an `issuer` of `https://as.example.com` round-trips as `https://as.example.com` rather than `https://as.example.com/`. This matters for [RFC 9207](https://datatracker.ietf.org/doc/html/rfc9207) / [RFC 8414](https://datatracker.ietf.org/doc/html/rfc8414) issuer comparisons, which require simple string comparison ([RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986) §6.2.1). URLs constructed in Python from an already-built `AnyHttpUrl` object are unaffected (they were normalized at construction); only values parsed from strings/JSON change. This also changes the wire form of `OAuthClientMetadata.redirect_uris`: a path-less redirect URI passed as a string (e.g. `redirect_uris=['http://localhost:8080']`) now serializes as `http://localhost:8080` instead of `http://localhost:8080/`, and the client sends it verbatim in the `/authorize` and token-exchange requests. [RFC 6749](https://datatracker.ietf.org/doc/html/rfc6749) §3.1.2.3 requires authorization servers to match redirect URIs by exact string comparison, so if you registered such a URI with a previous SDK release (with the trailing slash) and the registration is persisted in `TokenStorage`, re-register the client so the stored value matches what the SDK now transmits. `AuthSettings` now sets `url_preserve_empty_path=True` for the same reason: a path-less `issuer_url` (or `resource_server_url`) passed as a string keeps its empty path, so the authorization server advertises `issuer` as `https://as.example.com` rather than `https://as.example.com/` in its metadata. 
Previously the trailing slash was added before the model saw the value, leaving the served issuer inconsistent with what clients compare against under RFC 8414 / RFC 9207. Passing an already-built `AnyHttpUrl` object still normalizes at construction; pass a string to get the preserved form. ### Lowlevel `Server`: `subscribe` capability now correctly reported Previously, the lowlevel `Server` hardcoded `subscribe=False` in resource capabilities even when a `subscribe_resource()` handler was registered. The `subscribe` capability is now dynamically set to `True` when an `on_subscribe_resource` handler is provided. Clients that previously didn't see `subscribe: true` in capabilities will now see it when a handler is registered, which may change client behavior. ### Unknown request methods now return `-32601` (Method not found) In v1, a request for a method the SDK didn't recognize failed request-union validation and was answered with `-32602` (`"Invalid request parameters"`, empty `data`). Any method the receiver doesn't serve — unrecognized, or a spec method with no registered handler — is now answered with the JSON-RPC-specified `-32601` (`"Method not found"`), with the method name in `data`, on both the server and the client side, in every initialization state. Update anything that matched on the old code for this case. ### Extra fields on MCP types are no longer preserved In v1, MCP protocol types were configured with `extra="allow"`: unknown fields passed to a constructor or received from a peer were kept on the model and re-serialized on output. In v2, MCP types silently ignore extra fields. 
Unknown constructor keyword arguments and unknown keys in wire data are dropped during validation — no error is raised, and the values do not round-trip: ```python from mcp_types import CallToolRequestParams params = CallToolRequestParams( name="my_tool", arguments={}, unknown_field="value", # silently ignored, not stored ) "unknown_field" in params.model_dump() # False # _meta remains the supported place for custom data, per the MCP spec params = CallToolRequestParams( name="my_tool", arguments={}, _meta={"my_custom_key": "value", "another": 123}, # OK, preserved ) ``` If you relied on extra fields round-tripping through MCP types, move that data into `_meta`. ## New Features ### OAuth client credentials are bound to their authorization server ([SEP-2352](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2352)) Persisted OAuth client credentials are now bound to the authorization server that issued them: `OAuthClientInformationFull` records an `issuer`, set by the SDK after registration. When a server's protected resource metadata later points at a different authorization server, the client discards the bound credentials (and the old tokens) and re-registers with the new server instead of presenting one server's `client_id` to another. URL-based client IDs (CIMD) are portable and unaffected; credentials with no recorded issuer (pre-registered, or stored before this change) are left as-is. No API change for existing `TokenStorage` implementations - the `issuer` round-trips through the unchanged `get_client_info`/`set_client_info`. ### Step-up authorization unions previously requested scopes ([SEP-2350](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2350)) When a `403 insufficient_scope` challenge triggers step-up re-authorization, the OAuth client now requests the union of the previously requested scopes and the newly challenged scopes, instead of replacing the scope with only the challenged ones. 
This keeps permissions granted for earlier operations from being dropped when a later operation escalates. No API change; the wider scope is sent automatically on the re-authorization request. ### OAuth Dynamic Client Registration sends `application_type` ([SEP-837](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/837)) `OAuthClientMetadata` now carries an `application_type` field that is sent during Dynamic Client Registration. It defaults to `"native"`, which suits MCP clients that use loopback redirect URIs (CLI and desktop apps); browser-based clients served from a non-local host should set it to `"web"`: ```python from mcp.shared.auth import OAuthClientMetadata client_metadata = OAuthClientMetadata( redirect_uris=["https://app.example.com/callback"], application_type="web", ) ``` Under OIDC, omitting `application_type` defaults to `"web"`, which an authorization server may reject for the `localhost` redirect URIs native clients use; sending `"native"` avoids that. Non-OIDC servers ignore the parameter. ### Identity Assertion Authorization Grant for enterprise IdP flows (SEP-990) The SDK now supports [SEP-990](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/990)'s enterprise identity-provider policy controls. The client presents an Identity Assertion Authorization Grant (ID-JAG) - a signed JWT issued by the enterprise IdP - to the MCP authorization server using the [RFC 7523](https://datatracker.ietf.org/doc/html/rfc7523) jwt-bearer grant (`grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer`, the ID-JAG as `assertion`), and receives an MCP access token. This matches the SEP-990 normative profile and interoperates with the other MCP SDKs. (Leg 1 - exchanging the user's IdP ID token for the ID-JAG against the IdP - is deployment-specific and out of scope for the SDK.) This is additive and opt-in on both sides; existing flows are unchanged. 
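To make the jwt-bearer exchange described above more concrete, here is a rough sketch of the form-encoded token request body that RFC 7523 defines, built with the standard library only. It is not the SDK's implementation: `build_jwt_bearer_form` is a hypothetical helper, the sketch assumes the client authenticates by placing its credentials in the form body, and whether `scope` and `resource` are sent depends on the deployment.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode

# Grant type identifier defined by RFC 7523.
JWT_BEARER_GRANT = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def build_jwt_bearer_form(
    assertion: str,
    client_id: str,
    client_secret: str,
    scope: Optional[str] = None,
    resource: Optional[str] = None,
) -> str:
    """Build an application/x-www-form-urlencoded body for a jwt-bearer token request."""
    fields = {
        "grant_type": JWT_BEARER_GRANT,
        "assertion": assertion,  # the ID-JAG, a signed JWT issued by the IdP
        "client_id": client_id,  # assumption: credentials travel in the body
        "client_secret": client_secret,
    }
    if scope:
        fields["scope"] = scope
    if resource:
        fields["resource"] = resource
    return urlencode(fields)


if __name__ == "__main__":
    body = build_jwt_bearer_form("header.payload.signature", "enterprise-mcp-client", "example-secret")
    # Decode the body again to show which form fields it carries.
    print(sorted(parse_qs(body)))
```

The body would be posted to the authorization server's token endpoint with `Content-Type: application/x-www-form-urlencoded`; the response is an ordinary OAuth token response.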
On the client, `IdentityAssertionOAuthProvider` (in `mcp.client.auth.extensions.identity_assertion`) is an `httpx.Auth` that posts the jwt-bearer request. The ID-JAG is supplied lazily through an async `assertion_provider(audience, resource)` callback - `audience` is the authorization server's issuer (the ID-JAG `aud`) and `resource` is the MCP server's identifier (the ID-JAG `resource` claim): ```python from mcp.client.auth.extensions.identity_assertion import IdentityAssertionOAuthProvider async def fetch_id_jag(audience: str, resource: str) -> str: # The ID-JAG must carry `audience` as `aud` and `resource` as its `resource` claim. return await my_idp.issue_id_jag(audience=audience, resource=resource) provider = IdentityAssertionOAuthProvider( server_url="https://mcp.example.com/mcp", storage=my_token_storage, client_id="enterprise-mcp-client", client_secret="enterprise-mcp-secret", issuer="https://auth.example.com", assertion_provider=fetch_id_jag, ) ``` SEP-990 §5.1 requires the client to authenticate; this SDK currently requires a shared secret, so `client_secret` is mandatory (`token_endpoint_auth_method` chooses `client_secret_post` (default) or `client_secret_basic`; the spec also permits `private_key_jwt`). The authorization server is configuration, not discovery: `issuer` is the AS the client is provisioned for, authorization-server metadata is fetched from that issuer's [RFC 8414](https://datatracker.ietf.org/doc/html/rfc8414) well-known, and the resource server is never asked which AS to use - so a hostile resource server cannot redirect the ID-JAG or secret. On the authorization server, set `AuthSettings(identity_assertion_enabled=True)` (or pass `identity_assertion_enabled=True` to `create_auth_routes`) and implement `exchange_identity_assertion` on your `OAuthAuthorizationServerProvider`. 
The method receives an `IdentityAssertionParams` (the ID-JAG `assertion`, requested scopes, and request `resource`) and returns a plain [RFC 6749](https://datatracker.ietf.org/doc/html/rfc6749) `OAuthToken`. The flag gates both metadata advertisement and the token endpoint: when off, `/token` rejects the grant with `unsupported_grant_type` even if the provider implements the hook. When on, the metadata advertises the jwt-bearer grant and the `urn:ietf:params:oauth:grant-profile:id-jag` profile in `authorization_grant_profiles_supported` (the discovery mechanism per ext-auth §6). The implementation is responsible for validating the assertion per RFC 7523 §3 and SEP-990 §5.1 - verify the signature/`iss`/`exp`/`typ`, require `aud` to be this AS, require the ID-JAG's `client_id` claim to match the authenticated client, audience-restrict the issued token to the ID-JAG's `resource` claim (not the client-controlled request `resource`), and derive scopes from the ID-JAG rather than granting the request verbatim. See `examples/snippets/servers/identity_assertion_server.py`, which fails closed. Two hardening points are enforced by the SDK: the handler rejects clients without a stored secret before calling the hook (and `ClientAuthenticator` itself now refuses a secret-based auth method registered without a secret), and Dynamic Client Registration refuses the jwt-bearer grant so the ID-JAG flow requires a pre-registered confidential client. ### 2025-11-25 and 2026-07-28 protocol fields modeled `mcp_types` models the 2025-11-25 and 2026-07-28 protocol fields (e.g. `resultType`, `ttlMs`/`cacheScope` on cacheable results, `inputResponses`/`requestState` on retried requests), so inbound payloads carrying these keys parse into typed fields and round-trip. 
`ttlMs`/`cacheScope` default to `0`/`"private"` (immediately stale, not shared-cacheable); `resultType` defaults to `"complete"` on concrete results (`None` on `EmptyResult`); the server strips all of them from the wire at pre-2026 versions. Servers set per-method values with `cache_hints={method: CacheHint(...)}` on the `Server`/`MCPServer` constructor — see [Caching hints](https://py.sdk.modelcontextprotocol.io/v2/advanced/caching/index.md). ### `streamable_http_app()` available on lowlevel Server The `streamable_http_app()` method is now available directly on the lowlevel `Server` class, not just `MCPServer`. This allows using the streamable HTTP transport without the MCPServer wrapper. ```python from mcp.server import Server, ServerRequestContext from mcp_types import ListToolsResult, PaginatedRequestParams async def handle_list_tools(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult: return ListToolsResult(tools=[...]) server = Server("my-server", on_list_tools=handle_list_tools) app = server.streamable_http_app( streamable_http_path="/mcp", json_response=False, stateless_http=False, ) ``` The lowlevel `Server` also now exposes a `session_manager` property to access the `StreamableHTTPSessionManager` after calling `streamable_http_app()`. ### `ElicitationResult` is now a subscriptable generic alias `ElicitationResult` is now a `TypeAliasType` instead of a plain union, so `ElicitationResult[Confirm]` works as an annotation (resolver dependency injection consumes it that way - see [Dependencies](https://py.sdk.modelcontextprotocol.io/v2/tutorial/dependencies/index.md)). The members are unchanged: `AcceptedElicitation[T] | DeclinedElicitation | CancelledElicitation`. The one behavioral change: a runtime `isinstance(result, ElicitationResult)` now raises `TypeError`. Check against the member classes directly instead: ```python result = await ctx.elicit("Proceed?", Confirm) if isinstance(result, AcceptedElicitation): ... 
# result.data is a Confirm ``` Narrowing on `result.action` (`"accept"` / `"decline"` / `"cancel"`) is unaffected. ## Need Help? If you encounter issues during migration: 1. Check the [API Reference](https://py.sdk.modelcontextprotocol.io/v2/api/mcp/) for updated method signatures 2. Review the [examples](https://github.com/modelcontextprotocol/python-sdk/tree/main/examples) for updated usage patterns 3. Open an issue on [GitHub](https://github.com/modelcontextprotocol/python-sdk/issues) if you find a bug or need further assistance # Tutorial - User Guide Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/ This tutorial shows you how to use the MCP Python SDK, step by step. Each section gradually builds on the previous ones, but it's written so you can go straight to any specific section to solve a specific problem. It also works as a future reference: you can come back to exactly the part you need. ## Run the code All the code blocks can be copied and used directly: they are complete, working files. To follow along, paste a block into a `server.py` and open it in the MCP Inspector: ```console uv run mcp dev server.py ``` It is **HIGHLY encouraged** that you write (or copy) the code, edit it, and run it locally. Using it in your own editor is what really shows you the point: how little you write, the autocompletion, the type checks catching mistakes before you run anything. ## You will not be guessing Every example in this tutorial is a complete file under [`docs_src/`](https://github.com/modelcontextprotocol/python-sdk/tree/main/docs_src) in the SDK's own repository, and every one of them is exercised by the SDK's test suite through an **in-memory client**: ```python import pytest from mcp import Client from server import mcp @pytest.mark.anyio async def test_add() -> None: async with Client(mcp) as client: result = await client.call_tool("add", {"a": 1, "b": 2}) assert result.structured_content == {"result": 3} ``` No subprocess, no port, no transport. 
`Client(mcp)` connects to the server object directly. If a change to the SDK breaks an example on one of these pages, CI goes red before the page does. The code you read here is the code that runs. You'll use this yourself in the [Testing](https://py.sdk.modelcontextprotocol.io/v2/tutorial/testing/index.md) chapter; it's how you test your own servers, too. ## Install the SDK If you haven't yet, [install the SDK](https://py.sdk.modelcontextprotocol.io/v2/installation/index.md) first. ## Advanced User Guide There is also an **Advanced User Guide** you can read after this one. It builds on this tutorial, uses the same concepts, and teaches you the extra things: the low-level `Server`, middleware, authorization, the 2026-07-28 protocol negotiation. But you should read this first: everything in the Advanced guide assumes you know the basics. # First steps Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/first-steps/ On the landing page you wrote a server, ran it, and called a tool. Now do it again, slowly, with all three things a server can expose, and the names for everything you just saw. ## Host, client, and server Three words you'll see on every page from here on: * A **host** is the LLM application: Claude, an IDE, an agent runtime. It's the thing the user is talking to. * A **client** lives inside the host and speaks MCP. The host runs one client per server it's connected to. * A **server** is what you build with this SDK. It exposes things to clients. It never talks to the model directly. You write the server. Hosts are someone else's product. The SDK also gives you a `Client`. You'll use it to test your servers, and it shows up later in this chapter. ## The three primitives A server exposes exactly three kinds of thing. 
What separates them is **who decides to use them**: | Primitive | Controlled by | What it is | Example | |---------------|-----------------|-----------------------------------------------------|------------------------------------| | **Tools** | The model | A function the model calls to take an action | An API call, a database write | | **Resources** | The application | Data the host loads into the model's context | A file's contents, an API response | | **Prompts** | The user | A reusable message template the user invokes by name | A slash command, a menu entry | "Controlled by" is the whole point of the split. A tool runs because the **model** decided to call it. A resource is attached because the **application** decided the model needed it. A prompt runs because the **user** picked it. !!! info If you've built a web API you already have most of the intuition: a **resource** is a `GET` (it loads data and changes nothing) and a **tool** is a `POST` (it does work and may have side effects). A **prompt** has no HTTP analogue; it's closer to a saved query the user runs by name. ## One server, all three ```python title="server.py" hl_lines="6 12 18" # docs_src/first_steps/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Demo") @mcp.tool() def add(a: int, b: int) -> int: """Add two numbers.""" return a + b @mcp.resource("greeting://{name}") def greeting(name: str) -> str: """Greet someone by name.""" return f"Hello, {name}!" @mcp.prompt() def summarize(text: str) -> str: """Summarize a piece of text in one sentence.""" return f"Summarize the following text in one sentence:\n\n{text}" ``` Three plain functions, three decorators. Each decorator is the entire registration: * `@mcp.tool()` makes `add` a **tool**. * `@mcp.resource("greeting://{name}")` makes `greeting` a **resource template**: the `{name}` in the URI is the function's parameter. * `@mcp.prompt()` makes `summarize` a **prompt**. The string it returns becomes a user message. 
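
To see how a decorator can be "the entire registration", here is a minimal, stdlib-only sketch of the general pattern. `TinyRegistry` and its `tools` attribute are hypothetical names made up for this illustration; this is not how `MCPServer` is implemented, only the decorator-factory shape that `@mcp.tool()` resembles from the outside.

```python
from collections.abc import Callable
from typing import Any


class TinyRegistry:
    """Hypothetical stand-in for a server object; not part of the MCP SDK."""

    def __init__(self) -> None:
        self.tools: dict[str, Callable[..., Any]] = {}

    def tool(self) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        # Calling tool() returns the real decorator, which is why the
        # parentheses in `@registry.tool()` are needed.
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            # Registration is a side effect of decoration: the function is
            # stored under its own name and handed back unchanged.
            self.tools[func.__name__] = func
            return func

        return decorator


registry = TinyRegistry()


@registry.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b
```

After the module is imported, `registry.tools["add"]` is the same `add` function you wrote, docstring and type hints included, which is all a registry needs to describe it later.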
Everything else (the name, the description, the argument schema) the SDK reads from the function itself: its name, its docstring, its type hints. You never declared any of it separately. !!! tip The two halves of the SDK have two import paths: `from mcp import Client` and `from mcp.server import MCPServer`. There is no `from mcp import MCPServer`. ### Try it Run it with the MCP Inspector: ```console uv run mcp dev server.py ``` Open the URL it prints. The Inspector has one tab per primitive; walk through them in order. **Tools.** One entry: `add`, described as *Add two numbers.* The form has a required integer field for `a` and another for `b`. Fill them in, call it, and the result is `3`. The Inspector built that form from `a: int, b: int`. So does every other client. **Resources.** The *Resources* list is empty. `greeting` is under **Resource Templates**, because `greeting://{name}` has a parameter: there is no single resource to list until someone supplies a `name`. Give it `World` and read it: ```text Hello, World! ``` **Prompts.** One entry: `summarize`, with a single required `text` argument. Get it with some text and you receive one message with `role: user` and your rendered string as the content. That's all a prompt is: a function that builds messages. The Inspector ran your server over **stdio**, one of the transports an MCP server can speak. You don't pick one yet; **[Running your server](https://py.sdk.modelcontextprotocol.io/v2/run/index.md)** is the chapter for that. ## Capabilities You saw three tabs in the Inspector. How did it know there were three? When a client connects, the server declares its **capabilities**: which families of requests it will answer. The client uses that declaration to decide what to even ask for. You never wrote it; `MCPServer` declares it for you. Look at it yourself. 
The SDK's `Client` accepts the server object directly and connects to it **in memory** (no subprocess, no port): ```python import asyncio from mcp import Client from server import mcp async def main() -> None: async with Client(mcp) as client: print(client.server_capabilities.model_dump(exclude_none=True)) asyncio.run(main()) ``` ```text {'prompts': {'list_changed': False}, 'resources': {'subscribe': False, 'list_changed': False}, 'tools': {'list_changed': False}} ``` That dictionary is the server's half of the handshake: | Capability | The client may now call | |-------------|------------------------------------------------------------| | `tools` | `tools/list`, `tools/call` | | `resources` | `resources/list`, `resources/templates/list`, `resources/read` | | `prompts` | `prompts/list`, `prompts/get` | `MCPServer` serves all three primitives, so all three are always declared. Notice what isn't there. `completions` (argument autocomplete for resource templates and prompts) needs a handler you write, this server doesn't have one, so the capability is absent and a well-behaved client won't ask. That's the rule for everything optional: register the thing and the capability appears; **[Completions](https://py.sdk.modelcontextprotocol.io/v2/tutorial/completions/index.md)** proves it. !!! info `Client(mcp)` is the same in-memory client every example in this tutorial is tested with, and it's how you'll test yours. It gets a whole chapter: **[Testing](https://py.sdk.modelcontextprotocol.io/v2/tutorial/testing/index.md)**. ## What you did not write Look back over this page. You wrote three small Python functions. You did **not** write: * A JSON Schema. `a: int, b: int` *is* the schema for `add`. * A request handler. `tools/list`, `resources/read`, `prompts/get`: all served for you. * A capability declaration. `MCPServer` made it for you. * A line of protocol. 
The handshake, the version negotiation, the JSON-RPC framing: all of it happened inside `mcp dev` and `Client(mcp)`, and you never saw it. That ratio is the whole point of the SDK. ## Recap * A **host** is the LLM app, a **client** is its MCP-speaking half, a **server** is what you build. * Tools are **model**-controlled, resources are **application**-controlled, prompts are **user**-controlled. * One decorator per primitive: `@mcp.tool()`, `@mcp.resource(uri)`, `@mcp.prompt()`. Name, description, and schema come from the function. * A URI with a `{param}` makes a resource **template**, listed separately from concrete resources. * The server's **capabilities** are declared for you, and a client only asks for what a server declares. * `Client(mcp)` connects to the server object in memory: your test harness from day one. Each primitive now gets its own chapter, starting with the one the model drives: **[Tools](https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/index.md)**. # Tools Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/ A **tool** is a function the model can call. You declare one by putting `@mcp.tool()` on a plain Python function. That's the whole API. ## Your first tool ```python title="server.py" hl_lines="6-8" # docs_src/tools/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Bookshop") @mcp.tool() def search_books(query: str, limit: int) -> str: """Search the catalog by title or author.""" return f"Found 3 books matching {query!r} (showing up to {limit})." ``` Look at what you wrote. There are no schemas, no JSON, no protocol, just a function. The SDK reads three things from it: * The **name** of the tool is the name of the function: `search_books`. * The **description** the model sees is the docstring: `Search the catalog by title or author.` * The **arguments** the model is allowed to pass come from the type hints: `query: str` and `limit: int`. 
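
Those three lookups are ordinary Python introspection. The sketch below does the same reading with the standard library only; the helper name `describe` is hypothetical, and nothing here is claimed about the SDK's internals.

```python
import inspect
from typing import Any, get_type_hints


def describe(func: Any) -> dict[str, Any]:
    """Hypothetical helper: collect what a function says about itself."""
    hints = get_type_hints(func)
    # The return annotation is not an argument, so leave it out.
    arguments = {name: tp for name, tp in hints.items() if name != "return"}
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "arguments": arguments,
    }


def search_books(query: str, limit: int) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r} (showing up to {limit})."


info = describe(search_books)
```

`info` now holds the name `search_books`, the docstring, and a mapping of `query` to `str` and `limit` to `int`: the raw material a schema can be generated from.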
### The input schema From those type hints the SDK generates a JSON Schema and sends it to the client during `tools/list`: ```json { "type": "object", "properties": { "query": {"title": "Query", "type": "string"}, "limit": {"title": "Limit", "type": "integer"} }, "required": ["query", "limit"], "title": "search_booksArguments" } ``` Both arguments are in `required` because neither has a default. You'll fix that in a moment. (The `title` keys are Pydantic artifacts; the properties, their types, and `required` are the contract.) !!! tip Type hints aren't documentation here. They are **the contract**. If a client sends `"limit": "ten"`, the SDK rejects it before your function ever runs. ### What the model gets back Call the tool with `{"query": "dune", "limit": 5}` and the result has two parts: ```python result.content # [TextContent(text="Found 3 books matching 'dune' (showing up to 5).")] result.structured_content # {'result': "Found 3 books matching 'dune' (showing up to 5)."} ``` `content` is the text the **model** reads. `structured_content` is typed data for the **client application**. It's there because you declared the return type as `-> str`. Don't worry about `structured_content` yet. Return real Python objects from your tools and the right thing happens; the **[Structured Output](https://py.sdk.modelcontextprotocol.io/v2/tutorial/structured-output/index.md)** chapter is all about it. ### Try it Run the server with the MCP Inspector: ```console uv run mcp dev server.py ``` Open the URL it prints, go to the **Tools** tab, and call `search_books`. The Inspector renders a form with a required `query` text field and a required `limit` number field. It built that form from your type hints. So will every other MCP client. ## Optional arguments Give a parameter a default value and it stops being required. That's it. It's just Python. 
```python title="server.py" hl_lines="7" # docs_src/tools/tutorial002.py from mcp.server import MCPServer mcp = MCPServer("Bookshop") @mcp.tool() def search_books(query: str, limit: int = 10) -> str: """Search the catalog by title or author.""" return f"Found 3 books matching {query!r} (showing up to {limit})." ``` The schema follows: ```json { "type": "object", "properties": { "query": {"title": "Query", "type": "string"}, "limit": {"default": 10, "title": "Limit", "type": "integer"} }, "required": ["query"], "title": "search_booksArguments" } ``` `limit` left `required` and gained `"default": 10`. A client that omits it gets `10`, exactly as Python would. ## Richer schemas with `Field` Type hints get you a long way, but sometimes you want to *describe* an argument, or constrain it. Wrap the type in `Annotated` and add a Pydantic `Field`: ```python title="server.py" hl_lines="12-14" # docs_src/tools/tutorial003.py from typing import Annotated, Literal from pydantic import Field from mcp.server import MCPServer mcp = MCPServer("Bookshop") @mcp.tool() def search_books( query: Annotated[str, Field(description="Title or author to search for.")], limit: Annotated[int, Field(ge=1, le=50, description="Maximum number of results.")] = 10, genre: Literal["fiction", "non-fiction", "poetry"] | None = None, ) -> str: """Search the catalog by title or author.""" where = f" in {genre}" if genre else "" return f"Found 3 books matching {query!r}{where} (showing up to {limit})." ``` Three new things, all on the parameters: * `Field(description=...)`: a per-argument description the model reads alongside the docstring. * `Field(ge=1, le=50)`: numeric bounds. They land in the schema as `"minimum": 1, "maximum": 50`. * `Literal["fiction", "non-fiction", "poetry"]`: an enum. The model can only pick one of those. !!! check Constraints are not decoration. 
Call the tool with `limit=999` and the SDK answers with a tool error **before your function runs**: ```text Input should be less than or equal to 50 ``` That error goes back to the model as the tool result, and the model reads it and retries with a valid value. You wrote `le=50` once and got self-correcting agents for free. !!! info If you've used FastAPI or Pydantic, you already know all of this. It's the same `Field`, the same `Annotated`, the same validation. There is nothing MCP-specific to learn here. ## A model as a parameter When a tool takes more than a couple of arguments, group them into a Pydantic model: ```python title="server.py" hl_lines="8-11 15" # docs_src/tools/tutorial004.py from pydantic import BaseModel, Field from mcp.server import MCPServer mcp = MCPServer("Bookshop") class Book(BaseModel): title: str author: str year: int = Field(ge=1450, description="Year of first publication.") @mcp.tool() def add_book(book: Book) -> str: """Add a book to the catalog.""" return f"Added {book.title!r} by {book.author} ({book.year})." ``` The `Book` schema is nested inside the tool's input schema (as a `$defs` reference), the model fills it in as a JSON object, and your function receives a **real `Book` instance**, already validated, with `.title`, `.author` and `.year` attributes. You can mix and match: plain parameters next to model parameters, nested models, lists of models. It's Pydantic all the way down. ## `async def` If a tool does I/O (calls an API, reads a file, queries a database), declare it `async def` and `await` inside it. The SDK awaits it. A plain `def` tool works too: the SDK runs it in a thread so it never blocks the server. There is nothing else to configure. 
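
One way to picture that dispatch is the stdlib-only sketch below: coroutine functions are awaited directly, and plain functions are pushed onto a worker thread. The helper `call_tool_function` is hypothetical, and `asyncio.to_thread` is used purely for illustration; the SDK's own mechanism may differ.

```python
import asyncio
import inspect
import time
from typing import Any


async def call_tool_function(func: Any, **kwargs: Any) -> Any:
    """Hypothetical dispatcher, not the SDK's implementation."""
    if inspect.iscoroutinefunction(func):
        # async def: await it on the event loop.
        return await func(**kwargs)
    # plain def: run it in a worker thread so a blocking call
    # does not stall the event loop.
    return await asyncio.to_thread(func, **kwargs)


async def fetch_title(book_id: int) -> str:
    await asyncio.sleep(0)  # stands in for real async I/O
    return f"Book {book_id}"


def slow_checksum(text: str) -> int:
    time.sleep(0.01)  # stands in for blocking work
    return sum(text.encode())


async def main() -> tuple[str, int]:
    title = await call_tool_function(fetch_title, book_id=7)
    checksum = await call_tool_function(slow_checksum, text="dune")
    return title, checksum


title, checksum = asyncio.run(main())
```

The caller uses one code path for both kinds of function, which is why the tool author has nothing to configure.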
## Names, titles, and annotations Everything the SDK infers, you can override in the decorator: ```python title="server.py" hl_lines="8-11" # docs_src/tools/tutorial005.py from mcp_types import ToolAnnotations from mcp.server import MCPServer mcp = MCPServer("Bookshop") @mcp.tool( title="Search the catalog", annotations=ToolAnnotations(read_only_hint=True, open_world_hint=False), ) def search_books(query: str) -> str: """Search the catalog by title or author.""" return f"Found 3 books matching {query!r}." ``` * `title` is a human-readable name for UIs. Clients show *"Search the catalog"* instead of `search_books`. * `annotations` are behavioural **hints** for the client: * `read_only_hint=True`: this tool doesn't change anything. * `open_world_hint=False`: it works on a closed set of things (this catalog), not the open web. * The other two, `destructive_hint` and `idempotent_hint`, describe a tool that *writes*: may it delete something, and is calling it twice the same as calling it once? The spec defines both only for non-read-only tools, so they would say nothing on `search_books`. A well-behaved client uses them to decide things like *"do I need to ask the user before running this?"*. They are hints, not security. Never rely on a client honouring them. !!! tip `name=` and `description=` are also accepted by `@mcp.tool()` if you don't want to derive them from the function name and docstring. Most of the time you do. ## Recap * `@mcp.tool()` on a function makes it a tool. Name from the function, description from the docstring. * Type hints **are** the input schema. Defaults make arguments optional. * `Annotated[..., Field(...)]` adds descriptions and constraints; `Literal` adds enums. * A Pydantic model parameter is how you take a structured "body". * Bad arguments are rejected for you, with an error the model can read and recover from. * `async def` for I/O, plain `def` for everything else. 
Next up, **[Structured Output](https://py.sdk.modelcontextprotocol.io/v2/tutorial/structured-output/index.md)**: what happens to the value you `return`. # Structured Output Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/structured-output/ In **[Tools](https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/index.md)** you returned a `str` and the result came back twice: as text in `content`, and as `{"result": "..."}` in `structured_content`. This chapter is about that second channel: where it comes from, every shape it can take, and how the SDK keeps it honest. The short version: **the return type annotation is the output schema**. You already wrote it. ## The output schema ```python title="server.py" hl_lines="9" # docs_src/structured_output/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Weather") READINGS = {"London": 17, "Cairo": 34, "Reykjavik": 4} @mcp.tool() def get_temperature(city: str) -> int: """Current temperature in a city, in whole degrees Celsius.""" return READINGS[city] ``` The line that matters is the signature: `-> int`. Because of it, the tool the SDK sends during `tools/list` carries an `output_schema` next to the input schema you met in **[Tools](https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/index.md)**: ```json { "properties": { "result": {"title": "Result", "type": "integer"} }, "required": ["result"], "title": "get_temperatureOutput", "type": "object" } ``` A bare `int` isn't a JSON object, so the SDK **wraps** it in `{"result": ...}`. Call the tool and both channels are filled: ```python result.content # [TextContent(text="17")] result.structured_content # {"result": 17} ``` Every scalar gets the same wrapper: `str`, `int`, `float`, `bool`, `bytes`, `None`. ## Two channels Why send the same value twice? * `content` is for the **model**. A language model reads text; this is the only part of the result it sees. 
* `structured_content` is for the **application** the model runs inside: code that wants `17`, not a sentence containing "17". * `output_schema` is the contract between them, published before the tool is ever called. You return one Python value. The SDK fills in all three. ## Return a model Declare the shape as a Pydantic `BaseModel` and return an instance: ```python title="server.py" hl_lines="8-11 15" # docs_src/structured_output/tutorial002.py from pydantic import BaseModel, Field from mcp.server import MCPServer mcp = MCPServer("Weather") class WeatherData(BaseModel): temperature: float = Field(description="Degrees Celsius.") humidity: float = Field(description="Relative humidity, 0 to 1.") conditions: str @mcp.tool() def get_weather(city: str) -> WeatherData: """Current weather for a city.""" return WeatherData(temperature=16.2, humidity=0.83, conditions="Overcast") ``` `WeatherData` **is** the schema now. No wrapper, no `result` key: ```json { "properties": { "temperature": {"description": "Degrees Celsius.", "title": "Temperature", "type": "number"}, "humidity": {"description": "Relative humidity, 0 to 1.", "title": "Humidity", "type": "number"}, "conditions": {"title": "Conditions", "type": "string"} }, "required": ["temperature", "humidity", "conditions"], "title": "WeatherData", "type": "object" } ``` `structured_content` is the object, field for field: ```python result.structured_content # {"temperature": 16.2, "humidity": 0.83, "conditions": "Overcast"} ``` And the model is not left out. The SDK serializes the same object to JSON text for `content`: ```json { "temperature": 16.2, "humidity": 0.83, "conditions": "Overcast" } ``` Notice the `Field(description=...)` on `temperature` and `humidity` landed in the schema. The same `Field` that described your **inputs** describes your outputs. !!! info If you've used FastAPI's `response_model`, you already know this: a Pydantic model as the declared response, serialized and documented for you. 
The only difference is that here the return annotation is the whole declaration. ## A `TypedDict` Not every shape deserves a class. A `TypedDict` produces the same schema: ```python title="server.py" hl_lines="8" # docs_src/structured_output/tutorial003.py from typing import TypedDict from mcp.server import MCPServer mcp = MCPServer("Weather") class WeatherData(TypedDict): temperature: float humidity: float conditions: str @mcp.tool() def get_weather(city: str) -> WeatherData: """Current weather for a city.""" return WeatherData(temperature=16.2, humidity=0.83, conditions="Overcast") ``` A `TypedDict` is a plain `dict` at runtime, so that is what you build and return. The schema, the validation, and `structured_content` are identical to the `BaseModel` version (minus the descriptions, which `TypedDict` has no place for). ## A dataclass Dataclasses work too, and so does any ordinary class whose attributes have type hints. The SDK builds a Pydantic model out of the annotations behind the scenes. ```python title="server.py" hl_lines="8-9" # docs_src/structured_output/tutorial004.py from dataclasses import dataclass from mcp.server import MCPServer mcp = MCPServer("Weather") @dataclass class WeatherData: temperature: float humidity: float conditions: str @mcp.tool() def get_weather(city: str) -> WeatherData: """Current weather for a city.""" return WeatherData(temperature=16.2, humidity=0.83, conditions="Overcast") ``` Three spellings, one schema. Use whichever your codebase already has. 
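
The reason the spellings can converge is that each one carries the same class-level annotations. The sketch below checks that with `typing.get_type_hints` for a `TypedDict`, a dataclass, and an ordinary annotated class. The class names are hypothetical, and this only shows the shared starting point, not the Pydantic model the SDK builds from it.

```python
from dataclasses import dataclass
from typing import TypedDict, get_type_hints


class WeatherDict(TypedDict):
    temperature: float
    humidity: float
    conditions: str


@dataclass
class WeatherDataclass:
    temperature: float
    humidity: float
    conditions: str


class WeatherPlain:
    # An ordinary class: annotations on the body, no decorator, no base class.
    temperature: float
    humidity: float
    conditions: str


expected = {"temperature": float, "humidity": float, "conditions": str}
hints = [get_type_hints(cls) for cls in (WeatherDict, WeatherDataclass, WeatherPlain)]
```

Every entry in `hints` equals `expected`, so a schema generator that reads annotations sees the same three fields whichever spelling you pick.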
## Lists A `list[...]` isn't a JSON object either, so it gets the `{"result": ...}` wrapper, with your item type as a `$defs` reference inside it: ```python title="server.py" hl_lines="15" # docs_src/structured_output/tutorial005.py from pydantic import BaseModel from mcp.server import MCPServer mcp = MCPServer("Weather") class WeatherData(BaseModel): temperature: float humidity: float conditions: str @mcp.tool() def get_forecast(city: str, days: int) -> list[WeatherData]: """Daily forecast for a city.""" return [WeatherData(temperature=16.2 + day, humidity=0.83, conditions="Overcast") for day in range(days)] ``` ```json { "$defs": { "WeatherData": { "properties": { "temperature": {"title": "Temperature", "type": "number"}, "humidity": {"title": "Humidity", "type": "number"}, "conditions": {"title": "Conditions", "type": "string"} }, "required": ["temperature", "humidity", "conditions"], "title": "WeatherData", "type": "object" } }, "properties": { "result": {"items": {"$ref": "#/$defs/WeatherData"}, "title": "Result", "type": "array"} }, "required": ["result"], "title": "get_forecastOutput", "type": "object" } ``` Ask for a two-day forecast and `structured_content` is `{"result": [{...}, {...}]}`. `content` becomes **two** `TextContent` blocks, one per item: a list is flattened for the model rather than dumped as one string. `tuple[...]`, unions, and `Optional[...]` are wrapped the same way. 
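
As a rough picture of those two channels for a list return, here is a stdlib-only sketch. `sketch_list_result` is a hypothetical helper, the plain dicts are simplified stand-ins for the SDK's result and `TextContent` types, and the exact text the SDK puts in each block is an assumption (compact `json.dumps` output is used here).

```python
import json
from typing import Any


def sketch_list_result(items: list[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical illustration of the two channels for a list return."""
    return {
        # Application channel: a JSON array is not a JSON object,
        # so the list sits under a "result" key.
        "structured_content": {"result": items},
        # Model channel: one text block per item rather than one big string.
        "content": [{"type": "text", "text": json.dumps(item)} for item in items],
    }


forecast = [
    {"temperature": 16.2, "humidity": 0.83, "conditions": "Overcast"},
    {"temperature": 17.2, "humidity": 0.83, "conditions": "Overcast"},
]
result = sketch_list_result(forecast)
```

With a two-day forecast, `result["content"]` has two entries and `result["structured_content"]` is the wrapped list, matching the shape described above.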
## Dictionaries `dict[str, ...]` is the one generic that already *is* a JSON object, so it isn't wrapped: ```python title="server.py" hl_lines="9" # docs_src/structured_output/tutorial006.py from mcp.server import MCPServer mcp = MCPServer("Weather") READINGS = {"London": 16.2, "Cairo": 34.1, "Reykjavik": 4.4} @mcp.tool() def get_temperatures(cities: list[str]) -> dict[str, float]: """Current temperature for each city, in degrees Celsius.""" return {city: READINGS[city] for city in cities} ``` ```json { "additionalProperties": {"type": "number"}, "title": "get_temperaturesDictOutput", "type": "object" } ``` ```python result.structured_content # {"London": 16.2, "Reykjavik": 4.4} ``` The keys must be `str`. A `dict[int, float]` can't be a JSON object, so it falls back to the `{"result": ...}` wrapper. ## Validation `output_schema` is not documentation. Whatever your function returns is **validated against it** before it leaves the server. You don't notice while you build the value by hand: Pydantic already made sure your `WeatherData` was a `WeatherData`. You notice the day the data comes from somewhere you don't control: ```python title="server.py" hl_lines="9 21" # docs_src/structured_output/tutorial007.py import json from pydantic import BaseModel from mcp.server import MCPServer mcp = MCPServer("Weather") UPSTREAM = {"London": '{"temperature": 16.2, "conditions": "Overcast"}'} class WeatherData(BaseModel): temperature: float humidity: float conditions: str @mcp.tool() def get_weather(city: str) -> WeatherData: """Current weather for a city.""" return json.loads(UPSTREAM[city]) ``` The annotation promises `WeatherData`. The upstream response stopped sending `humidity`. !!! check Call `get_weather` and it does not quietly hand the client a half-empty object. 
The call fails, and the first lines of the error name the field: ```text Error executing tool get_weather: 1 validation error for WeatherData humidity Field required [type=missing, input_value={'temperature': 16.2, 'conditions': 'Overcast'}, input_type=dict] ``` That text comes back as the tool result with `is_error=True`, so the model knows the call failed instead of confidently reading weather that isn't there. Returning a plain `dict` from a `-> WeatherData` tool is fine, by the way. That's exactly what `json.loads` produced. Validation is on the value, not on the Python type. ## Opting out Sometimes the return annotation is for your type checker, not for the protocol. Pass `structured_output=False` and the tool is text-only: ```python title="server.py" hl_lines="6" # docs_src/structured_output/tutorial008.py from mcp.server import MCPServer mcp = MCPServer("Weather") @mcp.tool(structured_output=False) def weather_report(city: str) -> str: """A human-readable weather report for a city.""" return f"{city}: 17 degrees, overcast, light rain easing by evening." ``` No `output_schema`, no wrapping, no validation. `structured_content` is `None` and `content` is the string you returned. The opposite, `structured_output=True`, turns the automatic detection into a requirement: a tool whose return type can't produce a schema raises at import time instead of falling back to text. ## A class without type hints There is one way to end up unstructured without asking for it: return a class that has **no annotations on its body**. 
```python title="server.py" hl_lines="6-9" # docs_src/structured_output/tutorial009.py from mcp.server import MCPServer mcp = MCPServer("Weather") class Station: def __init__(self, name: str, online: bool): self.name = name self.online = online @mcp.tool() def get_station(name: str) -> Station: """Look up a weather station by name.""" return Station(name=name, online=True) ``` `Station` sets `name` and `online` inside `__init__`, but the *class* declares nothing. The SDK reads class annotations, finds none, and gives up. !!! warning It gives up **silently**. `output_schema` is `None`, `structured_content` is `None`, and the text the model reads is the object's `repr`: ```text "<server.Station object at 0x...>" ``` No error, no warning, a useless tool. Move the annotations onto the class body, or pass `structured_output=True`, which turns this into a hard error the moment the module imports: `Function get_station: return type is not serializable for structured output`. !!! tip Need full control (building the `CallToolResult` yourself, or attaching `_meta` that the application can see but the model can't)? That's **[The low-level Server](https://py.sdk.modelcontextprotocol.io/v2/advanced/low-level-server/index.md)**. ## Recap * The **return type annotation** is the output schema. It's published in `tools/list` as `output_schema`. * Scalars, lists, tuples and unions are wrapped in `{"result": ...}`. Models, `TypedDict`s, dataclasses, annotated classes and `dict[str, ...]` are objects already and stay as they are. * Every result carries `content` (text, for the model) **and** `structured_content` (data, for the application). * What you return is validated against the schema. A mismatch is a tool error, not a corrupt result. * `structured_output=False` opts a tool out. A class without type hints opts out silently; watch for it. You now own everything a tool can say back. Next, the second primitive: **[Resources](https://py.sdk.modelcontextprotocol.io/v2/tutorial/resources/index.md)**.
# Resources Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/resources/ A **resource** is data you expose for the application to read. That's the split. A tool is something the **model** decides to call. A resource is something the **application** decides to load (a config file, a record, a document) and put in front of the model as context. You declare one by putting `@mcp.resource(uri)` on a plain Python function. ## Your first resource ```python title="server.py" hl_lines="6-8" # docs_src/resources/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Bookshop") @mcp.resource("config://app") def get_config() -> str: """The active shop configuration.""" return "theme=dark\nlanguage=en" ``` It's the same shape as a tool, plus one thing: the **URI**. Resources are addressed, not named. A client asks for `config://app`, never for `get_config`. The SDK still reads the rest from the function: * The **name** is the function name: `get_config`. * The **description** the client sees is the docstring. * The **content** is whatever you return. During `resources/list` the client gets this: ```json { "name": "get_config", "uri": "config://app", "description": "The active shop configuration.", "mimeType": "text/plain" } ``` And when it reads `config://app`, your function runs and the return value comes back as text: ```python result.contents # [TextResourceContents(uri="config://app", mime_type="text/plain", text="theme=dark\nlanguage=en")] ``` !!! tip Listing is cheap. Your function is **not** called during `resources/list`, only during `resources/read`, and only for the URI that was asked for. Expose a thousand resources and you pay for the ones somebody opens. ### Try it Run the server with the MCP Inspector: ```console uv run mcp dev server.py ``` Open the URL it prints and go to the **Resources** tab. `config://app` is in the list with its description. Click it and the Inspector reads it: there are your two lines of config. 
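
The earlier point that listing never calls your function can be pictured with a small stdlib-only registry. `TinyResources` and its methods are hypothetical names for this sketch; it illustrates the list-versus-read split, not the SDK's actual resource manager.

```python
from collections.abc import Callable


class TinyResources:
    """Hypothetical registry, only to illustrate list-versus-read."""

    def __init__(self) -> None:
        self._readers: dict[str, Callable[[], str]] = {}

    def resource(self, uri: str) -> Callable[[Callable[[], str]], Callable[[], str]]:
        def decorator(func: Callable[[], str]) -> Callable[[], str]:
            # Store the function under its URI; do not call it yet.
            self._readers[uri] = func
            return func

        return decorator

    def list_uris(self) -> list[str]:
        # Listing only looks at the keys; no reader function runs.
        return list(self._readers)

    def read(self, uri: str) -> str:
        # Reading calls exactly one function: the one registered for this URI.
        return self._readers[uri]()


calls: list[str] = []
resources = TinyResources()


@resources.resource("config://app")
def get_config() -> str:
    calls.append("get_config")  # record that the function actually ran
    return "theme=dark\nlanguage=en"


listed = resources.list_uris()
calls_after_list = list(calls)  # snapshot taken before any read
text = resources.read("config://app")
```

`calls_after_list` is still empty after listing, and `calls` gains one entry only once the URI is read, which is the cost model described in the tip.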
## Resource templates One URI per record doesn't scale. Put a **placeholder** in the URI and a matching parameter on the function: ```python title="server.py" hl_lines="12-13" # docs_src/resources/tutorial002.py from mcp.server import MCPServer mcp = MCPServer("Bookshop") @mcp.resource("config://app") def get_config() -> str: """The active shop configuration.""" return "theme=dark\nlanguage=en" @mcp.resource("users://{user_id}/profile") def get_user_profile(user_id: str) -> str: """A customer's profile.""" return f"User {user_id}: 12 orders since 2021." ``` `{user_id}` in the URI, `user_id: str` on the function. That is the entire contract. This is now a **resource template**, and it moves house: it leaves `resources/list` and shows up in `resources/templates/list` instead, as a pattern rather than an address: ```json { "name": "get_user_profile", "uriTemplate": "users://{user_id}/profile", "description": "A customer's profile.", "mimeType": "text/plain" } ``` The client fills in the placeholder and reads a concrete URI: `users://42/profile`, `users://ada/profile`. One function answers all of them, with the matched value passed in as `user_id`: ```python result.contents # [TextResourceContents(uri="users://42/profile", text="User 42: 12 orders since 2021.")] ``` Notice the `uri` in the result. It is the **concrete** URI the client asked for, not the template. !!! check The placeholders and the parameters have to agree. Rename the function parameter to `user` while the URI still says `{user_id}` and the decorator refuses **at import time**, before any client gets near it: ```text ValueError: Mismatch between URI parameters {'user_id'} and function parameters {'user'} ``` A mismatch can only ever be a bug, so the SDK makes it impossible to start the server with one. The placeholder syntax is [RFC 6570](https://datatracker.ietf.org/doc/html/rfc6570): `{+path}` for multi-segment values, `{?q,lang}` for optional query parameters, and more. 
The SDK also applies path-safety checks to extracted values by default. See **[URI templates and path safety](https://py.sdk.modelcontextprotocol.io/v2/advanced/uri-templates/index.md)** for the full reference. `get_user_profile` can also take a parameter annotated `Context`. The SDK injects it without ever treating it as a URI parameter, and **[The Context](https://py.sdk.modelcontextprotocol.io/v2/tutorial/context/index.md)** chapter covers what it gives you. ## What you return You're not limited to `str`. Give each resource a `mime_type` and return whatever fits: ```python title="server.py" hl_lines="8-9 14-15 20-21" # docs_src/resources/tutorial003.py import base64 from mcp.server import MCPServer mcp = MCPServer("Bookshop") @mcp.resource("docs://readme", mime_type="text/markdown") def readme() -> str: """How to use this server.""" return "# Bookshop\n\nSearch the catalog with the `search_books` tool." @mcp.resource("stats://catalog", mime_type="application/json") def catalog_stats() -> dict[str, int]: """Live counts for the catalog.""" return {"books": 1204, "authors": 391} @mcp.resource("covers://placeholder", mime_type="image/gif") def placeholder_cover() -> bytes: """A 1x1 transparent GIF, shown when a book has no cover.""" return base64.b64decode("R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7") ``` * `readme` returns a `str`, so it's sent as-is. This is the common case. * `catalog_stats` returns a `dict`, so the SDK serialises it to **JSON text** for you: ```json { "books": 1204, "authors": 391 } ``` * `placeholder_cover` returns `bytes`, so the client gets a `BlobResourceContents` instead of a `TextResourceContents`, with your bytes base64-encoded in its `blob` field. The same rule applies to anything else JSON-serialisable: a list, a Pydantic model, a dataclass. If it isn't a `str` and isn't `bytes`, it becomes JSON. `mime_type` is yours to declare, and it defaults to `text/plain`. 
The SDK never inspects what you return to guess it, so a `dict` resource you don't label is still advertised as plain text. !!! tip `name=`, `title=` and `description=` are also accepted by `@mcp.resource()` when you don't want to derive them from the function. And when there's no function to write at all, `mcp.server.mcpserver.resources` has ready-made `Resource` classes (`TextResource`, `BinaryResource`, `FileResource`, `HttpResource`, `DirectoryResource`) that you register with `mcp.add_resource(...)`. A client can also **subscribe** to a resource and be notified when it changes; that's the client's half of the story and it lives in **[The Client](https://py.sdk.modelcontextprotocol.io/v2/client/index.md)**. ## Recap * `@mcp.resource(uri)` on a function makes it a resource. The URI is the address, the return value is the content, the docstring is the description. * A `{placeholder}` in the URI makes it a **template**: it's listed under `resources/templates/list` and one function serves every URI that matches. * Placeholder names must equal the function's parameter names. Get it wrong and you find out at import time, not in production. * Your function runs when the resource is **read**, not when it's listed. * `str` becomes text, `bytes` becomes a base64 blob, anything else becomes JSON text. `mime_type=` is how you label it. * Tools are for the model to act. Resources are for the application to read. Next: the third primitive, the one a person picks from a menu, **[Prompts](https://py.sdk.modelcontextprotocol.io/v2/tutorial/prompts/index.md)**. # Prompts Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/prompts/ A **prompt** is a message template the user picks. Tools are for the model. A prompt is the opposite: the user chooses one from a menu in their client (a slash command, a button), fills in its arguments, and the rendered messages go into the conversation as if they had typed them. 
You declare one by putting `@mcp.prompt()` on a function that returns the text. ## Your first prompt ```python title="server.py" hl_lines="6-9" # docs_src/prompts/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Code Helper") @mcp.prompt() def review_code(code: str) -> str: """Review a piece of code.""" return f"Please review this code:\n\n{code}" ``` The SDK reads the same three things it read from your tools: * The **name** is the function name: `review_code`. * The **description** the client shows is the docstring: `Review a piece of code.` * The **arguments** come from the parameters. `code` has no default, so it's required. That is what a client gets back from `prompts/list`: ```json { "name": "review_code", "description": "Review a piece of code.", "arguments": [ {"name": "code", "required": true} ] } ``` There is no JSON Schema here. Prompt arguments are a flat list of **named string values**: a form a person fills in, not a payload a model constructs. ### Rendering it The client renders the template with `prompts/get`, passing the arguments. Your function runs and the `str` you return becomes **one user message**: ```json { "description": "Review a piece of code.", "messages": [ { "role": "user", "content": { "type": "text", "text": "Please review this code:\n\ndef add(a, b): return a + b" } } ], "resultType": "complete" } ``` That is the entire life of a prompt: listed by name, rendered on demand, dropped into the chat. !!! check `required` is enforced before your function runs. Render `review_code` without `code` and the request itself fails with a JSON-RPC error (code `-32603`): ```text mcp.shared.exceptions.MCPError: Internal server error ``` There is no tool-style error result to hand back to a model, because no model is in the loop: the call raises. The reason (`Missing required arguments: {'code'}`) lands in your server's log. 
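The two rules above (arguments come from the parameters, and `required` is checked before the function runs) can be sketched with nothing but `inspect`. This is an illustration, not the SDK's code: `prompt_arguments` and `render_prompt` are hypothetical helpers, and a plain `ValueError` stands in for the `MCPError` the real server raises.

```python
import inspect
from typing import Any, Callable


def prompt_arguments(fn: Callable[..., str]) -> list[dict[str, Any]]:
    """Build a prompts/list-style argument list from a function signature."""
    return [
        # A parameter without a default is required.
        {"name": name, "required": param.default is inspect.Parameter.empty}
        for name, param in inspect.signature(fn).parameters.items()
    ]


def render_prompt(fn: Callable[..., str], arguments: dict[str, str]) -> dict[str, Any]:
    """Check required arguments first, then turn the returned str into one user message."""
    required = {arg["name"] for arg in prompt_arguments(fn) if arg["required"]}
    missing = required - set(arguments)
    if missing:
        # Stand-in for the protocol error; the function body never runs.
        raise ValueError(f"Missing required arguments: {missing}")
    return {"role": "user", "content": {"type": "text", "text": fn(**arguments)}}


def review_code(code: str) -> str:
    """Review a piece of code."""
    return f"Please review this code:\n\n{code}"


listed = prompt_arguments(review_code)
message = render_prompt(review_code, {"code": "def add(a, b): return a + b"})
```

Calling `render_prompt(review_code, {})` raises before `review_code` is entered, which mirrors the behaviour described in the check above.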
### Try it Run the server with the MCP Inspector: ```console uv run mcp dev server.py ``` Open the **Prompts** tab and select `review_code`. The Inspector draws a form with one required `code` field. Fill it in, render it, and you get back exactly the user message above. ## More than one message A code review is one message. A debugging session is a conversation, and a prompt can seed the whole thing. Return a list of messages instead of a `str`: ```python title="server.py" hl_lines="2 13-20" # docs_src/prompts/tutorial002.py from mcp.server import MCPServer from mcp.server.mcpserver.prompts.base import AssistantMessage, Message, UserMessage mcp = MCPServer("Code Helper") @mcp.prompt() def review_code(code: str) -> str: """Review a piece of code.""" return f"Please review this code:\n\n{code}" @mcp.prompt() def debug_error(error: str) -> list[Message]: """Start a debugging conversation.""" return [ UserMessage("I'm seeing this error:"), UserMessage(error), AssistantMessage("I'll help debug that. What have you tried so far?"), ] ``` * `UserMessage` and `AssistantMessage` come from `mcp.server.mcpserver.prompts.base`. Hand them a `str` and they wrap it in `TextContent` for you. The role is the class name. * `Message` is their common base. Use it as the return annotation. Rendering `debug_error` now produces three messages, in order: ```json { "description": "Start a debugging conversation.", "messages": [ {"role": "user", "content": {"type": "text", "text": "I'm seeing this error:"}}, {"role": "user", "content": {"type": "text", "text": "TypeError: 'int' object is not iterable"}}, { "role": "assistant", "content": {"type": "text", "text": "I'll help debug that. What have you tried so far?"} } ], "resultType": "complete" } ``` Notice the last one. Pre-filling an `assistant` turn is how you steer the model's *next* reply without making the user type the steering themselves. ## Titles and argument descriptions `review_code` is a function name, not a label. 
Give the client something better to put on the button, and describe each argument so the form explains itself: ```python title="server.py" hl_lines="10-13" # docs_src/prompts/tutorial003.py from typing import Annotated from pydantic import Field from mcp.server import MCPServer mcp = MCPServer("Code Helper") @mcp.prompt(title="Code review") def review_code( code: Annotated[str, Field(description="The code to review.")], language: Annotated[str, Field(description="The language the code is written in.")] = "python", ) -> str: """Review a piece of code.""" return f"Please review this {language} code:\n\n{code}" ``` * `title="Code review"` is the human-readable name, exactly like a tool's `title`. * `Annotated[str, Field(description=...)]` is the same pattern you used in **[Tools](https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/index.md)**. Here the description lands on the argument instead of in a schema. * `language` has a default, so it stops being required. The `prompts/list` entry now carries everything a client needs to draw a good form: ```json { "name": "review_code", "title": "Code review", "description": "Review a piece of code.", "arguments": [ {"name": "code", "description": "The code to review.", "required": true}, {"name": "language", "description": "The language the code is written in.", "required": false} ] } ``` !!! info If you have read **[Tools](https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/index.md)**, you already know everything on this page. Same decorator, same docstring-as-description, same `Annotated`/`Field`. The only things that change are who triggers it (the user) and where the result goes (into the conversation). ## Recap * `@mcp.prompt()` on a function makes it a prompt. Name from the function, description from the docstring. * Prompts are **user-controlled**: the client lists them, the user picks one and fills in the arguments. * Arguments are a flat list of named strings (no schema). 
A parameter with a default is optional. * Return a `str` and it becomes one user message. Return a list of `UserMessage` / `AssistantMessage` to seed a multi-turn conversation. * `title=` and `Field(description=...)` are what a client puts in its UI. * A missing required argument fails the whole request. There is no per-prompt error result. Next up: the one extra parameter a tool, resource or prompt can ask the SDK for, **[The Context](https://py.sdk.modelcontextprotocol.io/v2/tutorial/context/index.md)**. # The Context Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/context/ A tool's arguments come from the model. Everything else (the request you are serving, the server you live in, a way to talk back to the client) comes from one object: the **`Context`**. You don't construct it and you don't configure it. You ask for it. ## Ask for it Add a parameter annotated with `Context` to any tool: ```python title="server.py" hl_lines="2 8" # docs_src/context/tutorial001.py from mcp.server import MCPServer from mcp.server.mcpserver import Context mcp = MCPServer("Bookshop") @mcp.tool() def search_books(query: str, ctx: Context) -> str: """Search the catalog by title or author.""" return f"[request {ctx.request_id}] Found 3 books matching {query!r}." ``` * The SDK builds a fresh `Context` for every request and passes it in. * The parameter **name doesn't matter**. `ctx`, `context`, `c`: the SDK finds it by its annotation. * Resources and prompts can declare one too, the same way. * `ctx.request_id` is the id of the request your function is serving right now. !!! info If you've used FastAPI, you've seen this move: declare a parameter with the framework's own type (`Request` there, `Context` here) and the framework supplies it. Nothing to register, nothing to configure: the type annotation is the whole mechanism. ### Invisible to the model This is the part to internalise. 
Here is the input schema `tools/list` reports for `search_books`: ```json { "type": "object", "properties": { "query": {"title": "Query", "type": "string"} }, "required": ["query"], "title": "search_booksArguments" } ``` One property. `ctx` is not an argument: it never appears in the schema, the model is never told about it, and no client can fill it in. It's a contract between you and the SDK, invisible on the wire. ### Try it Run the server with the MCP Inspector: ```console uv run mcp dev server.py ``` The form for `search_books` has a single `query` field. Call it with `dune`: ```text [request 3] Found 3 books matching 'dune'. ``` The number is whichever request this happened to be. Call the tool again and it changes: every request gets its own `Context`. ## What it gives you The injected object is small. Besides `request_id`: * `await ctx.read_resource(uri)`: read one of the server's **own** resources from inside a tool. The next section. * `await ctx.report_progress(progress, total, message)`: stream progress back to the caller during a long call. The whole story is in **[Progress](https://py.sdk.modelcontextprotocol.io/v2/tutorial/progress/index.md)**. * `await ctx.elicit(message, schema)` and `await ctx.elicit_url(...)`: pause the tool and ask the user a question. That's **[Elicitation](https://py.sdk.modelcontextprotocol.io/v2/tutorial/elicitation/index.md)**. * `ctx.session`: the server's side of the conversation with this client. Notifications you send to the client live here; the last section uses it. * `ctx.headers`: the request headers the transport carried, or `None` on stdio. Read a custom header with `(ctx.headers or {}).get("x-...")`. Headers are client-supplied input - fine for a locale or a feature flag, never an identity. * `ctx.request_context`: the raw per-request record. 
The field you'll reach for is `lifespan_context`, the object your startup code yielded (see **[Lifespan](https://py.sdk.modelcontextprotocol.io/v2/tutorial/lifespan/index.md)**). Logging is deliberately not on that list. A server logs with Python's `logging` module, like any other Python program. **[Logging](https://py.sdk.modelcontextprotocol.io/v2/tutorial/logging/index.md)** is the short chapter on why. !!! tip Injection only happens for the function you registered. A helper that your tool calls doesn't get its own `Context`; pass `ctx` down as an ordinary argument. There is no ambient "current context" to fetch from somewhere else. ## Read your own resources A server's resources aren't only for clients. A tool can read them too: ```python title="server.py" hl_lines="16" # docs_src/context/tutorial002.py from mcp.server import MCPServer from mcp.server.mcpserver import Context mcp = MCPServer("Bookshop") @mcp.resource("catalog://genres") def genres() -> str: """The genres the catalog is organised into.""" return "fiction, non-fiction, poetry" @mcp.tool() async def describe_catalog(ctx: Context) -> str: """Describe how the catalog is organised.""" [contents] = await ctx.read_resource("catalog://genres") return f"The catalog is organised into: {contents.content}" ``` `ctx.read_resource` resolves the URI through the same registry that serves `resources/read`, so a tool gets what a client would get: an iterable of `ReadResourceContents`, one per content block. For this URI there is one: ```python contents.content # 'fiction, non-fiction, poetry' contents.mime_type # 'text/plain' ``` * `content` is exactly what `genres()` returned. One source of truth: the client browses the resource, your tools consume it, nobody copies the string. * `describe_catalog`'s only parameter is the `Context`, so its input schema has **no properties at all**. The model calls it with `{}`. ## Tell the client the list changed What a server offers is not fixed at import time. 
Register a tool at runtime, then tell the client: ```python title="server.py" hl_lines="15-16" # docs_src/context/tutorial003.py from mcp.server import MCPServer from mcp.server.mcpserver import Context mcp = MCPServer("Bookshop") def recommend_book(genre: str) -> str: """Recommend a book in the given genre.""" return f"In {genre}, try 'Dune'." @mcp.tool() async def enable_recommendations(ctx: Context) -> str: """Switch on the recommendation tool.""" mcp.add_tool(recommend_book) await ctx.session.send_tool_list_changed() return "Recommendations are now available." ``` * `mcp.add_tool(recommend_book)` registers a plain function as a tool: name, description and schema derived exactly as `@mcp.tool()` would have. * `await ctx.session.send_tool_list_changed()` sends `notifications/tools/list_changed`. A client that receives it calls `tools/list` again and sees `recommend_book`. The siblings are `send_resource_list_changed()`, `send_prompt_list_changed()`, and `send_resource_updated(uri)` for a change to one specific resource. !!! check Before anyone runs `enable_recommendations`, the tool you are promising does not exist. Call it anyway and the result is an error the model can read: ```text Unknown tool: recommend_book ``` Run `enable_recommendations`, and the very same call succeeds. The tool list is genuinely dynamic: `tools/list` reflects whatever is registered *right now*. ## Recap * Annotate a parameter with `Context` (in a tool, a resource, or a prompt) and the SDK injects it. The name is yours. * It is invisible to the model: the input schema only ever contains your real arguments. * `ctx.request_id` identifies the request; `ctx.request_context.lifespan_context` is what your startup yielded. * `await ctx.read_resource(uri)` lets a tool read the server's own resources. * `ctx.session` is the channel back to the client: `send_tool_list_changed()` and its siblings tell it to re-fetch a list you changed. 
* Progress reporting and elicitation also start at `Context`; each has its own chapter. Next: parameters the model never sees, filled by your own functions, in **[Dependencies](https://py.sdk.modelcontextprotocol.io/v2/tutorial/dependencies/index.md)**. # Dependencies Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/dependencies/ A tool's arguments come from the model. Some values never should: a price looked up from your records, a confirmation only a person can give, anything the model could get wrong by inventing it. **Dependencies** are parameters filled by your own functions. You annotate the parameter, name the function, and the SDK calls it before your tool runs. ## Declare one Wrap the parameter's type in `Annotated[...]` and add `Resolve(fn)`: ```python title="server.py" hl_lines="18-19 23" # docs_src/dependencies/tutorial001.py from typing import Annotated from pydantic import BaseModel from mcp.server import MCPServer from mcp.server.mcpserver import Resolve mcp = MCPServer("Bookshop") INVENTORY = {"Dune": 7, "Neuromancer": 0} class Stock(BaseModel): title: str copies: int async def check_stock(title: str) -> Stock: return Stock(title=title, copies=INVENTORY.get(title, 0)) @mcp.tool() async def reserve_book(title: str, stock: Annotated[Stock, Resolve(check_stock)]) -> str: """Reserve a copy of a book.""" if stock.copies == 0: return f"{title!r} is out of stock." return f"Reserved {title!r} ({stock.copies - 1} copies left)." ``` * `check_stock` is a **resolver**: a plain function the SDK runs before `reserve_book`, whose return value becomes the `stock` argument. * Its `title` parameter is the tool's own `title` argument, matched **by name**. The resolver sees exactly the validated value the tool body will see. * The tool body starts from a `Stock` that already exists. No lookup code in the tool, no "what if it's missing" preamble. !!! info If you've used FastAPI, this is `Depends`. 
Same move, same reason: the function declares what it needs, the framework supplies it, and the wiring lives in the type annotation. ### Invisible to the model Here is the input schema `tools/list` reports for `reserve_book`: ```json { "type": "object", "properties": { "title": {"title": "Title", "type": "string"} }, "required": ["title"], "title": "reserve_bookArguments" } ``` One property. Like the `Context` in **[The Context](https://py.sdk.modelcontextprotocol.io/v2/tutorial/context/index.md)**, a resolved parameter is a contract between you and the SDK: `stock` is not in the schema, the model is never told about it, and a client that sends a `stock` value anyway is ignored. The resolver's value is the only one your tool can receive. That last part is the point. A parameter the model cannot supply is a parameter the model cannot get wrong. ### Try it Run the server with the MCP Inspector: ```console uv run mcp dev server.py ``` The form for `reserve_book` has a single `title` field. `stock` is nowhere on it. Call it with `Dune`: ```text Reserved 'Dune' (6 copies left). ``` The tool body never looked anything up: `check_stock` ran first, and the `Stock` it returned arrived as an argument. Try `Neuromancer` and the same resolver hands the tool a zero. !!! tip You could just call `check_stock(title)` in the tool body. Declare it as a dependency when the value deserves more than a helper call: every tool that needs stock declares the same parameter, and the SDK runs the resolver at most once per call, no matter how many declare it. The next sections add the rest: resolvers that depend on each other, and resolvers that ask the user. 
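To make "invisible to the model" concrete, here is a rough sketch of how a framework could separate model-visible parameters from resolved ones by reading `Annotated` metadata. It is an assumption-laden toy, not the SDK's code: `ToyResolve`, `split_parameters` and `call_tool` are hypothetical, the resolver is synchronous for brevity, and it returns a bare `int` instead of a `Stock` model.

```python
import inspect
from dataclasses import dataclass
from typing import Annotated, Any, Callable, get_args, get_origin, get_type_hints


@dataclass(frozen=True)
class ToyResolve:
    """Hypothetical marker playing the role of Resolve(fn)."""

    fn: Callable[..., Any]


def split_parameters(tool: Callable[..., Any]) -> tuple[list[str], dict[str, Callable[..., Any]]]:
    """Return (model-visible parameter names, {resolved name: resolver})."""
    hints = get_type_hints(tool, include_extras=True)
    visible: list[str] = []
    resolved: dict[str, Callable[..., Any]] = {}
    for name in inspect.signature(tool).parameters:
        hint = hints.get(name)
        metadata = get_args(hint)[1:] if get_origin(hint) is Annotated else ()
        markers = [m for m in metadata if isinstance(m, ToyResolve)]
        if markers:
            resolved[name] = markers[0].fn
        else:
            visible.append(name)
    return visible, resolved


def call_tool(tool: Callable[..., Any], arguments: dict[str, Any]) -> Any:
    visible, resolved = split_parameters(tool)
    # Keep only model-visible arguments; a client-sent value for a
    # resolved parameter is dropped here.
    kwargs = {name: arguments[name] for name in visible if name in arguments}
    for name, resolver in resolved.items():
        # Resolver parameters are matched to tool arguments by name.
        wanted = inspect.signature(resolver).parameters
        kwargs[name] = resolver(**{p: kwargs[p] for p in wanted})
    return tool(**kwargs)


INVENTORY = {"Dune": 7, "Neuromancer": 0}


def check_stock(title: str) -> int:
    return INVENTORY.get(title, 0)


def reserve_book(title: str, copies: Annotated[int, ToyResolve(check_stock)]) -> str:
    if copies == 0:
        return f"{title!r} is out of stock."
    return f"Reserved {title!r} ({copies - 1} copies left)."


schema_fields, resolvers = split_parameters(reserve_book)
ok = call_tool(reserve_book, {"title": "Dune"})
# The client tries to smuggle in its own value; the resolver's value wins.
spoofed = call_tool(reserve_book, {"title": "Neuromancer", "copies": 99})
```

`schema_fields` contains only `title`, so that is all a schema built from it would advertise. The spoofed call still reports the book as out of stock.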
## Dependencies of dependencies A resolver can declare its own dependencies, with the same annotation: ```python title="server.py" hl_lines="22 29-30" # docs_src/dependencies/tutorial002.py from typing import Annotated from pydantic import BaseModel from mcp.server import MCPServer from mcp.server.mcpserver import Resolve mcp = MCPServer("Bookshop") INVENTORY = {"Dune": 7, "Neuromancer": 0} class Stock(BaseModel): title: str copies: int async def check_stock(title: str) -> Stock: return Stock(title=title, copies=INVENTORY.get(title, 0)) async def estimate_delivery(stock: Annotated[Stock, Resolve(check_stock)]) -> str: return "tomorrow" if stock.copies > 0 else "in 2-3 weeks" @mcp.tool() async def order_book( title: str, stock: Annotated[Stock, Resolve(check_stock)], delivery: Annotated[str, Resolve(estimate_delivery)], ) -> str: """Order a book from the shop.""" if stock.copies == 0: return f"{title!r} is on backorder; it would arrive {delivery}." return f"Ordered {title!r}; it arrives {delivery}." ``` * `estimate_delivery` depends on `check_stock`. The SDK runs the graph in order: stock first, then the estimate, then the tool. * Both `stock` and `delivery` ultimately need `check_stock`, but it runs **once per call**. One inventory lookup, two consumers. * There is nothing to register. The graph *is* the annotations. !!! check Don't take once-per-call on faith. Put a `print` in `check_stock` and call `order_book` from the Inspector: one line per call. Two consumers, one lookup. The SDK analyses the graph when the tool is registered, not when it is called. A parameter it can't classify - not a `Context`, not a `Resolve(...)`, not a tool argument's name - and a cycle of resolvers both raise `InvalidSignature` at startup. Your server fails before a client ever connects, with the offending parameter or resolver named in the error. 
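Both guarantees, once per call and failure at registration, fit in a few lines of plain Python. The sketch below is deliberately simplified and is not how the SDK is written: dependencies are wired by parameter name through a `RESOLVERS` dict rather than through `Resolve(...)` annotations, everything is synchronous, and `ToyInvalidSignature`, `validate` and `resolve` are hypothetical names.

```python
import inspect
from typing import Any, Callable

Resolver = Callable[..., Any]


class ToyInvalidSignature(Exception):
    """Hypothetical stand-in for the SDK's InvalidSignature."""


def validate(name: str, resolvers: dict[str, Resolver], tool_args: set[str], path: tuple[str, ...] = ()) -> None:
    """Walk the graph up front; reject cycles and parameters nobody can supply."""
    if name in path:
        raise ToyInvalidSignature("cycle: " + " -> ".join((*path, name)))
    for param in inspect.signature(resolvers[name]).parameters:
        if param in tool_args:
            continue  # filled from the tool's own arguments
        if param not in resolvers:
            raise ToyInvalidSignature(f"cannot classify parameter {param!r} of {name!r}")
        validate(param, resolvers, tool_args, (*path, name))


def resolve(name: str, resolvers: dict[str, Resolver], arguments: dict[str, Any], cache: dict[str, Any]) -> Any:
    """Resolve `name`, running each resolver at most once per cache (one cache per call)."""
    if name not in cache:
        fn = resolvers[name]
        kwargs = {
            p: arguments[p] if p in arguments else resolve(p, resolvers, arguments, cache)
            for p in inspect.signature(fn).parameters
        }
        cache[name] = fn(**kwargs)
    return cache[name]


INVENTORY = {"Dune": 7, "Neuromancer": 0}
lookups: list[str] = []  # one entry per inventory lookup


def check_stock(title: str) -> int:
    lookups.append(title)
    return INVENTORY.get(title, 0)


def estimate_delivery(stock: int) -> str:
    return "tomorrow" if stock > 0 else "in 2-3 weeks"


RESOLVERS: dict[str, Resolver] = {"stock": check_stock, "delivery": estimate_delivery}

# "Registration": check the whole graph before serving anything.
for resolver_name in RESOLVERS:
    validate(resolver_name, RESOLVERS, tool_args={"title"})


def order_book(title: str, stock: int, delivery: str) -> str:
    if stock == 0:
        return f"{title!r} is on backorder; it would arrive {delivery}."
    return f"Ordered {title!r}; it arrives {delivery}."


def call_order_book(title: str) -> str:
    cache: dict[str, Any] = {}  # fresh per call, so nothing leaks between calls
    arguments = {"title": title}
    return order_book(
        title=title,
        stock=resolve("stock", RESOLVERS, arguments, cache),
        delivery=resolve("delivery", RESOLVERS, arguments, cache),
    )


result = call_order_book("Dune")
```

`stock` is needed twice, once by the tool and once by `estimate_delivery`, yet `lookups` holds a single entry after the call. A fresh cache on every call is what keeps the guarantee per call rather than per process.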
A resolver's parameters resolve exactly like a tool's: another `Resolve(...)`, the tool's own arguments by name, or the `Context` - `ctx.headers`, the lifespan object, all of it. !!! warning On HTTP transports the `Context` includes `ctx.headers`. Headers are **client-supplied input**, like any tool argument: fine for a locale or a feature flag, never an identity. Who the caller is comes from your authorization layer (**[Authorization](https://py.sdk.modelcontextprotocol.io/v2/advanced/authorization/index.md)**), not from a header anyone can set. !!! tip *Once per call* means exactly that: the next `tools/call` runs `check_stock` again. A resource that should outlive a request - a database pool, an HTTP client - belongs in **[Lifespan](https://py.sdk.modelcontextprotocol.io/v2/tutorial/lifespan/index.md)**, and a resolver can reach it through `ctx.request_context.lifespan_context`. ## Ask when you must A resolver doesn't have to know the answer. It can return `Elicit(message, Model)` and the SDK asks the user - the **[Elicitation](https://py.sdk.modelcontextprotocol.io/v2/tutorial/elicitation/index.md)** machinery, run for you: ```python title="server.py" hl_lines="26-32 39" # docs_src/dependencies/tutorial003.py from typing import Annotated from pydantic import BaseModel, Field from mcp.server import MCPServer from mcp.server.mcpserver import Elicit, Resolve mcp = MCPServer("Bookshop") INVENTORY = {"Dune": 7, "Neuromancer": 0} class Stock(BaseModel): title: str copies: int class Backorder(BaseModel): confirm: bool = Field(description="Order anyway and wait?") async def check_stock(title: str) -> Stock: return Stock(title=title, copies=INVENTORY.get(title, 0)) async def confirm_backorder( title: str, stock: Annotated[Stock, Resolve(check_stock)], ) -> Backorder | Elicit[Backorder]: if stock.copies > 0: return Backorder(confirm=True) # in stock: nothing to ask return Elicit(f"{title!r} is out of stock (2-3 weeks). 
Order anyway?", Backorder) @mcp.tool() async def order_book( title: str, stock: Annotated[Stock, Resolve(check_stock)], backorder: Annotated[Backorder, Resolve(confirm_backorder)], ) -> str: """Order a book from the shop.""" if not backorder.confirm: return "No order placed." if stock.copies == 0: return f"Backordered {title!r}; it ships in 2-3 weeks." return f"Ordered {title!r}." ``` * In stock: `confirm_backorder` returns a `Backorder` directly. **No question, no round-trip.** The user is only interrupted when their answer matters. * Out of stock: the SDK sends the elicitation, validates the answer against `Backorder`, and injects it. Your resolver never touches the protocol. * The tool reads `backorder.confirm` like any other argument. Answering **no** is still an answer: the elicitation is accepted with `confirm=False`, the tool runs, and no order is placed. Asking became a precondition, not plumbing in the tool body. And if the user won't answer at all - declines the question, or cancels it? !!! check Run `order_book` for `Neuromancer` and decline the question. With the annotation written as `Annotated[Backorder, Resolve(...)]` the tool body never runs; the call fails with an error result the model can read: ```text Error executing tool order_book: Resolver for parameter 'backorder' could not resolve: elicitation was decline ``` That's the right default for a precondition: no answer, no order. When declining is an outcome your tool wants to handle - skip the backorder but still suggest another title - annotate `ElicitationResult[Backorder]` instead and the tool receives the full accept/decline/cancel outcome to branch on. **[Elicitation](https://py.sdk.modelcontextprotocol.io/v2/tutorial/elicitation/index.md)** shows that form, and everything else about asking: the schema rules, the three answers, the client's side of the conversation. !!! 
info The framework picks the question's transport from the negotiated protocol version; the code above is identical on both. On **2026-07-28** and later the question rides inside a multi-round-trip `tools/call` - the server returns it, the client's `elicitation_callback` answers it, and the `Client` retries the call for you (**[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**). On **2025-11-25** and earlier it is a synchronous elicitation request mid-call. Each question is asked exactly once per call - a guarantee about the question, not the resolver. In the multi-round-trip form any resolver may run again whenever the call resumes after a question, so code before a `return Elicit(...)` runs on each of those rounds; the recorded answer then satisfies the repeated question without prompting the user again. A recorded answer is only ever consulted when the resolver asks; a resolver that answers *without* asking, like `check_stock`, always supplies its own computed value. Because each answer is matched back to its question, an eliciting resolver must derive its question deterministically from the tool's arguments and earlier answers. A per-call generated value (a `default_factory` id, a timestamp) is re-derived on each round and must not appear in a question the answer is meant to bind to. A question built from such volatile data makes every recorded answer look stale, so the server re-asks it on every round until the client's round limit ends the call. ## Recap * `Annotated[T, Resolve(fn)]` on a tool parameter: the SDK runs `fn` and injects its return value. * A resolved parameter is invisible to the model and cannot be supplied by a client. Values the model must not invent - prices, identities, permissions - belong here. * A resolver's parameters are resolved the same way: the `Context`, another `Resolve(...)`, or a tool argument by name. 
The graph runs each resolver at most once per round, however many consumers it has; each question is asked exactly once, and any resolver may run again when a call resumes after a question. * Bad graphs fail at registration with `InvalidSignature`, not mid-call. * Return `Elicit(message, Model)` to ask the user, only when you have to. Unwrapped annotations abort on decline; `ElicitationResult[T]` lets the tool branch. Next: what happens when your tool fails, and how to choose who finds out, in **[Handling errors](https://py.sdk.modelcontextprotocol.io/v2/tutorial/handling-errors/index.md)**. # Handling errors Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/handling-errors/ A tool can fail in two ways, and the SDK treats them very differently. Raise an ordinary exception and the **model** sees it. Raise `MCPError` and the **protocol** sees it. This chapter is about choosing. ## An error the model can fix Take a tool that looks something up, and let the lookup miss: ```python title="server.py" hl_lines="11-12" # docs_src/handling_errors/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Bookshop") CATALOG = {"Dune": "Frank Herbert", "Neuromancer": "William Gibson"} @mcp.tool() def get_author(title: str) -> str: """Look up the author of a book in the catalog.""" if title not in CATALOG: raise ValueError(f"No book titled {title!r} in the catalog.") return CATALOG[title] ``` There is nothing MCP about those two lines. `get_author` raises a plain `ValueError`, the way any Python function would. Call it with a title that isn't in the catalog and look at the result: ```python result.is_error # True result.content # [TextContent(text="Error executing tool get_author: No book titled 'Nothing' in the catalog.")] result.structured_content # None ``` * The request **succeeded**. There is a result; nothing was raised at the caller. 
* `is_error` is `True`, and your exception's message (prefixed with the tool name) is in `content`, exactly where the model reads.
* `structured_content` is `None`. A failed call has no return value to structure.

This is a **tool error**, and it is the default for *any* exception your tool raises. It is also almost always what you want.

The model is the one calling your tool. It picked the arguments. So a tool error is a turn in the conversation: the model reads *"No book titled 'Nothing' in the catalog."*, realises it guessed the title wrong, and calls again with a better one. You wrote one `raise` and got a self-correcting agent.

!!! tip
    Never `return` an error message from a tool. A returned string has `is_error=False`, so to the model (and to every client UI) it looks like the tool worked and that string was the answer. `raise`. The flag is the signal.

## An error the model cannot fix

Now swap `ValueError` for `MCPError`.

```python title="server.py" hl_lines="1 3 15"
# docs_src/handling_errors/tutorial002.py
from mcp_types import INVALID_PARAMS

from mcp import MCPError
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")

CATALOG = {"Dune": "Frank Herbert", "Neuromancer": "William Gibson"}


@mcp.tool()
def get_author(title: str) -> str:
    """Look up the author of a book in the catalog."""
    if title not in CATALOG:
        raise MCPError(code=INVALID_PARAMS, message=f"No book titled {title!r} in the catalog.")
    return CATALOG[title]
```

`MCPError` is the SDK's **protocol error**. It is the one exception the tool wrapper does *not* catch: it propagates, and the whole `tools/call` request fails with a JSON-RPC error instead of a result.

```json
{
  "code": -32602,
  "message": "No book titled 'Nothing' in the catalog."
}
```

* There is **no result**. No `content`, no `is_error`: nothing for the model to read.
* The **host** application gets the error instead, the same way it would if the tool didn't exist at all.
* `code`, `message`, and `data` arrive intact.
`INVALID_PARAMS` is `-32602`; `mcp_types` exports it and the other JSON-RPC error codes (`INVALID_REQUEST`, `INTERNAL_ERROR`, ...) as constants so you never type a magic number.

!!! check
    Same lookup, same miss, but now the call *raises* on the client side instead of returning:

    ```text
    mcp.shared.exceptions.MCPError: No book titled 'Nothing' in the catalog.
    ```

    The first version handed the model a sentence it could react to. This one hands it nothing. For `get_author` that is strictly worse, which is the point of the next section.

## Which one to raise

The two paths answer two different questions.

* **Raise any exception** for a failure of *execution*: the thing your tool tried to do didn't work. The model chose the call, so the model should see the consequence and get a chance to recover. A misspelled title, an upstream API that timed out, a row that doesn't exist: all tool errors.
* **Raise `MCPError`** when the *request itself* should be rejected: the client is missing a capability your tool depends on, the server isn't in a state to serve anyone, the caller skipped a required step. No retry from the model fixes any of those, so there is nothing to gain from handing it the message.

One question decides it: **could a smarter model have avoided this?** Yes -> ordinary exception. No -> `MCPError`.

By that test, the second version of `get_author` made the wrong choice: a better title fixes it, so the model deserved to see the message. It's there to show you the mechanism, not to recommend it.

!!! info
    `MCPError` lives at `from mcp import MCPError` and takes `code`, `message`, and an optional `data` payload. Whatever you put in them is what the client receives: the SDK forwards a raised `MCPError` verbatim instead of sanitising it.

## A resource that doesn't exist

Resources draw the same line, and ship one named exception for the common case.
```python title="server.py" hl_lines="2 13"
# docs_src/handling_errors/tutorial003.py
from mcp.server import MCPServer
from mcp.server.mcpserver.exceptions import ResourceNotFoundError

mcp = MCPServer("Bookshop")

CATALOG = {"Dune": "Frank Herbert", "Neuromancer": "William Gibson"}


@mcp.resource("books://{title}")
def book(title: str) -> str:
    """The catalog entry for one book."""
    if title not in CATALOG:
        raise ResourceNotFoundError(f"No book titled {title!r} in the catalog.")
    return f"{title} by {CATALOG[title]}"
```

`books://{title}` is a **template**. It matches *any* title, so "the URI is well-formed" and "the book exists" are two different questions, and only your function can answer the second one.

When it can't, raise `ResourceNotFoundError`. The SDK turns it into the protocol error the spec assigns to a missing resource: `-32602` with the requested URI in `data`, so the client knows *which* read failed.

```json
{
  "code": -32602,
  "message": "No book titled 'Nothing' in the catalog.",
  "data": {"uri": "books://Nothing"}
}
```

Notice there is no `is_error=True` half-result here. A resource read either returns contents or fails: resources have only the protocol path.

Templates and everything else about resources live in **[Resources](https://py.sdk.modelcontextprotocol.io/v2/tutorial/resources/index.md)**.

## Errors you never raise

A bad argument never reaches your function. Send `get_author` a `title` that isn't a string and the SDK rejects it against the input schema **before** calling you, as the same kind of `is_error=True` tool error the model can read and correct.

You saw this in **[Tools](https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/index.md)** with `Field(le=50)`. It means a whole class of `raise` statements you don't write: don't re-validate your own type hints.

!!! info
    Everything on this page is what a **client** sees, and the in-memory `Client` you'll write tests with sees exactly the same thing. Even `raise_exceptions=True` doesn't turn a tool error back into a traceback: by the time that flag could act, your exception is already the `is_error=True` result. Assert on the result. **[Testing](https://py.sdk.modelcontextprotocol.io/v2/tutorial/testing/index.md)** covers the pattern.

## Recap

* Raise **any exception** in a tool -> the call returns `is_error=True` with your message in `content`. The model reads it and can retry. This is the default.
* Raise **`MCPError`** -> the call itself fails with a JSON-RPC error. The model sees nothing; the host deals with it. `code`, `message`, and `data` survive intact.
* The deciding question: *could a smarter model have avoided this?* Yes -> exception. No -> `MCPError`.
* `ResourceNotFoundError` from a resource handler -> the protocol's `-32602`, with the URI in `data`.
* Bad arguments are rejected against the schema before your function runs; you don't `raise` for those.
* `from mcp import MCPError`; the error-code constants come from `mcp_types`.

Errors handled. Next: the things your server sets up once, before the first call ever arrives, the **[Lifespan](https://py.sdk.modelcontextprotocol.io/v2/tutorial/lifespan/index.md)**.

# Lifespan

Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/lifespan/

Most real servers hold something for their whole life: a database pool, an HTTP client, a loaded model. You don't want to build it on every call, and you do want to close it cleanly.

That's what the **lifespan** is for.

## A typed lifespan

A lifespan is an `@asynccontextmanager` that receives the server and `yield`s **one object**. Whatever you yield is available to every handler for as long as the server runs.
```python title="server.py" hl_lines="25-31 34 38 40"
# docs_src/lifespan/tutorial001.py
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager
from dataclasses import dataclass

from mcp.server import MCPServer
from mcp.server.mcpserver import Context


class Database:
    @classmethod
    async def connect(cls) -> "Database":
        return cls()

    async def disconnect(self) -> None: ...

    def query(self) -> int:
        return 3


@dataclass
class AppContext:
    db: Database


@asynccontextmanager
async def app_lifespan(server: MCPServer) -> AsyncIterator[AppContext]:
    db = await Database.connect()
    try:
        yield AppContext(db=db)
    finally:
        await db.disconnect()


mcp = MCPServer("Bookshop", lifespan=app_lifespan)


@mcp.tool()
def count_books(genre: str, ctx: Context[AppContext]) -> str:
    """Count the books in a genre."""
    db = ctx.request_context.lifespan_context.db
    return f"{db.query()} books in {genre!r}."
```

Read it bottom-up:

* `app_lifespan` connects the `Database` **before** the `yield` and disconnects it **after**, in a `finally`. That's startup and shutdown.
* It yields an `AppContext`, a plain dataclass holding the things you set up. One field today, ten tomorrow.
* `MCPServer("Bookshop", lifespan=app_lifespan)` is the whole wiring.
* Inside the tool, the yielded object is `ctx.request_context.lifespan_context`.

The lifespan runs **once**. It is entered when the server starts (before the first request) and exited when the server stops. Every request in between shares the same `AppContext`.

!!! info
    If you've written a FastAPI `lifespan`, you already know this. Same decorator, same `yield`, same `finally`.

### What the model sees

Nothing new. `ctx` is a **Context** parameter, so the SDK injects it and it never reaches the input schema:

```json
{
  "type": "object",
  "properties": {
    "genre": {"title": "Genre", "type": "string"}
  },
  "required": ["genre"],
  "title": "count_booksArguments"
}
```

`genre` is the only argument the model can pass.
The lifespan is your server's business.

`@mcp.resource()` and `@mcp.prompt()` functions can take a `ctx` parameter too, written as a bare `Context` for a reason the next section gets to. Everything `ctx` carries is in **[The Context](https://py.sdk.modelcontextprotocol.io/v2/tutorial/context/index.md)**.

### It really is typed

Look at the annotation again: `ctx: Context[AppContext]`.

That one type parameter is why `ctx.request_context.lifespan_context` **is** an `AppContext` to your type checker. `.db` autocompletes; `.dbb` is an error before you ever run the server.

Write a bare `Context` instead and `lifespan_context` is typed as `dict[str, Any]`: the type checker has no way to know what your lifespan yielded. The object is still there at runtime; you've lost the help.

!!! warning
    `Context[AppContext]` is a **tool-only** spelling. Put it on an `@mcp.resource()` or `@mcp.prompt()` function and every call to that handler fails. The client gets an error back, and the server log shows why:

    ```text
    Context is not available outside of a request
    ```

    In resources and prompts, write the bare `ctx: Context`. The object your lifespan yielded is still `ctx.request_context.lifespan_context` at runtime; you give up the type parameter, not the object.

!!! tip
    There is always a lifespan. If you don't pass one, the SDK's default yields an empty `dict`, so `ctx.request_context.lifespan_context` is `{}`, never `None`. That default is also why a bare `Context` types it as `dict[str, Any]`.

## Watch it happen

"Startup runs before the first request" is the kind of sentence you should not have to take on faith.

Strip the server down to the lifecycle: give `Database` a `connected` flag, flip it in `connect()` and `disconnect()`, and add a tool that reports it.
```python title="server.py" hl_lines="11 14 17 25 44"
# docs_src/lifespan/tutorial002.py
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager
from dataclasses import dataclass

from mcp.server import MCPServer
from mcp.server.mcpserver import Context


class Database:
    def __init__(self) -> None:
        self.connected = False

    async def connect(self) -> None:
        self.connected = True

    async def disconnect(self) -> None:
        self.connected = False


@dataclass
class AppContext:
    db: Database


database = Database()


@asynccontextmanager
async def app_lifespan(server: MCPServer) -> AsyncIterator[AppContext]:
    await database.connect()
    try:
        yield AppContext(db=database)
    finally:
        await database.disconnect()


mcp = MCPServer("Bookshop", lifespan=app_lifespan)


@mcp.tool()
def database_status(ctx: Context[AppContext]) -> str:
    """Report whether the database connection is up."""
    db = ctx.request_context.lifespan_context.db
    return "connected" if db.connected else "disconnected"
```

`database` lives at module level for one reason: so you can look at it from *outside* the server.

!!! check
    Three moments, three values:

    * Before the server starts, `database.connected` is `False`. Importing the module connected nothing.
    * While it's running, call `database_status` and the result is `"connected"`.
    * Stop the server and the `finally` block runs: `database.connected` is `False` again.

    The work happened exactly where you put it: around the `yield`, not at import time and not per request.

## Recap

* `lifespan=` takes an `@asynccontextmanager` that receives the server and `yield`s one object.
* Code before the `yield` is startup. The `finally` after it is shutdown.
* It runs once, around the whole life of the server, not per request.
* Whatever you `yield` is `ctx.request_context.lifespan_context` in every tool, resource, and prompt.
* `ctx: Context[AppContext]` makes that access fully typed in tools. Resources and prompts take the bare `Context`.
* No `lifespan=` means an empty `dict`, never `None`.

Next: tools that return more than text, **[Media](https://py.sdk.modelcontextprotocol.io/v2/tutorial/media/index.md)**.

# Media

Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/media/

Text is not the only thing a tool can return. The SDK ships two helpers for binary results (**`Image`** and **`Audio`**) and an **`Icon`** type for giving your server, tools, resources, and prompts a face in the client's UI.

## Returning an image

Annotate the return type as `Image` and return one:

```python title="server.py" hl_lines="14 16"
# docs_src/media/tutorial001.py
import base64

from mcp.server import MCPServer
from mcp.server.mcpserver import Image

mcp = MCPServer("Brand kit")

LOGO_PNG = base64.b64decode(
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGOQ9bsBAAHPAURf8l/aAAAAAElFTkSuQmCC"
)


@mcp.tool()
def logo() -> Image:
    """The brand logo as a PNG."""
    return Image(data=LOGO_PNG, format="png")
```

* `Image` takes exactly one of `data` (raw bytes) or `path` (a file to read).
* `format="png"` becomes the MIME type the client sees: `image/png`.
* The bytes here are a one-pixel placeholder so the file runs on its own. In a real server they come from Pillow, matplotlib, a headless browser, or anything else that hands you `bytes`.

`Image` is an SDK convenience, not a protocol type. On the wire your return value becomes an **`ImageContent`** block (your bytes base64-encoded, plus the MIME type):

```python
result.content
# [ImageContent(type="image", data="iVBORw0KGgoAAAANSUhEUg...", mime_type="image/png")]
result.structured_content
# None
```

Two things to notice:

* `data` is base64. You returned raw `bytes`; the SDK did the encoding.
* `structured_content` is `None`. An `Image` is content for the model to look at, not data for the application to parse: there is no output schema.
(Contrast **[Structured Output](https://py.sdk.modelcontextprotocol.io/v2/tutorial/structured-output/index.md)**, where the return annotation *is* the schema.)

!!! info
    `ImageContent` and `AudioContent` live in `mcp_types`, right next to the `TextContent` you met in **[Tools](https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/index.md)**. A tool result is a list of content blocks; `Image` and `Audio` are the shortest way to produce the two binary kinds.

### Try it

```console
uv run mcp dev server.py
```

Open the **Tools** tab and call `logo`. The result is not a string: it is an `image` content block, and the Inspector renders it as a picture. You returned `bytes`; everything between that and the pixels on screen was the SDK.

## Returning audio

`Audio` is the same shape:

```python title="server.py" hl_lines="21-24"
# docs_src/media/tutorial002.py
import base64

from mcp.server import MCPServer
from mcp.server.mcpserver import Audio, Image

mcp = MCPServer("Brand kit")

LOGO_PNG = base64.b64decode(
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGOQ9bsBAAHPAURf8l/aAAAAAElFTkSuQmCC"
)

CHIME_WAV = base64.b64decode("UklGRjQAAABXQVZFZm10IBAAAAABAAEAQB8AAIA+AAACABAAZGF0YRAAAAAAAAAAAAAAAAAAAAAAAAAA")


@mcp.tool()
def logo() -> Image:
    """The brand logo as a PNG."""
    return Image(data=LOGO_PNG, format="png")


@mcp.tool()
def chime() -> Audio:
    """The notification chime as a WAV."""
    return Audio(data=CHIME_WAV, format="wav")
```

The result is an **`AudioContent`** block:

```python
result.content
# [AudioContent(type="audio", data="UklGRjQAAABXQVZFZm1...", mime_type="audio/wav")]
result.structured_content
# None
```

Same deal: raw bytes in, base64 and a MIME type out, no output schema.

## Bytes or a file

Both helpers also accept `path=` instead of `data=`. The file is read when the result is built, and the MIME type is guessed from the suffix:

* `Image`: `.png`, `.jpg`, `.jpeg`, `.gif`, `.webp`.
* `Audio`: `.wav`, `.mp3`, `.ogg`, `.flac`, `.aac`, `.m4a`.
A suffix it doesn't recognise falls back to `application/octet-stream`.

!!! check
    With `data=` there is no filename, so there is nothing to guess from. Forget `format=` and the SDK falls back to a default: `image/png` for images, `audio/wav` for audio. Build an `Audio` from MP3 bytes that way and the client is told `mime_type="audio/wav"`, then faithfully fails to decode it. When you pass `data=`, pass `format=`.

## Icons

An `Icon` is metadata, not content. It doesn't carry the image; it points at one with a URI, and a client may fetch it and show it next to your server's name, a tool, a resource, or a prompt.

```python title="server.py" hl_lines="5-6 8 11 17"
# docs_src/media/tutorial003.py
from mcp_types import Icon

from mcp.server import MCPServer

LOGO = Icon(src="https://example.com/brand-kit.png", mime_type="image/png", sizes=["48x48"])
PALETTE = Icon(src="https://example.com/palette.svg", mime_type="image/svg+xml", sizes=["any"])

mcp = MCPServer("Brand kit", icons=[LOGO])


@mcp.tool(icons=[PALETTE])
def palette() -> list[str]:
    """The brand colour palette as hex codes."""
    return ["#1d4ed8", "#f59e0b", "#10b981"]


@mcp.resource("brand://guidelines", icons=[LOGO])
def guidelines() -> str:
    """How to use the brand assets."""
    return "Use the primary colour for calls to action."
```

* `src` is a URI the client can resolve: `https:`, or a `data:` URI if you want the icon embedded with no extra fetch.
* `mime_type` and `sizes` (`"48x48"`, or `"any"` for a scalable format) let the client pick the right one when you offer several.
* `theme="light"` or `theme="dark"` marks an icon for one colour scheme.

The same `icons=[...]` keyword is accepted by `MCPServer(...)`, `@mcp.tool()`, `@mcp.resource()`, and `@mcp.prompt()`.

### Where a client sees them

Icons travel with whatever they decorate.
The server's arrive during the handshake, on `client.server_info`:

```python
client.server_info.icons
# [Icon(src="https://example.com/brand-kit.png", mime_type="image/png", sizes=["48x48"])]
```

A tool's icons are on the `Tool` object from `tools/list`, a resource's on the `Resource` from `resources/list`, a prompt's on the `Prompt` from `prompts/list`. The field is always called `icons`.

## Recap

* Return an `Image` or `Audio` from a tool and the client receives an `ImageContent` / `AudioContent` block: your bytes base64-encoded, with a MIME type.
* Build one from in-memory `data=` plus an explicit `format=`, or from a `path=` and let the suffix decide.
* Media results carry no `structured_content` and no output schema.
* An `Icon` is a pointer: a `src` URI plus optional `mime_type`, `sizes`, and `theme`.
* `icons=[...]` works on the server, on tools, on resources, and on prompts, and clients find them on the matching objects.

That is everything a tool can put *into* a result. Helping the user fill in a prompt's or a resource template's arguments *before* anything runs is **[Completions](https://py.sdk.modelcontextprotocol.io/v2/tutorial/completions/index.md)**.

# Completions

Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/completions/

A client building a UI on top of your server wants to autocomplete argument values as the user types: language names, repository names, file paths. **Completions** are how your server supplies those suggestions.

## Something worth completing

Completions apply to exactly two things: the arguments of a **prompt** and the parameters of a **resource template**.
So start with a server that has one of each:

```python title="server.py" hl_lines="6 12"
# docs_src/completions/tutorial001.py
from mcp.server import MCPServer

mcp = MCPServer("GitHub Explorer")


@mcp.resource("github://repos/{owner}/{repo}")
def github_repo(owner: str, repo: str) -> str:
    """A GitHub repository."""
    return f"Repository: {owner}/{repo}"


@mcp.prompt()
def review_code(language: str, code: str) -> str:
    """Review a snippet of code."""
    return f"Review this {language} code:\n{code}"
```

Nothing here is about completions yet.

* `review_code` takes a `language`. A user shouldn't have to guess which spellings you accept.
* `github_repo` takes an `owner` and a `repo`. Free-text boxes for both make a bad form.

## The completion handler

Add **one** function decorated with `@mcp.completion()`:

```python title="server.py" hl_lines="22-30"
# docs_src/completions/tutorial002.py
from mcp_types import Completion, CompletionArgument, CompletionContext, PromptReference, ResourceTemplateReference

from mcp.server import MCPServer

mcp = MCPServer("GitHub Explorer")

LANGUAGES = ["go", "javascript", "python", "rust", "typescript"]


@mcp.resource("github://repos/{owner}/{repo}")
def github_repo(owner: str, repo: str) -> str:
    """A GitHub repository."""
    return f"Repository: {owner}/{repo}"


@mcp.prompt()
def review_code(language: str, code: str) -> str:
    """Review a snippet of code."""
    return f"Review this {language} code:\n{code}"


@mcp.completion()
async def handle_completion(
    ref: PromptReference | ResourceTemplateReference,
    argument: CompletionArgument,
    context: CompletionContext | None,
) -> Completion | None:
    if isinstance(ref, PromptReference) and argument.name == "language":
        return Completion(values=[lang for lang in LANGUAGES if lang.startswith(argument.value)])
    return None
```

* There is one handler per server. Every completion request lands here, and you branch on what's being completed.
* It must be `async def`: the SDK awaits it.
* It receives three arguments:
    * `ref`: *which* prompt or resource template, as a `PromptReference` or a `ResourceTemplateReference`. `isinstance` is how you tell them apart.
    * `argument`: `argument.name` is the argument being completed, `argument.value` is what the user has typed so far.
    * `context`: the arguments already resolved. Ignore it for now.
* You return a `Completion(values=[...])`, or `None` when you have nothing to offer.

!!! tip
    `argument.value` is the prefix the user has typed. The SDK does **not** filter for you: whatever you put in `values` is what the UI shows. The `startswith` is yours to write.

### Try it

Drive it with the in-memory `Client`, the same one you use in **[Testing](https://py.sdk.modelcontextprotocol.io/v2/tutorial/testing/index.md)**. Call `client.complete()` with `ref=PromptReference(name="review_code")` and `argument={"name": "language", "value": "py"}`:

```python
result.completion.values
# ['python']
```

* `ref` is the same reference type your handler receives.
* `argument` is a plain dict with exactly two keys, `name` and `value`.

Send an empty `value` and you get the whole list back. `lang.startswith("")` is true for every language:

```python
result.completion.values
# ['go', 'javascript', 'python', 'rust', 'typescript']
```

Ask about `code` (an argument your handler doesn't recognise) and it returns `None`, which the SDK turns into an empty list:

```python
result.completion.values
# []
```

`None` means *"no suggestions"*, never an error. A UI falls back to a plain text box.

## A capability you never declared

Registering the handler is the declaration. Connect a client and look:

```python
client.server_capabilities.completions
# CompletionsCapability()
```

You didn't list `completions` anywhere. The SDK saw the handler and advertised it during the handshake.

Every *optional* capability works this way: the handler is the declaration. (The three primitives are not optional: `MCPServer` always declares those, handlers or not.)
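
The "handler is the declaration" rule can be pictured with a toy registry. This is only a sketch of the idea, not the SDK's implementation: `ToyServer`, its `completion()` decorator, and `capabilities()` are hypothetical names invented here, and the advertised capabilities are modelled as a plain `dict`.

```python
from collections.abc import Callable
from typing import Any, Optional


class ToyServer:
    """Hypothetical stand-in for a server; this is NOT the SDK's MCPServer."""

    def __init__(self) -> None:
        # No completion handler until someone registers one.
        self._completion_handler: Optional[Callable[..., Any]] = None

    def completion(self) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Decorator factory: registering the handler is the only 'declaration'."""

        def register(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._completion_handler = fn
            return fn

        return register

    def capabilities(self) -> dict[str, Any]:
        """Derive what to advertise from what has been registered."""
        # The three primitives are always declared in this toy model.
        caps: dict[str, Any] = {"tools": {}, "resources": {}, "prompts": {}}
        # The optional capability appears only when a handler exists.
        if self._completion_handler is not None:
            caps["completions"] = {}
        return caps


bare = ToyServer()
with_handler = ToyServer()


@with_handler.completion()
def suggest(prefix: str) -> list[str]:
    """Toy handler: filter a fixed list by the typed prefix."""
    return [lang for lang in ["python", "rust"] if lang.startswith(prefix)]


print("completions" in bare.capabilities())          # False
print("completions" in with_handler.capabilities())  # True
```

The design point the sketch is meant to show: the capability is computed from the registry at the moment it is advertised, so there is no separate flag that could drift out of sync with the handlers that exist.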
!!! check
    Go back to the first `server.py` (the one with no handler) and ask it anyway. The call fails with a JSON-RPC error:

    ```text
    Method not found
    ```

    And `client.server_capabilities.completions` is `None`. That's the point of the capability: a well-behaved client checks it and never sends the request you can't answer.

## Dependent arguments

`github://repos/{owner}/{repo}` has two parameters, and the useful values for `repo` depend on which `owner` was picked first.

That's what `context` is for. It carries the arguments the user has **already resolved**:

```python title="server.py" hl_lines="9-12 35-39"
# docs_src/completions/tutorial003.py
from mcp_types import Completion, CompletionArgument, CompletionContext, PromptReference, ResourceTemplateReference

from mcp.server import MCPServer

mcp = MCPServer("GitHub Explorer")

LANGUAGES = ["go", "javascript", "python", "rust", "typescript"]

REPOS_BY_OWNER = {
    "modelcontextprotocol": ["python-sdk", "typescript-sdk", "inspector"],
    "pydantic": ["pydantic", "pydantic-ai", "logfire"],
}


@mcp.resource("github://repos/{owner}/{repo}")
def github_repo(owner: str, repo: str) -> str:
    """A GitHub repository."""
    return f"Repository: {owner}/{repo}"


@mcp.prompt()
def review_code(language: str, code: str) -> str:
    """Review a snippet of code."""
    return f"Review this {language} code:\n{code}"


@mcp.completion()
async def handle_completion(
    ref: PromptReference | ResourceTemplateReference,
    argument: CompletionArgument,
    context: CompletionContext | None,
) -> Completion | None:
    if isinstance(ref, PromptReference) and argument.name == "language":
        return Completion(values=[lang for lang in LANGUAGES if lang.startswith(argument.value)])
    if isinstance(ref, ResourceTemplateReference) and argument.name == "repo":
        if context is None or context.arguments is None:
            return None
        repos = REPOS_BY_OWNER.get(context.arguments.get("owner", ""), [])
        return Completion(values=[repo for repo in repos if repo.startswith(argument.value)])
    return None
```

* The new branch fires for the template's `repo` parameter.
* `context.arguments` is a `dict[str, str] | None` of the values picked so far (here, `owner`).
* No `owner` yet means no sensible suggestions, so the handler returns `None`.

The client sends those resolved values with `context_arguments=`. This time `ref` is a `ResourceTemplateReference(uri="github://repos/{owner}/{repo}")`. Ask for `repo` with an empty `value` and pass `context_arguments={"owner": "modelcontextprotocol"}`:

```python
result.completion.values
# ['python-sdk', 'typescript-sdk', 'inspector']
```

Drop `context_arguments=` and the same call returns `[]`. The handler can't know which repos to offer until it knows the owner.

!!! info
    `Completion` also takes `total=` and `has_more=`. Set them when `values` is a slice of a longer list, so a UI can show *"and 200 more"*. Most handlers never need them.

## Recap

* Completions are suggestions for **prompt arguments** and **resource template parameters**. Nothing else.
* `@mcp.completion()` registers the one handler. It's `async def (ref, argument, context) -> Completion | None`.
* Branch on `isinstance(ref, ...)` and on `argument.name`. Filter by `argument.value` yourself.
* `None` becomes an empty list. It is never an error.
* `context.arguments` holds the already-resolved values; the client supplies them as `context_arguments=`.
* The `completions` capability appears the moment you register the handler. Without it, the request is `Method not found`.

Suggestions help *before* a tool runs. To ask the user a question in the *middle* of one, you want **[Elicitation](https://py.sdk.modelcontextprotocol.io/v2/tutorial/elicitation/index.md)**.

# Elicitation

Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/elicitation/

A tool that is halfway through its job and missing one answer doesn't have to fail. **Elicitation** lets it ask.
In the middle of a tool call the server sends the client a question, the client puts it to the user, and the answer comes back into the same function call.

There are two modes:

* **Form mode**: you need a value (a confirmation, a date, a quantity). You describe the fields, the client renders the form.
* **URL mode**: you need the user to go somewhere else (an OAuth consent screen, a payment page). Nothing they do there passes through the protocol.

## Ask with a form

`ctx.elicit()` takes a message and a Pydantic model:

```python title="server.py" hl_lines="9-11 20-23 25"
# docs_src/elicitation/tutorial001.py
from pydantic import BaseModel, Field

from mcp.server import MCPServer
from mcp.server.mcpserver import Context

mcp = MCPServer("Bistro")


class AlternativeDate(BaseModel):
    accept_alternative: bool = Field(description="Try another date?")
    date: str = Field(default="2025-12-26", description="Alternative date (YYYY-MM-DD)")


@mcp.tool()
async def book_table(date: str, party_size: int, ctx: Context) -> str:
    """Book a table at the bistro."""
    if date != "2025-12-25":
        return f"Booked a table for {party_size} on {date}."

    result = await ctx.elicit(
        message=f"No tables for {party_size} on {date}. Would you like to try another date?",
        schema=AlternativeDate,
    )
    if result.action == "accept" and result.data.accept_alternative:
        return await book_table(result.data.date, party_size, ctx)
    return "No booking made."
```

* The **`Context`** parameter is what gives you `ctx.elicit`; any tool can take one. That object has its own chapter: **[The Context](https://py.sdk.modelcontextprotocol.io/v2/tutorial/context/index.md)**.
* `AlternativeDate` is the **schema** of the answer you want.
* The tool is `async def`. It has to be: it stops in the middle and waits for a person.
* On any other date the tool returns straight away. It only asks when it has to.
* The date the user accepts goes back through `book_table` itself.
An answer is input like any other: an alternative that is also fully booked gets asked about again, not confirmed blind.

### What the client receives

The client gets your message and, next to it, a JSON Schema generated from the model:

```json
{
  "properties": {
    "accept_alternative": {
      "description": "Try another date?",
      "title": "Accept Alternative",
      "type": "boolean"
    },
    "date": {
      "default": "2025-12-26",
      "description": "Alternative date (YYYY-MM-DD)",
      "title": "Date",
      "type": "string"
    }
  },
  "required": ["accept_alternative"],
  "title": "AlternativeDate",
  "type": "object"
}
```

That schema is the form. `Field(description=...)` is the label; a default pre-fills the input and makes the field optional. It's the same Pydantic-to-JSON-Schema machinery you already used for a tool's arguments in **[Tools](https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/index.md)**.

!!! warning
    An elicitation schema is not as expressive as a tool's input schema. Flat, primitive fields only: `str`, `int`, `float`, `bool`, or a `Literal` of strings (it becomes an `enum`). Put a model inside the model and `ctx.elicit` raises before anything is sent to the client:

    ```text
    TypeError: Elicitation schema field 'address' rendered as {'$ref': '#/$defs/Address'}, which is not a valid PrimitiveSchemaDefinition
    ```

    You are interrupting a person mid-task. If the answer needs nesting, it should have been an argument to the tool.

### The three answers

`result.action` tells you what the user did, and there are exactly three possibilities:

* `"accept"`: they submitted the form. `result.data` is an `AlternativeDate` instance, already validated.
* `"decline"`: they said no.
* `"cancel"`: they dismissed the question without choosing.

`result.data` only exists on `"accept"`, which is why the example checks `result.action` first. Your type checker enforces the order: after `result.action == "accept"`, `result.data` is an `AlternativeDate`; before it, there is no `.data` at all.

A refusal is not an error.
The tool decides what declining means (here, no booking) and answers the model normally. !!! tip The answer is validated against your model before your code sees it. A client that sends `"maybe"` for a `bool` doesn't corrupt your booking: the call fails with a schema-mismatch error, your `if` never runs. ## Ask before the tool runs The booking tool above weaves the question into its own body. When the question is really a *precondition* - confirm before deleting, authenticate before acting - you can lift it out of the tool into a **resolver** and let the framework ask for you. A parameter annotated `Annotated[T, Resolve(fn)]` is filled by running `fn` before the tool body. The resolver returns the value directly when it already knows it, or returns `Elicit(...)` to have the framework ask: ```python title="server.py" hl_lines="24-30 35-36" # docs_src/elicitation/tutorial004.py from typing import Annotated from pydantic import BaseModel from mcp.server import MCPServer from mcp.server.mcpserver import ( AcceptedElicitation, CancelledElicitation, DeclinedElicitation, Elicit, ElicitationResult, Resolve, ) mcp = MCPServer("Files") _FOLDERS: dict[str, list[str]] = {"/tmp/empty": [], "/tmp/project": ["main.py", "README.md"]} class Confirm(BaseModel): ok: bool async def confirm_delete(path: str) -> Confirm | Elicit[Confirm]: """Resolver: ask for confirmation only when the folder is not empty.""" file_count = len(_FOLDERS.get(path, [])) if file_count == 0: return Confirm(ok=True) # nothing to confirm, no round-trip to the client return Elicit(f"{path} has {file_count} file(s). 
Delete anyway?", Confirm) @mcp.tool() async def delete_folder( path: str, confirm: Annotated[ElicitationResult[Confirm], Resolve(confirm_delete)], ) -> str: """Delete a folder, asking for confirmation when it is not empty.""" match confirm: case AcceptedElicitation(data=Confirm(ok=True)): _FOLDERS.pop(path, None) return f"deleted {path}" case AcceptedElicitation(): return "kept the folder" case DeclinedElicitation(): return "declined: folder not deleted" case CancelledElicitation(): return "cancelled: folder not deleted" ``` * `confirm_delete` reads the tool's own `path` argument by name, lists the folder, and **only elicits when it must** - an empty folder resolves to `Confirm(ok=True)` with no round-trip to the client. * `delete_folder` annotates `ElicitationResult[Confirm]`, so the framework injects the whole outcome and the tool `match`es every case: accept-and-confirm, accept-but-keep (`ok=False`), decline, cancel. * The `confirm` parameter never appears in the tool's input schema - the client supplies `path`, the resolver supplies `confirm`. Annotate the unwrapped model (`Annotated[Confirm, Resolve(confirm_delete)]`) instead when the tool doesn't need to branch: it receives the model on accept and the call aborts with an error on decline or cancel. Asking is only one thing a resolver can do. The general mechanism - dependencies that compute without asking, dependencies of dependencies, what the model can and cannot supply - is the **[Dependencies](https://py.sdk.modelcontextprotocol.io/v2/tutorial/dependencies/index.md)** chapter. ## Send the user to a URL Some things must not go through the model or the client: credentials, card numbers, OAuth consent. 
For those you don't ask for data; you ask the user to go somewhere: ```python title="server.py" hl_lines="10-14 23" # docs_src/elicitation/tutorial002.py from mcp.server import MCPServer from mcp.server.mcpserver import Context mcp = MCPServer("Bistro") @mcp.tool() async def pay_deposit(booking_id: str, ctx: Context) -> str: """Take the deposit that confirms a booking.""" result = await ctx.elicit_url( message="A 20 EUR deposit confirms your booking.", url=f"https://pay.example.com/deposit/{booking_id}", elicitation_id=f"deposit-{booking_id}", ) if result.action == "accept": return "Complete the payment in your browser." return "No deposit taken. The booking expires in one hour." @mcp.tool() async def confirm_deposit(booking_id: str, ctx: Context) -> str: """Record a payment reported by the payment provider.""" await ctx.session.send_elicit_complete(f"deposit-{booking_id}") return f"Deposit received for booking {booking_id}." ``` * `ctx.elicit_url()` takes the message, the **URL** to visit, and an `elicitation_id` you choose: any string that identifies this elicitation within your server. * The result has an action and nothing else. `"accept"` means the user agreed to open the URL, **not** that they finished what's on the other side. * The payment happens out of band, between the user's browser and your payment provider. No content ever comes back through MCP. Look at the second tool. When your server learns the out-of-band flow finished (a webhook, a poll; here it's modelled as a second tool), `ctx.session.send_elicit_complete(...)` sends `notifications/elicitation/complete` with the same `elicitation_id`. That is how the client knows it can stop showing *"waiting for payment..."*. Without it, the client can only guess. ## The client side Servers ask. 
Clients answer by passing an **`elicitation_callback`** to `Client(...)`: ```python title="client.py" hl_lines="7-8 19" # docs_src/elicitation/tutorial003.py from mcp_types import ElicitRequestParams, ElicitRequestURLParams, ElicitResult from mcp import Client from mcp.client import ClientRequestContext async def handle_elicitation(context: ClientRequestContext, params: ElicitRequestParams) -> ElicitResult: if isinstance(params, ElicitRequestURLParams): print(f"Open this link to continue: {params.url}") return ElicitResult(action="accept") print(params.message) return ElicitResult(action="accept", content={"accept_alternative": True, "date": "2025-12-27"}) async def main() -> None: async with Client( "http://127.0.0.1:8000/mcp", mode="legacy", elicitation_callback=handle_elicitation, ) as client: result = await client.call_tool("book_table", {"date": "2025-12-25", "party_size": 2}) print(result.content) ``` * One callback handles both modes. `params` is a union of `ElicitRequestFormParams` and `ElicitRequestURLParams`; `isinstance` is the branch. * For a URL, you show `params.url` to the user and return the action they chose. Never any `content`. * For a form, a real application renders `params.requested_schema` and returns the user's input as `content`. This one always says yes with a canned answer, which is exactly the callback you want in a test. * Passing the callback is also the **capability declaration**: it's how the server learns this client can be asked. The other things a client can answer for a server live in **[Client callbacks](https://py.sdk.modelcontextprotocol.io/v2/client/callbacks/index.md)**. !!! info Elicitation is a request from the *server* to the *client*, and those only exist on a classic-handshake session, which is why this client passes `mode="legacy"`. 
On a **2026-07-28** connection a tool asks by *returning* the question from the call instead; that flow is **[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**. ### Try it Start the form-mode `server.py` (the first one on this page) on Streamable HTTP (**[Running your server](https://py.sdk.modelcontextprotocol.io/v2/run/index.md)** has the one-liner), then run the client's `main()` and ask `book_table` for Christmas day. The callback prints the question it was sent: ```text No tables for 2 on 2025-12-25. Would you like to try another date? ``` It answers with `{"accept_alternative": True, "date": "2025-12-27"}`, and the tool, which has been waiting inside `await ctx.elicit(...)` this whole time, finishes the booking: ```text Booked a table for 2 on 2025-12-27. ``` Now swap in the URL-mode `server.py` and point the same `main()` at `pay_deposit`: the same callback takes the other branch, prints the payment link, and the tool comes back with *"Complete the payment in your browser."* One round trip, mid-call, in both directions. !!! check Now remove `elicitation_callback=` from the `Client` and call `book_table` for Christmas day again. The whole call fails with a protocol error: ```text Elicitation not supported ``` A client that registered no callback never declared the `elicitation` capability, so there is nobody to ask. Your tool didn't get a `"decline"`; it got an exception. Design for it: every elicitation needs a sensible answer to "what if I can't ask?". ## Recap * `await ctx.elicit(message, schema=Model)` asks mid-call; your tool resumes with the answer. * The schema is a flat Pydantic model: primitive fields only, validated on the way back. * `result.action` is `"accept"`, `"decline"` or `"cancel"`; `result.data` exists only on accept. 
* `await ctx.elicit_url(message, url, elicitation_id)` is for everything that must not pass through the model; `ctx.session.send_elicit_complete(elicitation_id)` says the out-of-band part is done. * The client answers with one `elicitation_callback`, branching on the params type; registering it is what declares the capability. * On a 2026-07-28 connection the server returns the question instead of pushing it; the same callback is fed by **[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**. A tool that can ask is good. A tool that says how far along it is (**[Progress](https://py.sdk.modelcontextprotocol.io/v2/tutorial/progress/index.md)**) is next. # Progress Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/progress/ A tool that takes thirty seconds and says nothing for thirty seconds looks broken. **Progress notifications** fix that. The tool reports how far along it is; the client decides what to draw with it: a bar, a spinner, a log line. ## Report it from the tool Take a **`Context`** parameter and call `report_progress`: ```python title="server.py" hl_lines="8 11" # docs_src/progress/tutorial001.py from mcp.server import MCPServer from mcp.server.mcpserver import Context mcp = MCPServer("Bookshop") @mcp.tool() async def import_catalog(urls: list[str], ctx: Context) -> str: """Import book records from a list of catalog URLs.""" for done, url in enumerate(urls, start=1): await ctx.report_progress(done, total=len(urls), message=f"Imported {url}") return f"Imported {len(urls)} records." ``` Three arguments, and you decide what they mean: * `progress`: how far you are. The spec requires it to **increase** with every report; never repeat a value or go backwards. * `total`: how much there is in total, if you know. Optional. * `message`: one human-readable line about *this* step. Optional. 
`ctx` is injected because of its type hint and the model never sees it: `import_catalog`'s input schema has a single property, `urls`. **[The Context](https://py.sdk.modelcontextprotocol.io/v2/tutorial/context/index.md)** chapter is all about that object; progress is one of the things it gives you. ## Listen for it from the client The client opts in **per call**, by passing `progress_callback=` to `call_tool`: ```python title="client.py" hl_lines="7 16" import anyio from mcp import Client from server import mcp async def show(progress: float, total: float | None, message: str | None) -> None: print(f"{message} ({progress}/{total})") async def main() -> None: async with Client(mcp) as client: result = await client.call_tool( "import_catalog", {"urls": ["https://example.com/a.json", "https://example.com/b.json"]}, progress_callback=show, ) print(result.structured_content) anyio.run(main) ``` The callback is an `async` function taking exactly what the server reported: `progress`, `total`, `message`. !!! info `Client(mcp)` connects straight to the server object, in memory, the same client the **[Testing](https://py.sdk.modelcontextprotocol.io/v2/tutorial/testing/index.md)** chapter is built on. `progress_callback` is the same parameter whatever transport the `Client` uses; the *timing* you are about to see is the in-memory connection's. It runs your callback inline, so every report lands before `call_tool` returns. Over a real transport the notifications race the result, and a slow callback can still be running after `call_tool` has returned. ### Try it Put `client.py` next to `server.py` and run it: ```console python client.py ``` ```text Imported https://example.com/a.json (1/2) Imported https://example.com/b.json (2/2) {'result': 'Imported 2 records.'} ``` Every `await ctx.report_progress(...)` on the server became one call to `show` on the client, in order, and both lines printed **before** `call_tool` returned. 
Progress is not bundled into the result; it streams while the tool is still working. !!! warning `progress_callback` belongs to the **call**, not the `Client`. There is no constructor argument for it, because different calls want different callbacks: one drives a download bar, the next one a log line. !!! check Now delete `progress_callback=show` and run it again: ```text {'result': 'Imported 2 records.'} ``` No error, no warning, same result. `report_progress` is a **no-op when the caller didn't ask for progress**, so you report unconditionally and never have to wonder whether anyone is listening. ## When you don't know the total `total` is for when you know the denominator. Often you don't: you're draining a feed, walking a cursor, downloading something with no length header. Leave it out: ```python title="server.py" hl_lines="20" # docs_src/progress/tutorial002.py from collections.abc import AsyncIterator from mcp.server import MCPServer from mcp.server.mcpserver import Context mcp = MCPServer("Bookshop") async def fetch_records(feed_url: str) -> AsyncIterator[str]: for title in ("Dune", "Neuromancer", "Hyperion"): yield f"{feed_url}#{title}" @mcp.tool() async def import_feed(feed_url: str, ctx: Context) -> str: """Import every record a catalog feed yields.""" imported = 0 async for record in fetch_records(feed_url): imported += 1 await ctx.report_progress(imported, message=f"Imported {record}") return f"Imported {imported} records." ``` The callback receives `total=None`. A client can still show *activity* ("3 imported so far...") but it can't show a percentage. Don't invent a total to get a prettier bar. !!! tip `progress` doesn't have to count anything in particular. Bytes, rows, pages: pick the unit the user would recognise, and only promise a `total` you can keep. ## Recap * `await ctx.report_progress(progress, total=None, message=None)` from any tool that takes a `Context`. 
* The client passes `progress_callback=` to `call_tool`: per call, never on the `Client`. * The callback is `async (progress, total, message) -> None` and fires while the tool is still running. * No callback on the call means `report_progress` does nothing. Report unconditionally. * Omit `total` when you don't know it; the callback gets `None`. Progress is what a running tool shows the *user*. The lines it logs for *you*, the person operating the server, are a different channel: **[Logging](https://py.sdk.modelcontextprotocol.io/v2/tutorial/logging/index.md)** is next. # Logging Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/logging/ Log from a tool the way you log from any other Python function: with the standard library. MCP has a protocol-level **logging capability**: a server could push its log messages to the client as notifications, through methods on the `Context` object. The 2026-07-28 revision of the spec **deprecates that capability and does not replace it**, so this tutorial doesn't teach it. The full list of what's deprecated and what to do instead is in **[Deprecated features](https://py.sdk.modelcontextprotocol.io/v2/advanced/deprecated/index.md)**. What you do instead is what you do in every other Python program: the standard library. ## A tool that logs ```python title="server.py" hl_lines="1 5 13" # docs_src/logging/tutorial001.py import logging from mcp.server import MCPServer logger = logging.getLogger(__name__) mcp = MCPServer("Bookshop") @mcp.tool() def search_books(query: str) -> str: """Search the catalog by title or author.""" logger.info("Searching for %r", query) return f"Found 3 books matching {query!r}." ``` * `logging.getLogger(__name__)` gives you a logger named after your module. Create it once, at the top. * Inside the tool you call `logger.info(...)` like in any other function. Nothing to inject, nothing to `await`, nothing MCP-specific. !!! 
check Call the tool and look at the whole result: ```python result.content # [TextContent(text="Found 3 books matching 'dune'.")] result.structured_content # {'result': "Found 3 books matching 'dune'."} ``` The log line is nowhere in it. Logging is for **you**, the person operating the server. The model never sees it. If the model should read something, `return` it. ## Where it goes For a **stdio** server this question matters more than usual. The host launched your server as a subprocess and is reading MCP messages from its **stdout**. Standard error is yours. The standard library already does the right thing: log output goes to `sys.stderr` by default. Your `logger.info(...)` lines land in the terminal (or wherever the host collects the subprocess's stderr), and the protocol stream stays clean. !!! tip Never `print()` in a stdio server. `print` writes to **stdout**, and stdout *is* the wire: one stray line and the client is trying to parse it as JSON-RPC. `logger.debug("got here")` is the same one line of effort and goes to the right place. ## The level You don't have to call `logging.basicConfig()` yourself. Constructing an `MCPServer` already did, with a handler pointed at standard error, at the level you pass as `log_level=`, so `MCPServer("Bookshop", log_level="DEBUG")` is all it takes to see your `logger.debug(...)` lines. The default is `"INFO"`. `logging.basicConfig()` never replaces handlers that already exist. If you configure logging yourself before creating the server, your configuration wins. ## Try it Run the server with the MCP Inspector: ```console uv run mcp dev server.py ``` Call `search_books` from the **Tools** tab. The Inspector shows you the result: only the return value. The line ```text Searching for 'dune' ``` went to standard error: the terminal, not the wire. !!! info If what you actually want is *tracing* (every request, how long it took, whether it failed), you don't want log lines, you want spans. 
Your server already emits them: the SDK traces every message with OpenTelemetry out of the box. See **[OpenTelemetry](https://py.sdk.modelcontextprotocol.io/v2/advanced/opentelemetry/index.md)**. ## Recap * The MCP protocol's logging capability is deprecated by the 2026-07-28 spec and not replaced. Don't build on it. * `logger = logging.getLogger(__name__)` at module level, `logger.info(...)` in the tool. That's the whole pattern. * Log output never reaches the model. Only the value you `return` does. * Standard error is yours; stdout belongs to the protocol. Never `print()` in a stdio server. * `MCPServer(..., log_level="DEBUG")` sets the level, and a logging configuration you made first is left alone. Next: the in-memory client that has been running every example on these pages, and how to point it at your own server, in **[Testing](https://py.sdk.modelcontextprotocol.io/v2/tutorial/testing/index.md)**. # Testing Source: https://py.sdk.modelcontextprotocol.io/v2/tutorial/testing/ The Python SDK ships a `Client` class with an **in-memory transport**: pass it your server object and it connects to it directly. No subprocess. No port. No transport at all. It's the same idea as FastAPI's `TestClient`. ## Basic usage Let's assume you have a simple server with a single tool: ```python title="server.py" # docs_src/testing/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Calculator") @mcp.tool() def add(a: int, b: int) -> int: """Add two numbers.""" return a + b ``` To run the test below you'll need two extra (development) dependencies: === "uv" ```bash uv add --dev pytest inline-snapshot ``` === "pip" ```bash pip install pytest inline-snapshot ``` !!! info These docs assume you already know [`pytest`](https://docs.pytest.org/en/stable/). [`inline-snapshot`](https://15r10nk.github.io/inline-snapshot/latest/) is what the test below uses to assert on the whole result object in one line. It records the output of a test as the `snapshot(...)` literal you see. 
If you'd rather not use it, drop the import and assert on the fields you care about (`result.content[0].text == "3"`) like in any other test. Now the test: ```python title="test_server.py" import pytest from inline_snapshot import snapshot from mcp import Client from mcp_types import CallToolResult, TextContent from server import mcp @pytest.fixture def anyio_backend(): # (1)! return "asyncio" @pytest.fixture async def client(): # (2)! async with Client(mcp, raise_exceptions=True) as c: yield c @pytest.mark.anyio async def test_call_add_tool(client: Client): result = await client.call_tool("add", {"a": 1, "b": 2}) assert result == snapshot( CallToolResult( content=[TextContent(type="text", text="3")], structured_content={"result": 3}, ) ) ``` 1. If you are using `trio`, return `"trio"` instead. See the [anyio documentation](https://anyio.readthedocs.io/en/stable/testing.html#specifying-the-backends-to-run-on) for the details. 2. The fixture yields a connected client. Every test that takes `client` gets a fresh in-memory connection to the same server. There you go! You can now extend your tests to cover more scenarios. ## Why `raise_exceptions=True`? Two different things can go wrong, and this flag only touches one of them. An exception inside one of **your tools** is not a protocol failure. It becomes a normal result with `is_error=True`, and the model reads the message. `raise_exceptions` doesn't change that: with or without it, `call_tool` returns the same `is_error=True` result. There's a whole chapter on it: **[Handling errors](https://py.sdk.modelcontextprotocol.io/v2/tutorial/handling-errors/index.md)**. A failure **outside** a tool body is different. On the connection `Client(mcp)` gives you, the server sanitises it into a generic `"Internal server error"` before the client sees it. You should never leak the details of an unexpected crash to a remote caller. 
In a test that is exactly what you *don't* want, and it is what `raise_exceptions=True` changes: your test sees the real message instead of the sanitised one. Leave it on in tests. It has no meaning in production code.

## In-process by default

!!! note
    `Client(mcp)` connects in-process and is **era-neutral** by default: it probes the server and picks the appropriate protocol path. Pin `mode="legacy"` if your test exercises legacy-specific semantics (sampling or elicitation push, `message_handler`), and drop `raise_exceptions=True` there: a legacy connection never sanitises in the first place, and the flag re-raises the failure inside the server task instead of in your test.

That one line is also why the rest of this tutorial can promise you that its examples work: every example file is exercised by the SDK's own test suite through exactly this client. You're using the same tool the SDK uses on itself.

The tutorial ends here. Putting your tested server in front of a real client, over a real transport, is **[Running your server](https://py.sdk.modelcontextprotocol.io/v2/run/index.md)**.

# Running your server

Source: https://py.sdk.modelcontextprotocol.io/v2/run/

`mcp.run()` starts the server. The only decision you make is the **transport**: how the bytes between your server and its client actually move.

## Pick a transport

| Transport | What it is | When |
|---|---|---|
| `stdio` | The host launches your file as a subprocess and speaks over its stdin and stdout. | Local servers. The default. |
| `streamable-http` | A real HTTP server listening on a port. | Anything you deploy. |
| `sse` | The older HTTP transport. | You don't. |

!!! warning
    SSE was superseded by Streamable HTTP in the 2025-03-26 protocol revision. `mcp.run(transport="sse")` still works, with its own `sse_path=` and `message_path=` options, but it exists for clients that haven't moved. Don't build anything new on it.
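
One way to keep the transport decision out of the code is to read it from the environment. The sketch below is an illustration under stated assumptions, not an SDK feature: the `MCP_TRANSPORT` and `PORT` variable names and the `run_kwargs` helper are invented here. It builds keyword arguments that could be passed to `mcp.run(...)`, and it refuses `sse` on purpose.

```python
from collections.abc import Mapping


def run_kwargs(env: Mapping[str, str]) -> dict[str, object]:
    """Hypothetical helper: map environment variables to arguments for mcp.run()."""
    transport = env.get("MCP_TRANSPORT", "stdio")  # assumed variable name
    if transport == "stdio":
        return {}  # run() with no argument already means stdio
    if transport == "streamable-http":
        # 8000 is the documented default port; PORT is an assumed variable name.
        return {"transport": "streamable-http", "port": int(env.get("PORT", "8000"))}
    # Anything else, including the superseded "sse", is rejected in this sketch.
    raise ValueError(f"unsupported transport for a new deployment: {transport!r}")


print(run_kwargs({}))
print(run_kwargs({"MCP_TRANSPORT": "streamable-http", "PORT": "3001"}))
```

In a server file this could be used as `mcp.run(**run_kwargs(os.environ))`, so the same file serves over stdio locally and over a port when deployed.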
## `mcp.run()` ```python title="server.py" hl_lines="12-13" # docs_src/run/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Bookshop") @mcp.tool() def search_books(query: str) -> str: """Search the catalog by title or author.""" return f"Found 3 books matching {query!r}." if __name__ == "__main__": mcp.run() ``` * `run()` is synchronous. It blocks for the life of the server. * With no argument, the transport is `stdio`. * It sits under `if __name__ == "__main__":` because everything that loads your server (`mcp dev`, `mcp run`, `mcp install`, your tests) **imports** this file. The guard keeps an import from turning into a running server. ### stdio There is nothing to configure. The host starts your file as a child process, writes requests to its stdin, and reads responses from its stdout. Run it yourself and you see the consequence: ```console python server.py ``` Nothing prints, and it doesn't return. It is waiting on stdin for a host to speak first. That also means stdout **is the wire**. A stray `print()` corrupts the stream; the `logging` module writes to stderr and is the right tool. That story is in **[Logging](https://py.sdk.modelcontextprotocol.io/v2/tutorial/logging/index.md)**. ### Try it ```console uv run mcp dev server.py ``` The Inspector does exactly what a real host does: it launches `server.py` as a subprocess and connects to it over stdio. You never gave it a port. There isn't one. ## Streamable HTTP To put the same server on a port instead, name the transport (and its options) in `run()`: ```python title="server.py" hl_lines="13" # docs_src/run/tutorial002.py from mcp.server import MCPServer mcp = MCPServer("Bookshop") @mcp.tool() def search_books(query: str) -> str: """Search the catalog by title or author.""" return f"Found 3 books matching {query!r}." if __name__ == "__main__": mcp.run(transport="streamable-http", port=3001) ``` That one line builds a Starlette app and serves it with uvicorn. 
Clients connect to `http://127.0.0.1:3001/mcp`.

Each transport has its own keyword arguments, all on `run()`:

* `host` / `port`: where to listen. Defaults `127.0.0.1` and `8000`.
* `streamable_http_path`: where the MCP endpoint lives. Default `/mcp`.
* `json_response=True`: answer with plain JSON instead of an SSE stream.
* `stateless_http=True`: a fresh transport per request, no session tracking.
* `event_store`, `retry_interval`, `transport_security`: resumability and DNS-rebinding protection. They can wait until you deploy somewhere other than localhost; **[ASGI](https://py.sdk.modelcontextprotocol.io/v2/run/asgi/index.md)** covers `transport_security`.

!!! warning
    Transport options go to `run()`, **not** to `MCPServer(...)`. The constructor describes what your server *is*: name, version, instructions. `run()` describes how it is served. Get it backwards and Python answers before MCP is even involved:

    ```text
    TypeError: MCPServer.__init__() got an unexpected keyword argument 'port'
    ```

`run()` is the short road. The moment you need more (your server mounted inside an existing app, two servers in one process, CORS for browser clients), you build the ASGI app yourself and hand it to any ASGI host. That is **[ASGI](https://py.sdk.modelcontextprotocol.io/v2/run/asgi/index.md)**.

## Server settings

A couple of things about running are not about the transport. They are constructor arguments:

```python title="server.py" hl_lines="3"
# docs_src/run/tutorial003.py
from mcp.server import MCPServer

mcp = MCPServer("Bookshop", log_level="DEBUG")


@mcp.tool()
def search_books(query: str) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r}."


if __name__ == "__main__":
    mcp.run()
```

* `log_level`: handed to `logging.basicConfig()` the moment `MCPServer(...)` is constructed. That configures the **root** logger, so it sets the level for your own loggers too, not just the SDK's. Default `"INFO"`.
* `debug`: forwarded to the Starlette app that the HTTP transports build. Default `False`. Both land on `mcp.settings`, which you can read back at runtime. ## The `mcp` command The `[cli]` extra installs a small command-line tool around all of this. `mcp dev` runs your server under the **MCP Inspector**: ```console uv run mcp dev server.py uv run mcp dev server.py --with pandas --with numpy uv run mcp dev server.py --with-editable . ``` `--with` adds packages to the environment it builds; `--with-editable` installs your own package into it. It needs `npx` on your `PATH`: the Inspector is a Node.js app. `mcp run` imports the file, finds the server object (a module-level `mcp`, `server`, or `app`), and calls `run()` on it: ```console uv run mcp run server.py uv run mcp run server.py:bookshop ``` The `:` suffix names the object when it isn't called `mcp`, `server`, or `app`. Your `if __name__ == "__main__":` block never executes here: `mcp run` calls `run()` itself, and the only option it forwards is `--transport`. `mcp install` registers the server with **Claude Desktop**, so the app launches it for you: ```console uv run mcp install server.py --name "Bookshop" uv run mcp install server.py -v API_KEY=abc123 -f .env ``` `-v KEY=VALUE` and `-f .env` record environment variables in that entry. Claude Desktop starts your server in its own process. Your shell's environment is not there. `mcp version` prints the installed SDK version. !!! tip `mcp dev` and `mcp run` only understand `MCPServer`. If you build with the low-level `Server`, you run it yourself. See **[The low-level Server](https://py.sdk.modelcontextprotocol.io/v2/advanced/low-level-server/index.md)**. ## Recap * A **transport** is how bytes reach your server: `stdio` for a local subprocess, `streamable-http` for a port. SSE is superseded. * `mcp.run()` picks the transport. With no argument it is `stdio`, and it blocks. * Every transport option (`host`, `port`, `streamable_http_path`, ...) 
is an argument to `run()`, never to `MCPServer(...)`.
* Keep `run()` under `if __name__ == "__main__":`. Everything that loads your server imports the file first.
* `log_level=` and `debug=` are constructor arguments; they land on `mcp.settings`.
* `mcp dev` for the Inspector, `mcp run` to execute a file, `mcp install` for Claude Desktop, `mcp version` for the version.
* The transport never changes what your server *is*: all three files on this page expose the identical tool.

When `run()` itself is the limit (your server inside an app that already exists), the next step is **[ASGI](https://py.sdk.modelcontextprotocol.io/v2/run/asgi/index.md)**.

# ASGI

Source: https://py.sdk.modelcontextprotocol.io/v2/run/asgi/

`mcp.run("streamable-http")` starts a web server for you. Sometimes you don't want that: your MCP server is one piece of a larger web application, or you already have an ASGI deployment.

For that, `mcp.streamable_http_app()` returns a **Starlette application**. A Starlette app is an ASGI app, so anything that hosts ASGI (uvicorn, Hypercorn, another Starlette, FastAPI) can host your MCP server.

## The app

```python title="server.py" hl_lines="12"
# docs_src/asgi/tutorial001.py
from mcp.server import MCPServer

mcp = MCPServer("Notes")


@mcp.tool()
def add_note(text: str) -> str:
    """Save a note."""
    return f"Saved: {text}"


app = mcp.streamable_http_app()
```

`app` is an ordinary ASGI application. Hand it to any ASGI server:

```console
uvicorn server:app
```

The MCP endpoint is at `/mcp`, so a client connects to `http://127.0.0.1:8000/mcp`.

The app already carries two things:

* One route, `/mcp`: the Streamable HTTP endpoint.
* A **lifespan** that starts `mcp.session_manager`, the object that owns every live session's background work.

Run the app on its own (`uvicorn server:app`) and you never think about either.

!!! tip
    `streamable_http_app()` takes the same keyword arguments as `mcp.run("streamable-http", ...)`, minus `port`: the port belongs to whatever serves the app. `host` is still accepted but binds nothing here; the next section is what it actually controls. **[Running your server](https://py.sdk.modelcontextprotocol.io/v2/run/index.md)** covers the options themselves.

    `mcp.sse_app()` does the same for the superseded SSE transport.

## Localhost only, until you say otherwise

`streamable_http_app()` cannot know which hostname it will be served behind, so it assumes the safest answer: localhost. With no `transport_security=`, the app switches on **DNS-rebinding protection** and accepts a request only if its `Host` header is `127.0.0.1:<port>`, `localhost:<port>`, or `[::1]:<port>`, and only if its `Origin` header, when there is one, is the `http://` form of the same.

For `uvicorn server:app` on your machine that is exactly what you want: it stops a malicious web page from driving your local server through a DNS name it rebound to `127.0.0.1`.

It also means that **deployed behind a real hostname, the app rejects every request until you configure it**. The check runs before MCP does, the client sees only a generic transport error, and the reason is a single warning in the *server's* log:

```text
421 Misdirected Request   Invalid Host header     the Host is not in the allowlist
403 Forbidden             Invalid Origin header   the Origin is not in the allowlist
```

`transport_security=` is how you configure it. Allowlist what you actually serve:

```python
from mcp.server.transport_security import TransportSecuritySettings

security = TransportSecuritySettings(
    allowed_hosts=["mcp.example.com", "mcp.example.com:*"],
    allowed_origins=["https://app.example.com"],
)
app = mcp.streamable_http_app(transport_security=security)
```

* `allowed_hosts` entries are exact strings: `"mcp.example.com"` matches a bare `Host` header and `"mcp.example.com:*"` matches any port. List both.
* `allowed_origins` only matters for browsers (nothing else sends `Origin`). It is the server-side twin of the CORS configuration below.
* Behind a reverse proxy that already controls the `Host` header, switching the check off is the honest configuration: `TransportSecuritySettings(enable_dns_rebinding_protection=False)`.
* Passing a non-localhost `host=` (for example `host="mcp.example.com"`) does **not** allowlist that hostname. It only stops the localhost default from arming the protection, which leaves every Host and Origin accepted. Say what you mean with `transport_security=` instead.

## Mounting it

The moment the MCP server is *part* of a bigger application, you put the app inside a `Mount`. And the moment you do that, the lifespan becomes your problem:

```python title="server.py" hl_lines="18-21 25-26"
# docs_src/asgi/tutorial002.py
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager

from starlette.applications import Starlette
from starlette.routing import Mount

from mcp.server import MCPServer

mcp = MCPServer("Notes")


@mcp.tool()
def add_note(text: str) -> str:
    """Save a note."""
    return f"Saved: {text}"


@asynccontextmanager
async def lifespan(app: Starlette) -> AsyncIterator[None]:
    async with mcp.session_manager.run():
        yield


app = Starlette(
    routes=[Mount("/", app=mcp.streamable_http_app())],
    lifespan=lifespan,
)
```

* `Mount("/", ...)` plus the default `/mcp` path keeps the endpoint at `/mcp`. Starlette tries routes in order and `Mount("/")` matches **every** path, so your own routes go *before* it in the list. Anything after it is unreachable.
* The `lifespan` function enters `mcp.session_manager.run()` for the lifetime of the **host** app. This is the line everyone forgets.
* `mcp.session_manager` only exists *after* `streamable_http_app()` has been called. That is why the routes are built at module level and the manager is only touched inside the lifespan.
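
The ordering rule in the first bullet is easy to get backwards, so here is a toy model of it in plain Python. It is **not** Starlette's code and imports nothing from it: `resolve` and the route tuples are invented for this sketch, and the prefix test is cruder than a real `Mount`. It only mirrors the rule as stated above (first match wins, and a mount at `/` matches everything):

```python
from __future__ import annotations

# Toy model of first-match routing (hypothetical helper, not Starlette itself).
# A "route" matches one exact path; a "mount" matches every path under a prefix.


def resolve(routes: list[tuple[str, str, str]], path: str) -> str | None:
    """Return the name of the first entry in `routes` that matches `path`."""
    for kind, pattern, name in routes:
        if kind == "route" and path == pattern:
            return name
        if kind == "mount" and path.startswith(pattern):
            return name
    return None


# Your own route listed before the catch-all mount: both are reachable.
health_first = [("route", "/health", "health"), ("mount", "/", "mcp_app")]
# Same entries, mount first: the mount swallows /health.
mount_first = [("mount", "/", "mcp_app"), ("route", "/health", "health")]

print(resolve(health_first, "/health"))  # health
print(resolve(health_first, "/mcp"))     # mcp_app
print(resolve(mount_first, "/health"))   # mcp_app
```

In the second list the `/health` entry can never be chosen, which is the "unreachable" case the bullet warns about.
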

Starlette's `Host` route works the same way: swap `Mount("/", ...)` for `Host("mcp.example.com", ...)` to route by hostname instead of by path. The lifespan rule does not change, and neither does the transport-security one. A `Host("mcp.example.com", ...)` route only ever receives requests addressed to that hostname, so without `allowed_hosts=["mcp.example.com", "mcp.example.com:*"]` it answers every one of them with a `421`.

!!! warning "The host app owns the lifespan"
    `streamable_http_app()` wires `session_manager.run()` into the lifespan of the Starlette it returns, but **a mounted sub-application's lifespan never runs**. Mount the app and that built-in lifespan is dead code. Whichever app sits at the top of your ASGI stack must enter `mcp.session_manager.run()` in its own lifespan.

!!! check
    Delete the `lifespan=lifespan` line and start the server. It starts. The route resolves. Then the first request to `/mcp` fails with:

    ```text
    RuntimeError: Task group is not initialized. Make sure to use run().
    ```

    Nothing starts the session manager except its `run()`.

## Two servers, one app

Each `MCPServer` is its own app with its own session manager.
Mount as many as you like; enter every manager from the one host lifespan:

```python title="server.py" hl_lines="27-30 35-36"
# docs_src/asgi/tutorial003.py
from collections.abc import AsyncIterator
from contextlib import AsyncExitStack, asynccontextmanager

from starlette.applications import Starlette
from starlette.routing import Mount

from mcp.server import MCPServer

notes = MCPServer("Notes")
tasks = MCPServer("Tasks")


@notes.tool()
def add_note(text: str) -> str:
    """Save a note."""
    return f"Saved: {text}"


@tasks.tool()
def add_task(title: str) -> str:
    """Create a task."""
    return f"Created: {title}"


@asynccontextmanager
async def lifespan(app: Starlette) -> AsyncIterator[None]:
    async with AsyncExitStack() as stack:
        await stack.enter_async_context(notes.session_manager.run())
        await stack.enter_async_context(tasks.session_manager.run())
        yield


app = Starlette(
    routes=[
        Mount("/notes", app=notes.streamable_http_app()),
        Mount("/tasks", app=tasks.streamable_http_app()),
    ],
    lifespan=lifespan,
)
```

* `AsyncExitStack` enters both managers; they start together and shut down in reverse order.
* The endpoints are `/notes/mcp` and `/tasks/mcp`: the mount prefix plus the default path.

## Changing the path

That trailing `/mcp` is `streamable_http_path`.
Set it to `"/"` and the mount prefix becomes the whole public path:

```python title="server.py" hl_lines="25"
# docs_src/asgi/tutorial004.py
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager

from starlette.applications import Starlette
from starlette.routing import Mount

from mcp.server import MCPServer

mcp = MCPServer("Notes")


@mcp.tool()
def add_note(text: str) -> str:
    """Save a note."""
    return f"Saved: {text}"


@asynccontextmanager
async def lifespan(app: Starlette) -> AsyncIterator[None]:
    async with mcp.session_manager.run():
        yield


app = Starlette(
    routes=[Mount("/notes", app=mcp.streamable_http_app(streamable_http_path="/"))],
    lifespan=lifespan,
)
```

Now clients connect to `/notes`, not `/notes/mcp`.

## CORS for browser clients

A browser-based client needs two permissions from you: to **send** its MCP request headers, and to **read** the one MCP sends back. Both are CORS configuration on the host app, and the transport-security allowlist above has to agree with it:

```python title="server.py" hl_lines="27-30 33 35-49"
# docs_src/asgi/tutorial005.py
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager

from starlette.applications import Starlette
from starlette.middleware import Middleware
from starlette.middleware.cors import CORSMiddleware
from starlette.routing import Mount

from mcp.server import MCPServer
from mcp.server.transport_security import TransportSecuritySettings

mcp = MCPServer("Notes")


@mcp.tool()
def add_note(text: str) -> str:
    """Save a note."""
    return f"Saved: {text}"


@asynccontextmanager
async def lifespan(app: Starlette) -> AsyncIterator[None]:
    async with mcp.session_manager.run():
        yield


security = TransportSecuritySettings(
    allowed_hosts=["mcp.example.com", "mcp.example.com:*"],
    allowed_origins=["https://app.example.com"],
)

app = Starlette(
    routes=[Mount("/", app=mcp.streamable_http_app(transport_security=security))],
    middleware=[
        Middleware(
            CORSMiddleware,
            allow_origins=["https://app.example.com"],
            allow_methods=["GET", "POST", "DELETE"],
            allow_headers=[
                "Authorization",
                "Content-Type",
                "Last-Event-ID",
                "Mcp-Method",
                "Mcp-Name",
                "Mcp-Protocol-Version",
                "Mcp-Session-Id",
            ],
            expose_headers=["Mcp-Session-Id"],
        )
    ],
    lifespan=lifespan,
)
```

* `allow_headers` is the half everyone forgets. A browser **preflights** every MCP request, because `Content-Type: application/json` and the `Mcp-*` request headers are not on the CORS safelist, and a header the preflight doesn't grant is a request the browser never sends. (`allow_headers=["*"]` also works: Starlette answers a preflight with whatever it asked for.)
* `expose_headers=["Mcp-Session-Id"]` is the read half. Streamable HTTP returns the session ID in that response header, and browsers hide response headers from JavaScript unless CORS exposes them by name. Without it the client can never make its second request.
* `allow_origins` is your decision, not MCP's. Be precise, and mirror it in `allowed_origins=` above: the browser enforces CORS, but the server checks `Origin` itself, and an origin the transport doesn't trust gets a `403` even after a clean preflight.
* `allow_methods` lists the three methods Streamable HTTP uses: `POST` to send messages, `GET` to open the server-to-client stream, `DELETE` to end the session.

## Custom routes

`@mcp.custom_route()` registers a plain HTTP endpoint on the same app, for the things every deployed service needs that have nothing to do with MCP: a health check, an OAuth callback.
```python title="server.py" hl_lines="15-17"
# docs_src/asgi/tutorial006.py
from starlette.requests import Request
from starlette.responses import JSONResponse, Response

from mcp.server import MCPServer

mcp = MCPServer("Notes")


@mcp.tool()
def add_note(text: str) -> str:
    """Save a note."""
    return f"Saved: {text}"


@mcp.custom_route("/health", methods=["GET"])
async def health(request: Request) -> Response:
    return JSONResponse({"status": "ok"})


app = mcp.streamable_http_app()
```

* The handler is plain Starlette: an `async` function from `Request` to `Response`.
* `streamable_http_app()` picks up every custom route. `app.routes` is now `/mcp` and `/health`.
* `GET /health` answers `{"status": "ok"}` with no MCP in sight: no session, no handshake.

!!! warning
    Custom routes are **never authenticated**, even when the rest of the server is. That is deliberate: health checks and OAuth callbacks have to be reachable before any token exists. Don't put anything private behind one.

## Recap

* `mcp.streamable_http_app()` returns a Starlette app with one route, `/mcp`. Any ASGI server can run it.
* Out of the box the app answers only requests addressed to localhost. Deploying behind a real hostname means passing `transport_security=TransportSecuritySettings(...)`.
* `Mount` (or `Host`) puts it inside a bigger Starlette or FastAPI app.
* **Mounting disables the built-in lifespan.** The host app's lifespan must enter `mcp.session_manager.run()`, or the first request fails.
* Several servers in one app means several mounts and one lifespan that enters every session manager.
* `streamable_http_path="/"` moves the endpoint to the mount prefix itself.
* Browser clients need CORS: `allow_headers` for the `Mcp-*` request headers, `expose_headers=["Mcp-Session-Id"]` for the response.
* `@mcp.custom_route()` adds plain, unauthenticated HTTP endpoints next to `/mcp`.

Once the server is reachable at a real URL, **[The Client](https://py.sdk.modelcontextprotocol.io/v2/client/index.md)** connects to it with that URL instead of a server object.

# The Client

Source: https://py.sdk.modelcontextprotocol.io/v2/client/

A **`Client`** is how a Python program talks to an MCP server.

It is one object with one lifecycle: construct it, enter `async with`, call methods. Every protocol verb (list the tools, call one, read a resource, render a prompt) is an `async` method on it that returns a typed result.

## Your first client

```python title="client.py" hl_lines="14-18"
# docs_src/client/tutorial001.py
from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop", instructions="Search the catalog before recommending a book.")


@mcp.tool()
def search_books(query: str) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r}."


async def main() -> None:
    async with Client(mcp) as client:
        print(client.server_info)
        print(client.server_capabilities)
        print(client.protocol_version)
        print(client.instructions)
```

The server at the top is only there so you have something to connect to. The client is the five highlighted lines.

* `Client(mcp)` is given the **server object itself**. That is the in-memory transport: no subprocess, no port, no HTTP. It is how every example in this chapter, and every test you write, connects.
* `async with` is the **lifecycle**. Entering it connects and negotiates; leaving it disconnects. There is no `connect()` / `close()` pair, and a `Client` cannot be reused after the block ends.
* Inside the block the connection facts are already there as plain properties.

### What you can pass to `Client`

`Client` takes one positional argument and resolves the transport from its type:

* An `MCPServer` (or low-level `Server`) instance: connected **in-process**.
* A URL string (`Client("http://localhost:8000/mcp")`): Streamable HTTP, the production path.
* A **transport**: anything you can `async with ... as (read, write)`, such as `stdio_client(...)` wrapping a subprocess.

Everything else on this page is identical across all three. Headers, subprocesses, timeouts, and the `Transport` protocol get their own chapter: **[Client transports](https://py.sdk.modelcontextprotocol.io/v2/client/transports/index.md)**.

### What's on a connected client

Four read-only properties, populated the moment you enter the block:

* `client.server_info`: the server's identity. `server_info.name` here is `"Bookshop"`, `server_info.version` is whatever the server reports.
* `client.server_capabilities`: what the server can do (`tools`, `resources`, `prompts`, `completions`, ...). A capability the server doesn't have is `None`.
* `client.protocol_version`: the protocol version the two sides agreed on. Here it is `"2026-07-28"`.
* `client.instructions`: the server's `instructions=` string, or `None` if it didn't set one.

You never picked a protocol version. By default the `Client` probes the server and falls back to the classic handshake on older ones, so one client works against any era of server. When you need to control that, **[Protocol versions](https://py.sdk.modelcontextprotocol.io/v2/client/protocol-versions/index.md)** has the whole story.

!!! tip
    `client.session` is the underlying `ClientSession`, the low-level escape hatch. You won't need it for anything on this page.

## Listing tools

```python title="client.py" hl_lines="15-20"
# docs_src/client/tutorial002.py
from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


@mcp.tool(title="Search the catalog")
def search_books(query: str, limit: int = 10) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r} (showing up to {limit})."


async def main() -> None:
    async with Client(mcp) as client:
        result = await client.list_tools()
        for tool in result.tools:
            print(tool.name)
            print(tool.title)
            print(tool.description)
            print(tool.input_schema)
```

`list_tools()` returns a `ListToolsResult`; the tools are in `.tools`. Each one is the complete definition a host would hand to a model:

```python
tool.name  # 'search_books'
tool.title  # 'Search the catalog'
tool.description  # 'Search the catalog by title or author.'
```

and `tool.input_schema` is the JSON Schema the server derived from the function's type hints:

```json
{
  "type": "object",
  "properties": {
    "query": {"title": "Query", "type": "string"},
    "limit": {"default": 10, "title": "Limit", "type": "integer"}
  },
  "required": ["query"],
  "title": "search_booksArguments"
}
```

That schema is everything a UI needs to render an argument form, and everything a model needs to produce valid arguments.

!!! tip
    `title` is optional, so a UI showing tools to a human has to pick: the `title` if there is one, the `name` if not. `from mcp.shared.metadata_utils import get_display_name` does exactly that, for tools, resources, resource templates and prompts.

## Calling a tool

`call_tool(name, arguments)` runs the tool and gives you back a `CallToolResult`.
```python title="client.py" hl_lines="26-33"
# docs_src/client/tutorial003.py
from mcp_types import TextContent
from pydantic import BaseModel

from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


class Book(BaseModel):
    title: str
    author: str
    year: int


@mcp.tool()
def lookup_book(title: str) -> Book:
    """Look up a book by its exact title."""
    if title != "Dune":
        raise ValueError(f"No book titled {title!r} in the catalog.")
    return Book(title="Dune", author="Frank Herbert", year=1965)


async def main() -> None:
    async with Client(mcp) as client:
        result = await client.call_tool("lookup_book", {"title": "Dune"})

        for block in result.content:
            if isinstance(block, TextContent):
                print(block.text)

        print(result.structured_content)
        print(result.is_error)
```

The server's `lookup_book` returns a Pydantic `Book`. Here is what the client sees:

```python
result.content  # [TextContent(type='text', text='{\n "title": "Dune",\n "author": "Frank Herbert",\n "year": 1965\n}')]
result.structured_content  # {'title': 'Dune', 'author': 'Frank Herbert', 'year': 1965}
result.is_error  # False
```

One return value, three things to read. Each has a different consumer.

### `content`: what the model reads

`content` is a `list` of **content blocks**, and a content block is a union: `TextContent`, `ImageContent`, `AudioContent`, `ResourceLink`, or `EmbeddedResource`. A tool can return several, of different kinds.

That is why `main` narrows with `isinstance(block, TextContent)` before touching `block.text`. Notice there is no `.text` outside the `isinstance`: the type checker won't allow it, because `ImageContent` has `.data`, not `.text`. The union is honest about what a tool is allowed to send you; your code should be too.

### `structured_content`: what your application reads

`structured_content` is the tool's return value as JSON, matching the tool's declared `output_schema`. No string parsing, no guessing.
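
To make the difference concrete, here is a small sketch that uses only the literal values printed above. No MCP connection is involved, and the variable names are made up for the illustration:

```python
import json

# Literal values copied from the lookup_book("Dune") output shown above.
content_text = '{\n "title": "Dune",\n "author": "Frank Herbert",\n "year": 1965\n}'
structured_content = {"title": "Dune", "author": "Frank Herbert", "year": 1965}

# Application code reads the structured half directly...
year = structured_content["year"]

# ...whereas the text block has to be parsed before it yields the same data.
parsed = json.loads(content_text)

print(year)                          # 1965
print(parsed == structured_content)  # True
```

The text block happens to be JSON here, but nothing obliges a tool's text to be parseable, which is why application code should prefer the structured half.
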

When both are present they say the same thing twice on purpose: `content` is for a model, `structured_content` is for code. Where the structured half comes from, and how to control it, is the **[Structured Output](https://py.sdk.modelcontextprotocol.io/v2/tutorial/structured-output/index.md)** chapter.

### `is_error`: whether the tool failed

A tool that raises does **not** raise in your client. It comes back as an ordinary result with `is_error=True`.

!!! check
    Ask `lookup_book` for `"Solaris"` (a title that isn't in the catalog) and the function raises `ValueError`. The call still returns normally:

    ```python
    result.is_error  # True
    result.content  # [TextContent(type='text', text="Error executing tool lookup_book: No book titled 'Solaris' in the catalog.")]
    result.structured_content  # None
    ```

    The exception's message landed in `content`, where the **model** can read it and try again. That is deliberate: a tool error is part of the conversation, not a crash. Always look at `is_error` before you trust `structured_content`.

!!! warning
    `is_error=True` covers more than your own `raise`. Ask for a tool the server doesn't even have (`call_tool("does_not_exist", {})`) and nothing raises. You get the same shape back, `is_error=True` with `Unknown tool: does_not_exist` in `content`. A `Client` method raises `MCPError` only when the server answers with a JSON-RPC **error** instead of a result, and **[Handling errors](https://py.sdk.modelcontextprotocol.io/v2/tutorial/handling-errors/index.md)** covers when a server produces which.

## Resources

The resource verbs come in pairs: two ways to list, one way to read.
```python title="client.py" hl_lines="23-32"
# docs_src/client/tutorial004.py
from mcp_types import TextResourceContents

from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


@mcp.resource("catalog://genres")
def genres() -> list[str]:
    """The genres the catalog is organised by."""
    return ["fiction", "non-fiction", "poetry"]


@mcp.resource("catalog://genres/{genre}")
def books_in_genre(genre: str) -> str:
    """Every title we stock in one genre."""
    return f"3 books filed under {genre}."


async def main() -> None:
    async with Client(mcp) as client:
        listed = await client.list_resources()
        print([resource.uri for resource in listed.resources])

        templates = await client.list_resource_templates()
        print([template.uri_template for template in templates.resource_templates])

        result = await client.read_resource("catalog://genres/poetry")
        for contents in result.contents:
            if isinstance(contents, TextResourceContents):
                print(contents.text)
```

* `list_resources()` returns the **concrete** resources, the ones with a fixed URI. Here: `['catalog://genres']`.
* `list_resource_templates()` returns the **parameterised** ones. Here: `['catalog://genres/{genre}']`. They are two different lists because a template isn't readable until you fill it in.
* `read_resource(uri)` takes a plain `str` URI and works on both: pass `"catalog://genres/poetry"` and the server matches it to the template.

`read_resource` returns `contents`, a list of `TextResourceContents` or `BlobResourceContents`. Same idea as tool content: narrow with `isinstance`, then read `.text` (or `.blob`).

A client can also **subscribe** to a resource and be told when it changes: `subscribe_resource(uri)` and `unsubscribe_resource(uri)`, same shape as everything else here. `MCPServer` doesn't implement that half. It says so up front (`server_capabilities.resources.subscribe` is `False`) and answers the request with an `MCPError`: `-32601`, *Method not found*.
A server that does support subscriptions is built on the low-level `Server` (**[The low-level Server](https://py.sdk.modelcontextprotocol.io/v2/advanced/low-level-server/index.md)**).

## Prompts

```python title="client.py" hl_lines="15-20"
# docs_src/client/tutorial005.py
from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


@mcp.prompt(title="Recommend a book")
def recommend(genre: str) -> str:
    """Ask for a recommendation in a genre."""
    return f"Recommend one {genre} book from the catalog and say why."


async def main() -> None:
    async with Client(mcp) as client:
        listed = await client.list_prompts()
        print(listed.prompts)

        result = await client.get_prompt("recommend", {"genre": "poetry"})
        for message in result.messages:
            print(message.role, message.content)
```

`list_prompts()` tells you what the server offers and what each prompt needs:

```python
prompt.name  # 'recommend'
prompt.title  # 'Recommend a book'
prompt.arguments  # [PromptArgument(name='genre', required=True)]
```

`get_prompt(name, arguments)` renders it. The arguments dict is `str -> str`: prompt arguments are always strings. The result is `messages`, a list of `PromptMessage`, each with a `role` and a `content` block:

```python
message.role  # 'user'
message.content  # TextContent(type='text', text='Recommend one poetry book from the catalog and say why.')
```

A host hands those messages straight to the model. That is the whole feature.

## Completions

A server with a completion handler can autocomplete prompt and resource-template arguments as the user types.
```python title="client.py" hl_lines="28-32"
# docs_src/client/tutorial006.py
from mcp_types import Completion, CompletionArgument, CompletionContext, PromptReference, ResourceTemplateReference

from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")

GENRES = ["fiction", "non-fiction", "poetry"]


@mcp.prompt()
def recommend(genre: str) -> str:
    """Ask for a recommendation in a genre."""
    return f"Recommend one {genre} book from the catalog and say why."


@mcp.completion()
async def complete_genre(
    ref: PromptReference | ResourceTemplateReference,
    argument: CompletionArgument,
    context: CompletionContext | None,
) -> Completion | None:
    return Completion(values=[genre for genre in GENRES if genre.startswith(argument.value)])


async def main() -> None:
    async with Client(mcp) as client:
        result = await client.complete(
            ref=PromptReference(type="ref/prompt", name="recommend"),
            argument={"name": "genre", "value": "p"},
        )
        print(result.completion.values)
```

* `ref` says *which* prompt or template you're filling in: a `PromptReference` or a `ResourceTemplateReference`.
* `argument` is `{"name": ..., "value": ...}`: the argument and what the user has typed so far.

The answer is in `result.completion.values`. Type `"p"` and the server comes back with `['poetry']`.

The server side, and how a handler uses the *other* already-filled arguments to narrow its suggestions, is the **[Completions](https://py.sdk.modelcontextprotocol.io/v2/tutorial/completions/index.md)** chapter.

## Pagination

Every `list_*` method takes a `cursor=` keyword and every result carries a `next_cursor`. When `next_cursor` is `None`, you have everything.

```python title="client.py" hl_lines="23-31"
# docs_src/client/tutorial007.py
from mcp_types import Tool

from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


@mcp.tool()
def search_books(query: str) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r}."


@mcp.tool()
def reserve_book(title: str) -> str:
    """Put a book on hold."""
    return f"Reserved {title!r}."


async def main() -> None:
    async with Client(mcp) as client:
        tools: list[Tool] = []
        cursor: str | None = None
        while True:
            page = await client.list_tools(cursor=cursor)
            tools.extend(page.tools)
            if page.next_cursor is None:
                break
            cursor = page.next_cursor
        print([tool.name for tool in tools])
```

This loop is correct against every server. `MCPServer` returns everything in one page, so `next_cursor` is `None` and the loop runs once, which is why most code never writes it. Servers that genuinely page, and the rules cursors obey, are in **[Pagination](https://py.sdk.modelcontextprotocol.io/v2/advanced/pagination/index.md)**.

## In tests

`Client(mcp)` with no process and no port is already a test harness for your server. There is one constructor flag built for that: `Client(mcp, raise_exceptions=True)`. It only has an effect on in-memory connections, and **[Testing](https://py.sdk.modelcontextprotocol.io/v2/tutorial/testing/index.md)** is the chapter that explains it and builds the whole pattern around it.

## Recap

* `Client(x)` connects in-memory to a server object, over Streamable HTTP to a URL string, and over anything else via a transport.
* `async with` is the whole lifecycle. Inside it, `server_info`, `server_capabilities`, `protocol_version` and `instructions` are already populated.
* `list_tools()` gives you each tool's `name`, `title`, `description` and `input_schema`.
* `call_tool()` returns `content` for the model, `structured_content` for your code, and `is_error`. A raising tool is a result, not an exception.
* `content` is a union of block types; narrow with `isinstance` before reading.
* `list_resources` / `list_resource_templates` / `read_resource`, `list_prompts` / `get_prompt`, and `complete` round out the verbs.
* Every `list_*` takes `cursor=`; loop until `next_cursor` is `None`.

Next: the things a server can ask the *client* for, and how you answer, in **[Client callbacks](https://py.sdk.modelcontextprotocol.io/v2/client/callbacks/index.md)**.

# Client callbacks

Source: https://py.sdk.modelcontextprotocol.io/v2/client/callbacks/

So far every request has gone one way: client to server. A server can also ask the **client** for things: to put a question to the user, to sample the user's model, to list the user's workspace folders.

You answer those requests by passing **callbacks** to `Client(...)`.

## A server that asks

Here is a server whose tool can't finish on its own:

```python title="server.py" hl_lines="16"
# docs_src/client_callbacks/tutorial001.py
from pydantic import BaseModel

from mcp.server import MCPServer
from mcp.server.mcpserver import Context

mcp = MCPServer("Library")


class CardHolder(BaseModel):
    name: str


@mcp.tool()
async def issue_card(ctx: Context) -> str:
    """Issue a new library card."""
    answer = await ctx.elicit("What name should go on the card?", schema=CardHolder)
    if answer.action == "accept":
        return f"Card issued to {answer.data.name}."
    return "No card issued."
```

* `ctx.elicit(...)` sends an `elicitation/create` request **to the client** and waits.
* The tool doesn't return until somebody (a person in a form, or your code) supplies a `name`.

That is the server half, and the **[Elicitation](https://py.sdk.modelcontextprotocol.io/v2/tutorial/elicitation/index.md)** chapter owns it. This chapter is the other end of the wire.
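
On the wire this is an ordinary JSON-RPC exchange with the usual roles swapped. The sketch below builds the two messages as plain dictionaries, assuming the message shapes in the published MCP specification; the `id`, the trimmed-down schema, and the answer are made up for illustration, and the exact fields can differ between protocol revisions. The SDK builds and parses these for you, so this is for orientation only:

```python
import json

# Server -> client: the request behind ctx.elicit(...).
# Shape assumed from the MCP specification; the id is illustrative.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "elicitation/create",
    "params": {
        "message": "What name should go on the card?",
        "requestedSchema": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
    },
}

# Client -> server: the answer, echoing the request id as JSON-RPC requires.
response = {
    "jsonrpc": "2.0",
    "id": request["id"],
    "result": {"action": "accept", "content": {"name": "Ada Lovelace"}},
}

print(json.dumps(request, indent=2))
print(json.dumps(response, indent=2))
```
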
## The elicitation callback

```python title="client.py" hl_lines="7-11 17-18"
# docs_src/client_callbacks/tutorial002.py
from mcp_types import ElicitRequestParams, ElicitResult

from mcp import Client
from mcp.client import ClientRequestContext


async def handle_elicitation(
    context: ClientRequestContext,
    params: ElicitRequestParams,
) -> ElicitResult:
    return ElicitResult(action="accept", content={"name": "Ada Lovelace"})


async def main() -> None:
    async with Client(
        "http://127.0.0.1:8000/mcp",
        mode="legacy",
        elicitation_callback=handle_elicitation,
    ) as client:
        result = await client.call_tool("issue_card")
        print(result.content)
```

* An elicitation callback is `async (context, params) -> ElicitResult`.
* `params.message` is the question. `params.requested_schema` is the JSON Schema of the answer the server wants. A real client renders a form from it; this one auto-fills.
* You return `ElicitResult(action="accept", content={...})`, or `action="decline"`, or `action="cancel"`. The only other option is `ErrorData(...)`, which refuses the request and fails the whole call.
* `context` is a `ClientRequestContext`: the live `session`, the server's `request_id`, and any `meta` it attached.

!!! tip
    `params` is a union of the two elicitation modes. Here `params.mode` is `"form"`; a `"url"` request carries `params.url` instead of a schema. One callback handles both; branch on `params.mode`. **[Elicitation](https://py.sdk.modelcontextprotocol.io/v2/tutorial/elicitation/index.md)** shows the full pattern.

### Try it

Call `issue_card` and watch both ends. Your callback receives the server's question, already parsed:

```python
params.mode  # 'form'
params.message  # 'What name should go on the card?'
params.requested_schema  # {'properties': {'name': {'title': 'Name', 'type': 'string'}},
                         #  'required': ['name'], 'title': 'CardHolder', 'type': 'object'}
```

It answers, `ctx.elicit(...)` resumes inside the tool, and the tool finishes:

```python
result.content  # [TextContent(type='text', text='Card issued to Ada Lovelace.')]
```

One `tools/call` from you, one `elicitation/create` back from the server, answered by your function, all inside a single tool call.

!!! info
    `mode="legacy"` on line 17 is doing real work. By default `Client(...)` negotiates the modern protocol path, and that path has no back-channel for server-to-client requests: `ctx.elicit` fails before your callback ever runs. The transport doesn't decide that; the negotiated protocol does, in-memory and over a URL alike. Pin `mode="legacy"` whenever your client has to answer one; every test behind this page does. **[Protocol versions](https://py.sdk.modelcontextprotocol.io/v2/client/protocol-versions/index.md)** has the whole story.

    On a 2026-07-28 session the callback isn't dead, it's fed differently: when a tool returns an `InputRequiredResult` carrying an `ElicitRequest`, `Client` dispatches that entry to the same `elicitation_callback` and retries the call for you. That flow is **[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**.

## A callback is a capability

You never told the server that your client can answer elicitation requests. The SDK did.

When a client connects it declares its `capabilities`, the mirror image of the server's. You don't write that object. **Registering a callback is the declaration.**

| you pass | the client declares |
| --- | --- |
| `elicitation_callback=` | `"elicitation": {"form": {}, "url": {}}` |
| `sampling_callback=` | `"sampling": {}` |
| `list_roots_callback=` | `"roots": {"listChanged": true}` |
| none of them | `{}` |

`logging_callback` and `message_handler` are not in the table.
They handle notifications, and notifications need no capability.

The server reads the declaration back with `ctx.session.check_client_capability(...)`. Add a tool that does:

```python title="server.py" hl_lines="23-31"
# docs_src/client_callbacks/tutorial003.py
from mcp_types import ClientCapabilities, ElicitationCapability, RootsCapability, SamplingCapability
from pydantic import BaseModel

from mcp.server import MCPServer
from mcp.server.mcpserver import Context

mcp = MCPServer("Library")


class CardHolder(BaseModel):
    name: str


@mcp.tool()
async def issue_card(ctx: Context) -> str:
    """Issue a new library card."""
    answer = await ctx.elicit("What name should go on the card?", schema=CardHolder)
    if answer.action == "accept":
        return f"Card issued to {answer.data.name}."
    return "No card issued."


@mcp.tool()
def client_features(ctx: Context) -> list[str]:
    """Which optional features the connected client declared."""
    declared = {
        "elicitation": ClientCapabilities(elicitation=ElicitationCapability()),
        "sampling": ClientCapabilities(sampling=SamplingCapability()),
        "roots": ClientCapabilities(roots=RootsCapability()),
    }
    return [name for name, capability in declared.items() if ctx.session.check_client_capability(capability)]
```

Connect with only `elicitation_callback` and call it:

```python
result.structured_content  # {'result': ['elicitation']}
```

Pass all three callbacks and you get `['elicitation', 'sampling', 'roots']`. Pass none and you get `[]`.

!!! check
    Now do the wrong thing: connect **without** `elicitation_callback` and call `issue_card` anyway. The server's `elicitation/create` request still reaches your client, and the SDK answers it for you, with an error, because you never said you could handle it. That error sinks the whole call. `call_tool` doesn't return an `is_error` result; it raises:

    ```text
    MCPError: Elicitation not supported
    ```

    That is a protocol error (`-32600`, *invalid request*), not a tool error: there is nothing for the model to read and retry.
    It's why `client_features` is worth having: a well-behaved server checks before it asks.

## The deprecated pair

`sampling_callback` answers `sampling/createMessage`: the server asking *your* model to complete something. `list_roots_callback` answers `roots/list`: the server asking which directories it may work in.

Both work. Both follow the rule above. And both serve RPCs the **2026-07-28 spec removes**: a modern server doesn't call back into your client mid-request, it hands the request back to you as part of the tool result (**[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**). The callbacks themselves are not dead. When an `InputRequiredResult` carries a `CreateMessageRequest` or a `ListRootsRequest`, `Client`'s auto-loop dispatches it to the same `sampling_callback` or `list_roots_callback` you registered here. The whole list is in **[Deprecated features](https://py.sdk.modelcontextprotocol.io/v2/advanced/deprecated/index.md)**.

You still need the callbacks to talk to servers that haven't moved. The signatures:

```python title="client.py"
# docs_src/client_callbacks/tutorial004.py
from mcp_types import CreateMessageRequestParams, CreateMessageResult, ListRootsResult, Root, TextContent
from pydantic import FileUrl

from mcp.client import ClientRequestContext


async def handle_sampling(
    context: ClientRequestContext,
    params: CreateMessageRequestParams,
) -> CreateMessageResult:
    return CreateMessageResult(
        role="assistant",
        content=TextContent(type="text", text="The answer is 42."),
        model="my-llm",
    )


async def handle_list_roots(context: ClientRequestContext) -> ListRootsResult:
    return ListRootsResult(roots=[Root(uri=FileUrl("file:///home/ada/notebooks"), name="notebooks")])
```

* A sampling callback receives the full `CreateMessageRequestParams` (`messages`, `model_preferences`, `max_tokens`) and returns a `CreateMessageResult`. *You* run the model, however you like; the SDK only carries the request.
* A roots callback takes no params at all and returns a `ListRootsResult`.
* Either one may return `ErrorData(...)` instead, to refuse.

Pass them to `Client(...)` exactly like `elicitation_callback`.

## The notification callbacks

Two more. Neither declares anything.

`logging_callback` receives every `notifications/message` a server sends, as `LoggingMessageNotificationParams` (`level`, `logger`, `data`). Protocol logging is itself deprecated by the 2026-07-28 spec (**[Logging](https://py.sdk.modelcontextprotocol.io/v2/tutorial/logging/index.md)** has what to do instead), so this callback exists for the servers that still emit it.

`message_handler` is the catch-all: every server notification reaches it (as well as its specific callback), and on a stream-backed transport so does every transport-level `Exception`. The one pattern worth knowing is `if isinstance(message, Exception): raise message`, so a broken connection fails loudly instead of vanishing.

## Recap

* A server can send requests to the client. You answer them with callbacks passed to `Client(...)`.
* The elicitation callback is the current one: `async (context, params) -> ElicitResult`, one function for both form and URL mode.
* **Registering a callback is declaring the capability.** Without it, the SDK refuses the server's request on your behalf and the whole call fails with `MCPError`.
* A server finds out before asking with `ctx.session.check_client_capability(...)`.
* `sampling_callback` and `list_roots_callback` work the same way but serve deprecated features; modern servers use multi-round-trip requests instead.
* `logging_callback` and `message_handler` receive notifications. They declare nothing.

Next: the first argument you've been passing to `Client(...)` all along, **[Client transports](https://py.sdk.modelcontextprotocol.io/v2/client/transports/index.md)**.

# Client transports

Source: https://py.sdk.modelcontextprotocol.io/v2/client/transports/

Every `Client` talks to its server over a **transport**: the thing that actually carries the messages. You never configure one separately. `Client` takes a single positional argument and works the transport out from its type.

The *server* side of each (what `mcp.run()` does and what you deploy) is **[Running your server](https://py.sdk.modelcontextprotocol.io/v2/run/index.md)**.

## In memory

Pass the server object itself:

```python title="client.py" hl_lines="14"
# docs_src/client_transports/tutorial001.py
from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


@mcp.tool()
def search_books(query: str) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r}."


async def main() -> None:
    async with Client(mcp) as client:
        result = await client.call_tool("search_books", {"query": "dune"})
        print(result.structured_content)
```

No subprocess, no port, no bytes on a wire. The client and the server are two objects in the same process, and the call still goes through the real protocol layer: `search_books` is listed, validated and invoked exactly as it would be over HTTP.

That makes it two things at once:

* **A test harness.** Every example in this documentation is exercised this way, and the **[Testing](https://py.sdk.modelcontextprotocol.io/v2/tutorial/testing/index.md)** chapter builds the whole pattern around it.
* **An embedding API.** An application that constructs the server doesn't need a network hop to call its tools.

## Streamable HTTP

Pass a URL string and you get **Streamable HTTP**, the transport you deploy behind:

```python title="client.py" hl_lines="5"
# docs_src/client_transports/tutorial002.py
from mcp import Client


async def main() -> None:
    async with Client("http://localhost:8000/mcp") as client:
        result = await client.list_tools()
        print([tool.name for tool in result.tools])
```

That is the whole production client. `Client` wraps the URL in `streamable_http_client(...)` for you, on top of an `httpx.AsyncClient` configured the way MCP needs: `follow_redirects=True`, a 30-second timeout for connect/write/pool, and a 300-second read timeout because the server may hold a response stream open.

!!! check
    A `Client` you have constructed is **not** connected. Construction only picks the transport; `async with` is what opens it. Reach for the connection before entering and the SDK tells you so:

    ```text
    RuntimeError: Client must be used within an async context manager
    ```

    Nothing was resolved, fetched or spawned when you wrote `Client("http://...")`. That line is free.

### Bring your own `httpx.AsyncClient`

The moment you need an `Authorization` header, a cookie, a proxy, mTLS, or a different timeout, build the `httpx.AsyncClient` yourself and hand it to `streamable_http_client`:

```python title="client.py" hl_lines="8-14"
# docs_src/client_transports/tutorial003.py
import httpx

from mcp import Client
from mcp.client.streamable_http import streamable_http_client


async def main() -> None:
    async with httpx.AsyncClient(
        headers={"Authorization": "Bearer ..."},
        timeout=httpx.Timeout(30.0, read=300.0),
        follow_redirects=True,
    ) as http_client:
        transport = streamable_http_client("http://localhost:8000/mcp", http_client=http_client)
        async with Client(transport) as client:
            result = await client.list_tools()
            print([tool.name for tool in result.tools])
```

Two things to notice:

* You own the `httpx.AsyncClient`, so **you** enter and exit it. The SDK never closes a client it didn't create.
* `streamable_http_client(url, http_client=...)` returns a transport, and `Client(transport)` accepts it like anything else.

!!! warning
    `streamable_http_client` used to take `headers=` and `timeout=` directly. It does not any more: its only parameters are `url`, `http_client` and `terminate_on_close`. Reach for `headers=` out of habit and you get:

    ```text
    TypeError: streamable_http_client() got an unexpected keyword argument 'headers'
    ```

    Everything HTTP-shaped now lives on the one `httpx.AsyncClient` you pass in.

!!! info
    If you know `httpx`, you already know how to do auth, proxies, event hooks, retries and connection limits here. The SDK adds nothing on top and takes nothing away.

    It is also where OAuth plugs in: `httpx.AsyncClient(auth=OAuthClientProvider(...))`. That whole flow is **[OAuth clients](https://py.sdk.modelcontextprotocol.io/v2/advanced/oauth-clients/index.md)**.

## stdio

A **stdio** server is a subprocess. The client launches it, writes JSON-RPC to its stdin and reads JSON-RPC from its stdout. It is how a desktop host runs a server on your machine.

Describe the process with `StdioServerParameters`, turn it into a transport with `stdio_client`, and hand *that* to `Client`:

```python title="client.py" hl_lines="4-8 12"
# docs_src/client_transports/tutorial004.py
from mcp import Client, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(
    command="uv",
    args=["run", "server.py"],
    env={"BOOKSHOP_API_KEY": "secret"},
)


async def main() -> None:
    async with Client(stdio_client(server)) as client:
        result = await client.list_tools()
        print([tool.name for tool in result.tools])
```

`Client` does not accept the parameters object on its own. `StdioServerParameters` is configuration; `stdio_client(server)` is the transport that knows how to spawn a process from it. Always wrap.

Leaving the `async with` block also shuts the subprocess down: close stdin, wait, kill if it lingers. You never clean it up yourself.

!!! warning
    The child does **not** inherit your environment. It gets a minimal allow-list (`HOME`, `LOGNAME`, `PATH`, `SHELL`, `TERM` and `USER` on POSIX) so nothing sensitive leaks into a process you may not have written.

    A server that needs an API key won't find it there. Pass it explicitly with `env=`; those variables are merged on top of the allow-list. That is what `BOOKSHOP_API_KEY` is doing above.

## SSE

`sse_client(url)`, from `mcp.client.sse`, is the HTTP transport that Streamable HTTP superseded. Wrap it the same way, `Client(sse_client("http://localhost:8000/sse"))`, to talk to a server that still speaks it, and don't build anything new on it.

## The `Transport` protocol

To `Client`, all of the above are the same thing. A **transport** is any async context manager that yields a `(read, write)` pair of message streams: formally, the `Transport` protocol in `mcp.client`.

`Client` resolves its argument by type: a server object connects in-process, a `str` becomes `streamable_http_client(url)`, and anything else is entered as a transport directly. That last rule is why `stdio_client(...)`, `streamable_http_client(...)` and `sse_client(...)` all drop into the same slot, and why you can write your own.

## Recap

* `Client(mcp)` (the server object) connects in memory. Use it for tests and for embedding.
* `Client("http://.../mcp")` (a URL) connects over Streamable HTTP, the production transport.
* Headers, auth, proxies and timeouts belong on an `httpx.AsyncClient` you pass to `streamable_http_client(url, http_client=...)`. There is no `headers=` keyword.
* stdio is `Client(stdio_client(StdioServerParameters(...)))`, never the parameters object alone.
* The subprocess gets an allow-listed environment, not yours; `env=` adds to it.
* A transport is anything you can `async with x as (read, write)`. `Client` hands anything that isn't a server object or a URL straight to that protocol.
* Constructing a `Client` picks the transport. `async with` opens it.

Once the transport is open the two sides have to agree on a protocol version. You normally never think about it; when you do, **[Protocol versions](https://py.sdk.modelcontextprotocol.io/v2/client/protocol-versions/index.md)** is the page.

# Protocol versions

Source: https://py.sdk.modelcontextprotocol.io/v2/client/protocol-versions/

MCP has two eras.

Servers released before 2026-07-28 open every connection with the **`initialize` handshake**: the client proposes a version, the server counters, the client acknowledges, all before the first useful request.

Servers at **2026-07-28** drop the handshake. The client sends one **`server/discover`** probe and the server answers it with everything in a single result.

You haven't had to care, because `Client` negotiates for you. This chapter is about the one constructor argument that controls it, `mode=`, and the three times you change it.

## `mode="auto"`

```python title="client.py" hl_lines="14-15"
# docs_src/protocol_versions/tutorial001.py
from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


@mcp.tool()
def search_books(query: str) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r}."


async def main() -> None:
    async with Client(mcp) as client:
        print(client.protocol_version)
```

You didn't pass `mode`, so you got the default: `"auto"`.

Entering `async with` sends a single `server/discover` probe at the newest version this SDK speaks. Then:

* A **modern server** answers it. The client adopts the result. One round trip, done.
* An **older server** has never heard of `server/discover` and returns an error. The client falls back to the classic `initialize` handshake and takes whatever that negotiates.

Either way you come out connected, and `client.protocol_version` tells you which it was:

```text
2026-07-28
```

That is the whole feature. One `Client`, any era of server, no branching in your code.

!!! info
    `MCPServer` answers `server/discover`, so against your own in-memory server `auto` always lands on `2026-07-28`. The fallback only ever fires against a real pre-2026 server, which is exactly when you want it to.

## `mode="legacy"`

```python title="client.py" hl_lines="14"
# docs_src/protocol_versions/tutorial002.py
from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


@mcp.tool()
def search_books(query: str) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r}."


async def main() -> None:
    async with Client(mcp, mode="legacy") as client:
        print(client.protocol_version)
```

`mode="legacy"` never probes. It runs the `initialize` handshake, the same connection a pre-2026 client opens.

```text
2025-11-25
```

Same server. It speaks `2026-07-28` perfectly well; you told the client not to ask.

You want this for the **push-style** features. A server-initiated request is the server calling *you*: `ctx.elicit(...)` putting a form in front of your user, sampling asking your model for a completion mid-tool-call. That channel only exists on a handshake-era session. At 2026-07-28 it is gone. The server *returns* its questions and you retry the call with the answers (**[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**).

`mode="auto"` only gives you a handshake when the server is too old for anything else. `mode="legacy"` guarantees one. Reach for it whenever you hand `Client(...)` a `sampling_callback`, an `elicitation_callback` you want driven as a request, or a `message_handler`. **[Client callbacks](https://py.sdk.modelcontextprotocol.io/v2/client/callbacks/index.md)** goes through each.

## Pinning a version

`mode` also accepts a modern protocol version string. Today that set is exactly `["2026-07-28"]`.

```python title="client.py" hl_lines="14"
# docs_src/protocol_versions/tutorial003.py
from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


@mcp.tool()
def search_books(query: str) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r}."


async def main() -> None:
    async with Client(mcp, mode="2026-07-28") as client:
        print(client.protocol_version)
```

A pin sends **nothing**. No probe, no handshake. The client adopts `2026-07-28` locally and the connection is live the instant `async with` returns.

A pin is a promise *you* make: you already know the server speaks that version. The client doesn't check.

!!! check
    A pin is not a discovery. Print `client.server_info` and the price is right there:

    ```text
    name='' title=None version='' description=None website_url=None icons=None
    ```

    The client never asked the server who it is, so `server_info` is a blank. `client.server_capabilities` is the same story: every capability is `None`. Tool calls still work (the protocol needs none of it); code that reads `server_capabilities` to decide what to offer does not. The next section is the fix.

Only modern versions are pinnable. A handshake-era string is rejected at construction, before any I/O, and the error tells you what to write instead:

```text
ValueError: mode must be 'legacy', 'auto', or one of ['2026-07-28']; got '2025-06-18' ('2025-06-18' is a handshake-era version; use mode='legacy')
```

## Reconnecting with `prior_discover`

The probe is cheap, but it is still a round trip you pay on every reconnect, and the answer almost never changes. So keep it.

After an `auto` connection, `client.session.discover_result` holds the exact `DiscoverResult` the server sent: its `supported_versions`, its `capabilities`, its `server_info`, its `instructions`.
Hand it back as `prior_discover=` the next time:

```python title="client.py" hl_lines="15 17"
# docs_src/protocol_versions/tutorial004.py
from mcp import Client
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


@mcp.tool()
def search_books(query: str) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r}."


async def main() -> None:
    async with Client(mcp) as client:
        saved = client.session.discover_result

    async with Client(mcp, mode="2026-07-28", prior_discover=saved) as client:
        print(client.protocol_version)
        print(client.server_info.name)
```

```text
2026-07-28
Bookshop
```

The second connection made **zero** negotiation round trips and still knows exactly who it is talking to. That is the pinned mode done properly: `mode=` names the version, `prior_discover=` supplies the identity. ✨

`DiscoverResult` is a Pydantic model. `saved.model_dump_json()` goes into a file or a cache; `DiscoverResult.model_validate_json(...)` brings it back in the next process.

!!! tip
    `prior_discover=` only does anything when `mode` is a version pin. Under `"auto"` the client probes the server anyway, and under `"legacy"` it is ignored.

## The four modes

| You write | Negotiation traffic | You get |
| --- | --- | --- |
| `Client(target)` | one `server/discover` probe; the `initialize` handshake if it fails | the newest version both sides speak, whichever era |
| `Client(target, mode="legacy")` | the `initialize` handshake | a handshake-era version; server-initiated requests work |
| `Client(target, mode="2026-07-28")` | none | that version, pinned, with a blank `server_info` |
| `Client(target, mode="2026-07-28", prior_discover=saved)` | none | that version, pinned, *and* the identity you saved last time |

## Recap

* MCP has a handshake era (up to `2025-11-25`, the `initialize` handshake) and a modern era (`2026-07-28`, `server/discover`). `Client` bridges them.
* `mode="auto"` is the default: probe, fall back.
  Leave it alone unless one of the other three rows describes you.
* `client.protocol_version` is always the answer to "what did I get?".
* `mode="legacy"` forces the handshake. It is what you need for server-initiated requests: sampling, push elicitation, `message_handler`.
* A version pin (`mode="2026-07-28"`) sends no negotiation traffic at all, at the cost of a blank `server_info`.
* `prior_discover=` pays that cost back: save `client.session.discover_result`, reconnect with it, get both.

A modern connection has no push channel, so how does a 2026 server ask you a question mid-call? It returns it: **[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**.

# Multi-round-trip requests

Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/

Sometimes a tool can't finish in one round trip. It needs something only the user has: a choice, a confirmation, a credential.

Before 2026-07-28 the server got it by calling **back**: opening its own request to the client (an elicitation, a sampling call) in the middle of handling the original one. The 2026-07-28 spec retires that back-channel. Instead, the server **returns**.

## Return, don't call back

The server answers `tools/call` with an **`InputRequiredResult`** instead of a `CallToolResult`. Two of its fields do the work:

* **`input_requests`**: what the server still needs, as a dict keyed by names the server chose. Each value is an `ElicitRequest`, a `CreateMessageRequest`, or a `ListRootsRequest`.
* **`request_state`**: an opaque token. The client echoes it back verbatim on the retry. Your server is the only thing that reads it.

The client fulfils each request, then calls the **same tool again**, carrying its answers in `input_responses` and the token in `request_state`. The server now has what it was missing and returns a normal `CallToolResult`.

That's the whole protocol. Every leg is an ordinary request from the client to the server.
Nothing ever flows the other way.

## The server side

On `@mcp.tool()` you rarely build this by hand: declare a dependency that asks the user and the SDK returns the `InputRequiredResult` for you. That form is covered in the **[Dependencies](https://py.sdk.modelcontextprotocol.io/v2/tutorial/dependencies/index.md)** tutorial.

The two forms don't mix: a call has one `input_responses`/`request_state` channel, so a tool that uses `Resolve(...)` parameters cannot also return `InputRequiredResult` from its body. A declared `InputRequiredResult` return is rejected at registration (`InvalidSignature`), and an undeclared one fails the call at runtime.

The manual form is the **low-level** `Server`, whose `on_call_tool` handler is allowed to return either result type:

```python title="server.py" hl_lines="44-47"
# docs_src/mrtr/tutorial001.py
from mcp_types import (
    CallToolRequestParams,
    CallToolResult,
    ElicitRequest,
    ElicitRequestFormParams,
    ElicitResult,
    InputRequiredResult,
    ListToolsResult,
    PaginatedRequestParams,
    TextContent,
    Tool,
)

from mcp.server import Server, ServerRequestContext

ASK_REGION = ElicitRequest(
    params=ElicitRequestFormParams(
        message="Which region should the database live in?",
        requested_schema={
            "type": "object",
            "properties": {"region": {"type": "string"}},
            "required": ["region"],
        },
    )
)


async def list_tools(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult:
    return ListToolsResult(
        tools=[
            Tool(
                name="provision",
                description="Provision a database. Asks which region to put it in.",
                input_schema={
                    "type": "object",
                    "properties": {"name": {"type": "string"}},
                    "required": ["name"],
                },
            )
        ]
    )


async def call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult | InputRequiredResult:
    answer = (params.input_responses or {}).get("region")
    if not isinstance(answer, ElicitResult) or answer.content is None:
        return InputRequiredResult(input_requests={"region": ASK_REGION}, request_state="provision-v1")
    name = (params.arguments or {})["name"]
    text = f"Provisioned {name!r} in {answer.content['region']}."
    return CallToolResult(content=[TextContent(type="text", text=text)])


server = Server("Provisioner", on_list_tools=list_tools, on_call_tool=call_tool)
```

* `on_call_tool` is typed `-> CallToolResult | InputRequiredResult`. Returning the second one is the entire server-side API.
* On the first call `params.input_responses` is `None`, so the guard fires and the handler asks instead of answering.
* On the retry, the `ElicitResult` the client sent is sitting under the **same key** (`"region"`) that the server used in `input_requests`.

Everything else in that file (the explicit `input_schema`, the hand-built `CallToolResult`) is the ordinary low-level `Server`, covered in **[The low-level Server](https://py.sdk.modelcontextprotocol.io/v2/advanced/low-level-server/index.md)**. This page only adds the second return type.

## Beyond tools

`tools/call` is not special: at 2026-07-28 a server may answer `prompts/get` and `resources/read` the same way.
On `MCPServer`, an `@mcp.prompt()` function (or an `@mcp.resource()` **template** function) returns the `InputRequiredResult` itself and reads the retry's answers off the context:

```python title="server.py" hl_lines="21 23 25"
# docs_src/mrtr/tutorial004.py
from mcp_types import ElicitRequest, ElicitRequestFormParams, ElicitResult, InputRequiredResult

from mcp.server.mcpserver import Context, MCPServer
from mcp.server.mcpserver.prompts.base import UserMessage

mcp = MCPServer("Briefing")

ASK_AUDIENCE = ElicitRequest(
    params=ElicitRequestFormParams(
        message="Who is the briefing for?",
        requested_schema={
            "type": "object",
            "properties": {"audience": {"type": "string"}},
            "required": ["audience"],
        },
    )
)


@mcp.prompt()
async def briefing(ctx: Context) -> list[UserMessage] | InputRequiredResult:
    """Draft a briefing tuned to its audience."""
    answer = (ctx.input_responses or {}).get("audience")
    if not isinstance(answer, ElicitResult) or answer.content is None:
        return InputRequiredResult(input_requests={"audience": ASK_AUDIENCE})
    return [UserMessage(f"Write a briefing for {answer.content['audience']}.")]
```

* The first round returns the `InputRequiredResult`. On the retry, `ctx.input_responses` holds the answers under the same keys and the function returns its ordinary result: prompt messages here, resource content for a template resource.
* A `request_state` you set is sealed before it crosses the wire and verified on the echo, like everything else on the server; **[Protecting `requestState`](#protecting-requeststate)** below covers what the seal gives you and when you need to configure keys.
* An `@mcp.tool()` function can return the result directly the same way, when the dependency form doesn't fit.
* Static `@mcp.resource()` functions don't participate: they take no `Context`, so they could never read the retry. Only template resources can ask.
* The era rules below apply unchanged: returning an `InputRequiredResult` on a pre-2026 session is the same `-32603` the warning describes.

## The client side

`Client` runs the loop for you. Register the callbacks the server might ask for (`elicitation_callback`, `sampling_callback`, `list_roots_callback`) and call the tool. When an `InputRequiredResult` arrives, `Client` dispatches each entry in `input_requests` to the matching callback, retries with the answers and the echoed `request_state`, and keeps going until a `CallToolResult` comes back:

```python title="client.py" hl_lines="12 13"
# docs_src/mrtr/tutorial003.py
from mcp_types import ElicitRequestParams, ElicitResult

from mcp import Client
from mcp.client import ClientRequestContext


async def handle_elicitation(context: ClientRequestContext, params: ElicitRequestParams) -> ElicitResult:
    return ElicitResult(action="accept", content={"region": "eu-west-1"})


async def main() -> None:
    async with Client("http://127.0.0.1:8000/mcp", elicitation_callback=handle_elicitation) as client:
        result = await client.call_tool("provision", {"name": "orders"})
        print(result.content)
```

* That `elicitation_callback` is the same one a pre-2026 server's back-channel `elicitation/create` would have hit. The same is true of `sampling_callback` for `sampling/createMessage` and `list_roots_callback` for `roots/list`: at 2026-07-28 the standalone server->client RPCs are gone, but the identical `ElicitRequest` / `CreateMessageRequest` / `ListRootsRequest` payloads ride inside `input_requests` and dispatch to the same three callbacks. One set of callbacks serves both eras.
* `call_tool` returns a plain `CallToolResult`. The intermediate rounds are invisible to the caller.
* `get_prompt` and `read_resource` drive the same loop.

!!! check
    Leave the callback off and the loop fails on the first round: the SDK's stand-in callback answers every elicitation with an error, and `call_tool` raises `MCPError` with the message *"Elicitation not supported"*.

The loop is bounded. `Client(..., input_required_max_rounds=10)` is the default cap; a server that keeps returning `InputRequiredResult` past it makes `call_tool` raise.

If a round carries only `request_state` and no `input_requests`, `Client` sleeps briefly (50ms doubling to a 250ms ceiling) before retrying, so a server that is just saying *"not done yet"* isn't busy-polled.

### Driving the loop yourself

The auto-loop is enough for a single-process client. Own the loop instead when:

* Your client is **distributed**: the process that renders the question to the user is not the process that called `call_tool`, so a different worker issues the retry. `request_state` is the persistable token you carry across that boundary, through your own storage, and `input_responses` is what the other side sends back with it.
* You want to **inspect** each round: log or audit every `input_requests` entry, refuse certain request kinds, or apply your own backoff between legs.
* You want a **wall-clock** bound rather than a round-count bound: wrap your own loop in `anyio.fail_after(...)` instead of relying on `input_required_max_rounds`.

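
If you do apply your own backoff between legs, the schedule described above for the auto-loop (50ms doubling to a 250ms ceiling) is easy to reproduce. The following is a minimal sketch in plain Python, assuming the first delay is 50ms; `backoff_delays_ms` is a hypothetical helper, not an SDK function, and the SDK's own timing may differ in detail.

```python
from collections.abc import Iterator
from itertools import islice


def backoff_delays_ms(initial: int = 50, ceiling: int = 250) -> Iterator[int]:
    """Yield delays in milliseconds: start at `initial`, double, cap at `ceiling`."""
    delay = initial
    while True:
        yield delay
        # Double for the next round, but never exceed the ceiling.
        delay = min(delay * 2, ceiling)


first_five = list(islice(backoff_delays_ms(), 5))
print(first_five)  # [50, 100, 200, 250, 250]
```

In a hand-written loop you would draw the next value from this generator and sleep for that many milliseconds before each retry that carried no `input_requests`.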
Drop to the underlying session, where `allow_input_required=True` hands you the union directly:

```python title="client.py" hl_lines="13 14 20"
# docs_src/mrtr/tutorial002.py
from mcp_types import CallToolResult, ElicitRequest, ElicitResult, InputRequest, InputRequiredResult, InputResponse

from mcp import Client


def fulfil(request: InputRequest) -> InputResponse:
    if not isinstance(request, ElicitRequest):
        raise NotImplementedError(f"this client cannot answer a {request.method!r} request")
    return ElicitResult(action="accept", content={"region": "eu-west-1"})


async def provision(client: Client, name: str) -> CallToolResult:
    result = await client.session.call_tool("provision", {"name": name}, allow_input_required=True)
    while isinstance(result, InputRequiredResult):
        responses = {key: fulfil(request) for key, request in (result.input_requests or {}).items()}
        result = await client.session.call_tool(
            "provision",
            {"name": name},
            input_responses=responses,
            request_state=result.request_state,
            allow_input_required=True,
        )
    return result
```

* `client.session.call_tool(..., allow_input_required=True)` widens the return type to `CallToolResult | InputRequiredResult`. The `isinstance` is what narrows it back.
* `request_state` is now in your hands. Write it down between legs and the conversation can resume from a fresh process.
* For every entry in `input_requests` you put an `InputResponse` under the **same key** in `input_responses`. `fulfil` is where your UI goes; this one hard-codes the answer.
* Same tool name, same `arguments`, every leg. The retry is the original call carried out again, not a new method.

## Protecting `requestState`

Everything above treats `request_state` as an echo, and on the wire that is all it is. But the client holds it between legs (writing it down across processes is exactly what the previous section blessed), so what comes back is **client-supplied input**: it can be modified, expired, or lifted from a different call entirely.
The spec requires servers to integrity-protect this state and reject the round when verification fails, whenever the state can influence authorization, resource access, or business logic.

`MCPServer` protects it by default. Every server seals outgoing `requestState` and verifies every echo (resolver state and hand-built state alike) under a key generated at process start. You configure nothing, write plaintext, and read plaintext; the wire only ever carries an opaque encrypted token. The default key lives and dies with the process, which is the one thing you must know before deploying beyond a single process:

```python
from mcp.server.mcpserver import MCPServer, RequestStateSecurity

# Multi-instance or restart-surviving: one or more shared secret keys (>= 32 bytes each).
mcp = MCPServer("fleet", request_state_security=RequestStateSecurity(keys=[key]))
```

* **The default (no configuration)** suits a single process: stdio, or exactly one HTTP worker. A retry that lands on a different worker, a different instance behind a load balancer, or the same server after a restart is sealed under a key that process doesn't have: the client gets the frozen rejection below and must start the flow over.
* **`keys=[...]`** is required whenever a retry can reach a **different instance** (multi-worker `uvicorn`, load-balanced HTTP) or must survive restarts: every instance verifies what any sibling minted. Same machinery, your secret instead of a generated one.
* For your own crypto, such as a KMS or an existing token service, pass `RequestStateSecurity(codec=...)` instead of `keys`; **[Bring your own crypto](#bring-your-own-crypto)** below covers the contract.

### What the seal carries

Default or configured, `requestState` on the wire is an encrypted, authenticated token. Your code never sees it: handlers and resolvers write plaintext and read plaintext (`ctx.request_state`); the SDK seals on the way out and verifies on the way in.
Beyond integrity, each token is bound to:

* **A time window.** Every round re-seals with a fresh expiry, so `RequestStateSecurity(ttl=...)` (default 600 seconds) bounds per-round think time, not the whole flow.
* **The authenticated principal.** When the request carries an OAuth access token the SDK validated, the state is bound to the token's client, issuer, and subject: state minted for one user fails under another, even when both users share one OAuth client. A verifier that supplies no subject degrades the binding to the client identity alone, which under URL-based client IDs is shared by every user of that client software. When auth is terminated outside the SDK (a fronting proxy), or the transport is unauthenticated, there is no principal to bind and this check is inert, unless `RequestStateSecurity(bind_principal=...)` supplies one from your own identity signal. Whichever components your token verifier supplies, it must supply them consistently: a verifier that includes the subject on some requests and omits it on others changes the principal mid-flow, and in-flight rounds are rejected.
* **The originating request.** The method, the tool or prompt name (or resource URI), and a digest of the arguments. A token replayed against a different tool, different arguments, or a different method fails.
* **The exact question asked.** Every resolver answer is pinned to the rendered question the client was shown, both on the round it first arrives and when a recorded answer is reused later. Redeploy with a reworded message or a changed schema and the server re-asks instead of consuming a stale answer. The same pinning cuts the other way: derive messages from the tool's arguments, not from per-call data. A message built from a timestamp or a live rate renders differently every round, so every recorded answer looks stale and the server re-asks until the client's round limit ends the call.

All of that is the SDK's job, not yours, and not the codec's if you bring your own.
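To make the time-window and originating-request bindings concrete, here is a minimal standard-library sketch. It is an illustration, not the SDK's implementation: `DEMO_KEY`, `request_digest`, `seal`, and `unseal` are hypothetical names, the token is only signed with HMAC (the real token is also encrypted), and principal and question binding are left out.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical demo key; a real deployment would load a shared secret instead.
DEMO_KEY = b"0" * 32


def request_digest(method: str, name: str, arguments: dict) -> str:
    """Digest of the originating request: method, target name, and arguments."""
    canonical = json.dumps([method, name, arguments], sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def seal(state: str, *, method: str, name: str, arguments: dict, ttl: float, now: float) -> str:
    """Wrap plaintext state with an expiry and a request digest, then sign the result."""
    claims = {"state": state, "exp": now + ttl, "req": request_digest(method, name, arguments)}
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    tag = hmac.new(DEMO_KEY, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{tag}"


def unseal(token: str, *, method: str, name: str, arguments: dict, now: float) -> str:
    """Return the plaintext state, or raise ValueError with one message for every cause."""
    try:
        body, tag = token.rsplit(".", 1)
        expected = hmac.new(DEMO_KEY, body.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("signature mismatch")
        claims = json.loads(base64.urlsafe_b64decode(body))
        if now > claims["exp"]:
            raise ValueError("expired")
        if claims["req"] != request_digest(method, name, arguments):
            raise ValueError("bound to a different request")
    except (ValueError, KeyError, TypeError) as exc:
        # The caller learns only that verification failed, not which check did.
        raise ValueError("state failed verification") from exc
    return claims["state"]


args = {"name": "db-1"}
token = seal("step-2", method="tools/call", name="provision", arguments=args, ttl=600, now=1000.0)
print(unseal(token, method="tools/call", name="provision", arguments=args, now=1100.0))
```

Replaying `token` against a different tool name, different arguments, or after `now` passes the expiry raises the same `ValueError`. In this sketch the expiry is checked on the way in, and a fresh one would be stamped by the next `seal` call, which is why the window bounds one round and not the whole flow.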
### Rotating keys `keys[0]` seals new state; every key in the list verifies. Zero-downtime rotation is three phases, each fully rolled out before the next: ```python RequestStateSecurity(keys=[OLD, NEW]) # 1: every instance learns to verify NEW; OLD still mints RequestStateSecurity(keys=[NEW, OLD]) # 2: NEW mints; in-flight OLD state keeps verifying RequestStateSecurity(keys=[NEW]) # 3: one ttl after phase 2 is fully out, retire OLD ``` Never promote the minter first: minting under a key some instance can't yet verify drops in-flight rounds mid-rollout. Keys are scoped to one service. The sealed envelope also carries the server's name as an audience claim, so a token minted by a different service that happens to share a secret is rejected anyway. The claim is only as distinctive as the name, so a server given an explicit policy must have a real name or set `RequestStateSecurity(audience=...)` — an unnamed one raises at construction. `audience=` also serves deliberate multi-service topologies where one service must accept state another minted. (The no-configuration default is exempt: its key never leaves the process, so the audience claim has nothing to add.) ### Bring your own crypto `RequestStateSecurity(codec=...)` takes anything with `seal(bytes) -> str` and `unseal(str) -> bytes` that raises `InvalidRequestState` for any token it did not mint. The classic shape is envelope encryption against a KMS, where you unwrap a data key once at startup and keep the per-token crypto local: ```python title="server.py" hl_lines="12 26-27 34-35 38" # docs_src/mrtr/tutorial005.py import os from cryptography.exceptions import InvalidTag from cryptography.hazmat.primitives.ciphers.aead import AESGCM from mcp.server import MCPServer from mcp.server.mcpserver import InvalidRequestState, RequestStateSecurity PREFIX = "kms1." 
# format version; fed to GCM as associated data, so it is bound under the tag def unwrap_data_key() -> bytes: """One KMS call at process start, kms.decrypt(CiphertextBlob=...); every token after that is local crypto.""" return os.urandom(32) # stand-in for the unwrapped 32-byte data key class EnvelopeCodec: def __init__(self, data_key: bytes) -> None: self._aesgcm = AESGCM(data_key) def seal(self, payload: bytes) -> str: nonce = os.urandom(12) return PREFIX + (nonce + self._aesgcm.encrypt(nonce, payload, PREFIX.encode())).hex() def unseal(self, token: str) -> bytes: if not token.startswith(PREFIX): raise InvalidRequestState("unknown token format") body = token[len(PREFIX) :] try: raw = bytes.fromhex(body) if raw.hex() != body: # only the exact string seal() produced verifies raise ValueError("non-canonical hex") return self._aesgcm.decrypt(raw[:12], raw[12:], PREFIX.encode()) except (ValueError, InvalidTag) as exc: raise InvalidRequestState("token failed verification") from exc mcp = MCPServer("Deployer", request_state_security=RequestStateSecurity(codec=EnvelopeCodec(unwrap_data_key()))) ``` TTL, principal binding, and request binding are **not** the codec's job: the SDK stamps them into the payload before `seal` and re-verifies them after `unseal`, for every codec. A codec's only obligations are integrity (tampered means raise) and, ideally, confidentiality. ### When verification fails Every inbound failure, whether tampered, expired, replayed against a different request or principal, or sealed under a key this server doesn't know, gets the same answer: ```json {"code": -32602, "message": "Invalid or expired requestState"} ``` One frozen message for every cause, so the wire never reveals which check failed; the real reason goes to the server log. Every inbound `requestState` on `tools/call`, `prompts/get`, and `resources/read` is checked, including one arriving for a handler that never mints state. 
The most common rejection in practice isn't an attacker — it's the default process-local key meeting a retry from before a restart or from another instance; the client restarts the flow, and `keys=[...]` is the fix when that matters. ### Hand-built state A `request_state` you set yourself (returning `InputRequiredResult` from a tool, prompt, or resource-template function) is sealed and verified by the same machinery as resolver state, with zero code changes: write plaintext, read plaintext, and every binding above applies. The one thing the SDK cannot pin for you, even when configured, is question identity: it doesn't know which of *your* questions an answer in your state belongs to. If you store answers keyed by question, include your own question identifier in the state and check it on the retry. The low-level `Server` is the no-batteries tier: unlike `MCPServer`, nothing is sealed until you append the boundary yourself, and your `request_state` crosses the wire exactly as written until you do. The one-line opt-in is shown in **[The low-level Server](https://py.sdk.modelcontextprotocol.io/v2/advanced/low-level-server/index.md#the-other-handlers)**. ## A 2026-07-28 result `InputRequiredResult` only exists at protocol version **2026-07-28**. The in-memory `Client(server)` negotiates it for you; over the wire, `mode="auto"` discovers it. After connecting, `client.protocol_version` tells you what you got. !!! warning A pre-2026 session has nowhere to put an `InputRequiredResult`. Return one from your handler on a `mode="legacy"` connection and the runner cannot serialize it into the negotiated version; the client gets back a `-32603` *"Handler returned an invalid result"* error. A server that serves both eras must check `ctx.protocol_version` before reaching for it. !!! info **URL-mode elicitation** rides this exact mechanism on a 2026 connection. 
The entry in `input_requests` is an `ElicitRequest` whose params are `ElicitRequestURLParams`; the user finishes the out-of-band flow and your client retries the call. Same loop, no new API. The high-level server half is in **[Elicitation](https://py.sdk.modelcontextprotocol.io/v2/tutorial/elicitation/index.md)**. ## Recap * At 2026-07-28 a server that needs input mid-call **returns** an `InputRequiredResult`. It never opens a request to the client. * `input_requests` is what it needs. `request_state` is an opaque resume token only the server reads. * `Client` runs the retry loop for you: register `elicitation_callback` / `sampling_callback` / `list_roots_callback` and `call_tool` returns a plain `CallToolResult`. `input_required_max_rounds` (default 10) bounds it. * To inspect or persist rounds, use `client.session.call_tool(..., allow_input_required=True)` and own the `while isinstance(result, InputRequiredResult)` loop yourself. * On `@mcp.tool()`, a dependency that asks the user produces this result for you (**[Dependencies](https://py.sdk.modelcontextprotocol.io/v2/tutorial/dependencies/index.md)**); the **low-level** `Server` is the manual form. * Prompts and resources participate too: an `@mcp.prompt()` or template `@mcp.resource()` function returns the `InputRequiredResult` itself and reads `ctx.input_responses` on the retry. * `requestState` comes back as client-supplied input, so `MCPServer` seals it by default — resolver state and hand-built state alike — under a process-local key; multi-instance deployments pass `RequestStateSecurity(keys=[...])` (or a custom codec) so every instance can verify what a sibling minted. The seal binds every token to a time window, the originating request, and the authenticated principal when the request carries auth the SDK validated or `bind_principal=` supplies your own identity signal (**[Protecting `requestState`](#protecting-requeststate)**). 
This is the mechanism that replaces server-initiated sampling and the rest of the push-style back-channel; see **[Deprecated features](https://py.sdk.modelcontextprotocol.io/v2/advanced/deprecated/index.md)**. # The low-level Server Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/low-level-server/ `@mcp.tool()` is a layer. Underneath it is a second server class, `Server`, that speaks raw MCP: you hand it the protocol objects and it puts them on the wire, unchanged. `MCPServer` is built on top of it. You drop down when the convenience layer is in the way: * You need to emit an **exact** schema (loaded from a file, generated from a database), not one derived from a Python signature. * You need full control of the result: `_meta`, `is_error`, every key of `structured_content`. * You need to handle a method MCP doesn't define. For everything else, stay on `MCPServer`. ## The same tool, by hand This is `search_books` from **[Tools](https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/index.md)** (the nine-line `@mcp.tool()` file) with the sugar removed: ```python title="server.py" hl_lines="23 27 33" # docs_src/lowlevel/tutorial001.py from mcp_types import ( CallToolRequestParams, CallToolResult, ListToolsResult, PaginatedRequestParams, TextContent, Tool, ) from mcp.server import Server, ServerRequestContext SEARCH_BOOKS = Tool( name="search_books", description="Search the catalog by title or author.", input_schema={ "type": "object", "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}}, "required": ["query", "limit"], }, ) async def list_tools(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult: return ListToolsResult(tools=[SEARCH_BOOKS]) async def call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult: args = params.arguments or {} text = f"Found 3 books matching {args['query']!r} (showing up to {args['limit']})." 
return CallToolResult(content=[TextContent(type="text", text=text)]) server = Server("Bookshop", on_list_tools=list_tools, on_call_tool=call_tool) ``` Three things changed, and they are the whole low-level API: * **Handlers are constructor parameters.** `on_list_tools=` and `on_call_tool=` go into `Server(...)`. There are no decorators down here, and every handler has the same shape: `async (ctx, params) -> result`. * **You write the input schema.** `Tool.input_schema` is a plain JSON Schema `dict`. Nobody derives it from type hints, because there are no type hints to derive it from. * **You build the result.** `CallToolResult(content=[TextContent(...)])`, by hand. Nothing is wrapped, converted, or inferred from a return annotation. `params` is the parsed request: `CallToolRequestParams` gives you `.name` and `.arguments`. `ctx` is a `ServerRequestContext`: `ctx.session` for talking back to the client, `ctx.lifespan_context`, `ctx.request_id`, and `ctx.meta`, the request's inbound `_meta`. !!! info If you've used FastAPI, you already know this relationship. `MCPServer` is the decorators-and-type-hints layer; `Server` is the Starlette underneath. They are not rivals: `MCPServer` constructs a `Server` and registers handlers exactly like these on it. ### Try it There is no Inspector for this one: `mcp dev` and `mcp run` only accept an `MCPServer`. The in-memory `Client` doesn't care; it takes a low-level `Server` exactly like it takes an `MCPServer`: ```python title="main.py" import asyncio from mcp import Client from server import server async def main() -> None: async with Client(server) as client: result = await client.call_tool("search_books", {"query": "dune", "limit": 5}) print(result.content) asyncio.run(main()) ``` ```text [TextContent(type='text', text="Found 3 books matching 'dune' (showing up to 5).", annotations=None, meta=None)] ``` The same text the `@mcp.tool()` version produced. Two honest differences: * `result.structured_content` is `None`. 
The high-level server wrapped your `-> str` into `{"result": ...}`; here nobody builds what you didn't build. * `list_tools` returns the schema **you** typed, character for character. The high-level version had `"title": "Query"` on every property and a `"title": "search_booksArguments"` at the root: Pydantic artifacts. Down here, if it's on the wire, you put it there. ## Nothing is checked for you In **[Tools](https://py.sdk.modelcontextprotocol.io/v2/tutorial/tools/index.md)** you saw a bad argument get rejected before your function ran. That was `MCPServer` validating the call against the schema it generated. `Server` does not do that. Your `input_schema` is *advertised* to the client; it is never *applied* to `params.arguments`. !!! check Call `search_books` without `limit` and your `args["limit"]` raises `KeyError`. The client sees: ```text MCPError: Internal server error ``` A JSON-RPC error, code `-32603`, with a deliberately generic message: the SDK won't leak your traceback to a remote caller. The model never finds out what it did wrong, so it can't retry. (In a test, `raise_exceptions=True` surfaces the real exception instead; see **[Testing](https://py.sdk.modelcontextprotocol.io/v2/tutorial/testing/index.md)**.) That generalises. An exception raised from a low-level handler is **always** a protocol error, never an `is_error=True` tool result. If you want the model to read the failure and recover, validate `params.arguments` yourself and return `CallToolResult(content=[TextContent(...)], is_error=True)`. The two kinds of failure are the subject of **[Handling errors](https://py.sdk.modelcontextprotocol.io/v2/tutorial/handling-errors/index.md)**. ## Two tools, one handler `on_call_tool` is the single entry point for every tool on the server. 
You route on `params.name`: ```python title="server.py" hl_lines="39-44" # docs_src/lowlevel/tutorial002.py from mcp_types import ( CallToolRequestParams, CallToolResult, ListToolsResult, PaginatedRequestParams, TextContent, Tool, ) from mcp.server import Server, ServerRequestContext SEARCH_BOOKS = Tool( name="search_books", description="Search the catalog by title or author.", input_schema={ "type": "object", "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}}, "required": ["query", "limit"], }, ) ADD_BOOK = Tool( name="add_book", description="Add a book to the catalog.", input_schema={ "type": "object", "properties": {"title": {"type": "string"}, "author": {"type": "string"}, "year": {"type": "integer"}}, "required": ["title", "author", "year"], }, ) async def list_tools(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult: return ListToolsResult(tools=[SEARCH_BOOKS, ADD_BOOK]) async def call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult: args = params.arguments or {} if params.name == "search_books": text = f"Found 3 books matching {args['query']!r} (showing up to {args['limit']})." elif params.name == "add_book": text = f"Added {args['title']!r} by {args['author']} ({args['year']})." else: raise ValueError(f"Unknown tool: {params.name}") return CallToolResult(content=[TextContent(type="text", text=text)]) server = Server("Bookshop", on_list_tools=list_tools, on_call_tool=call_tool) ``` * `list_tools` advertises both. `call_tool` dispatches on the name. * The `else` branch matters: `Server` will happily forward a `tools/call` for a name you never listed straight into your handler. Raising there turns the call into the same `-32603` as above. ## Structured output, by hand Declare `output_schema` on the `Tool` and put `structured_content` on the result. 
Both are yours: ```python title="server.py" hl_lines="20-24 37" # docs_src/lowlevel/tutorial003.py from mcp_types import ( CallToolRequestParams, CallToolResult, ListToolsResult, PaginatedRequestParams, TextContent, Tool, ) from mcp.server import Server, ServerRequestContext SEARCH_BOOKS = Tool( name="search_books", description="Search the catalog by title or author.", input_schema={ "type": "object", "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}}, "required": ["query", "limit"], }, output_schema={ "type": "object", "properties": {"matches": {"type": "integer"}, "query": {"type": "string"}}, "required": ["matches", "query"], }, ) async def list_tools(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult: return ListToolsResult(tools=[SEARCH_BOOKS]) async def call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult: args = params.arguments or {} data = {"matches": 3, "query": args["query"]} return CallToolResult( content=[TextContent(type="text", text=f"Found 3 books matching {args['query']!r}.")], structured_content=data, ) server = Server("Bookshop", on_list_tools=list_tools, on_call_tool=call_tool) ``` Call it and the result carries both representations: ```json { "content": [{"type": "text", "text": "Found 3 books matching 'dune'."}], "structuredContent": {"matches": 3, "query": "dune"}, "isError": false, "resultType": "complete" } ``` The server never compares the two fields. This SDK's `Client` does: return `structured_content` that doesn't satisfy the `output_schema` you declared and `call_tool` raises a `RuntimeError` that starts with `Invalid structured content returned by tool search_books` and goes on to quote the `jsonschema` failure. Promising a schema is cheap; keeping it is on you. The whole ladder of return types and schemas is in **[Structured Output](https://py.sdk.modelcontextprotocol.io/v2/tutorial/structured-output/index.md)**. 
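Because the server never compares the two fields, a handler that wants to keep its promise can check the data before returning it. The sketch below is a deliberately tiny stand-in for a real JSON Schema validator: `schema_problems` and `JSON_TYPES` are hypothetical names, and only the `required` keyword and primitive `type` keywords are understood.

```python
# JSON Schema primitive type names this sketch understands, mapped to Python types.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}

OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {"matches": {"type": "integer"}, "query": {"type": "string"}},
    "required": ["matches", "query"],
}


def schema_problems(data: dict, schema: dict) -> list[str]:
    """Return readable problems; an empty list means the data passed this limited check."""
    problems = [f"missing required key {key!r}" for key in schema.get("required", []) if key not in data]
    for key, rule in schema.get("properties", {}).items():
        if key not in data or "type" not in rule:
            continue
        value = data[key]
        # bool is a subclass of int in Python, but JSON Schema treats it as a separate type.
        bool_mismatch = isinstance(value, bool) and rule["type"] != "boolean"
        if bool_mismatch or not isinstance(value, JSON_TYPES[rule["type"]]):
            problems.append(f"{key!r} should be {rule['type']}")
    return problems


print(schema_problems({"matches": 3, "query": "dune"}, OUTPUT_SCHEMA))
print(schema_problems({"matches": "3"}, OUTPUT_SCHEMA))
```

A handler could call something like this on `data` just before building the `CallToolResult` and decide for itself what a non-empty list means: an `is_error=True` result the model can read, or an exception that surfaces as a protocol error.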
## `_meta`: for the application, not the model `content` is the part of the answer the model reads. `structured_content` is the same answer as typed data. `_meta` is the third channel: data that rides along with the result for the **client application**, without being part of the answer at all. Use it for record IDs, trace IDs, anything your UI needs and your prompt doesn't: ```python title="server.py" hl_lines="38" # docs_src/lowlevel/tutorial004.py from mcp_types import ( CallToolRequestParams, CallToolResult, ListToolsResult, PaginatedRequestParams, TextContent, Tool, ) from mcp.server import Server, ServerRequestContext SEARCH_BOOKS = Tool( name="search_books", description="Search the catalog by title or author.", input_schema={ "type": "object", "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}}, "required": ["query", "limit"], }, output_schema={ "type": "object", "properties": {"matches": {"type": "integer"}, "query": {"type": "string"}}, "required": ["matches", "query"], }, ) async def list_tools(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult: return ListToolsResult(tools=[SEARCH_BOOKS]) async def call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult: args = params.arguments or {} data = {"matches": 3, "query": args["query"]} return CallToolResult( content=[TextContent(type="text", text=f"Found 3 books matching {args['query']!r}.")], structured_content=data, _meta={"bookshop/record_ids": ["bk_17", "bk_42", "bk_99"]}, ) server = Server("Bookshop", on_list_tools=list_tools, on_call_tool=call_tool) ``` * You construct it as `_meta=`, the wire name. The client reads it back as `result.meta`. * Namespace your keys (`bookshop/record_ids`). The `io.modelcontextprotocol/*` keys are reserved by the protocol. !!! warning `_meta` is a convention between you and the client application, not a guarantee about what reaches the model. The host decides what it renders. 
Never put a secret in any part of a tool result. ## Capabilities follow your handlers A `Server` advertises exactly the method families you gave it handlers for. The `Bookshop` above passes `on_list_tools` and `on_call_tool` and nothing else, so a client connecting to it sees: ```json {"tools": {"listChanged": false}} ``` No `resources`, no `prompts`: there is nothing to back them. Pass `on_list_prompts` and `prompts` appears; pass `on_completion` and `completions` appears. `MCPServer` always advertises tools, resources and prompts, whether you registered any or not, because its managers always exist. Down here the declaration *is* the constructor call. ## The lifespan generic `Server` is generic in the type its lifespan yields. Annotate it once and the object is typed everywhere it surfaces: ```python title="server.py" hl_lines="25-27 45-46 51" # docs_src/lowlevel/tutorial005.py from collections.abc import AsyncIterator from contextlib import asynccontextmanager from dataclasses import dataclass from mcp_types import ( CallToolRequestParams, CallToolResult, ListToolsResult, PaginatedRequestParams, TextContent, Tool, ) from mcp.server import Server, ServerRequestContext @dataclass class Catalog: books: list[str] def search(self, query: str) -> list[str]: return [title for title in self.books if query.lower() in title.lower()] @asynccontextmanager async def lifespan(server: Server[Catalog]) -> AsyncIterator[Catalog]: yield Catalog(books=["Dune", "Dune Messiah", "Children of Dune"]) SEARCH_BOOKS = Tool( name="search_books", description="Search the catalog by title or author.", input_schema={ "type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"], }, ) async def list_tools(ctx: ServerRequestContext[Catalog], params: PaginatedRequestParams | None) -> ListToolsResult: return ListToolsResult(tools=[SEARCH_BOOKS]) async def call_tool(ctx: ServerRequestContext[Catalog], params: CallToolRequestParams) -> CallToolResult: matches = 
ctx.lifespan_context.search((params.arguments or {})["query"]) text = f"Found {len(matches)} books: {', '.join(matches)}." return CallToolResult(content=[TextContent(type="text", text=text)]) server = Server("Bookshop", lifespan=lifespan, on_list_tools=list_tools, on_call_tool=call_tool) ``` * The lifespan is a `Callable[[Server[Catalog]], AbstractAsyncContextManager[Catalog]]`; `@asynccontextmanager` on an `async` generator gives you exactly that. * Whatever it `yield`s becomes `ctx.lifespan_context`, and because the handlers are annotated `ServerRequestContext[Catalog]`, `.search(...)` autocompletes and type-checks. * It is entered once when the server starts and exited once when it stops. Startup, teardown, and `MCPServer`'s version of the same idea are in **[Lifespan](https://py.sdk.modelcontextprotocol.io/v2/tutorial/lifespan/index.md)**. Without a `lifespan=`, `ctx.lifespan_context` is an empty `dict`. ## A method of your own The constructor covers the methods MCP defines. `add_request_handler` covers everything else: ```python title="server.py" hl_lines="35-36 39-40 43-44 48" # docs_src/lowlevel/tutorial006.py from mcp_types import ( CallToolRequestParams, CallToolResult, ListToolsResult, PaginatedRequestParams, RequestParams, TextContent, Tool, ) from pydantic import BaseModel from mcp.server import Server, ServerRequestContext SEARCH_BOOKS = Tool( name="search_books", description="Search the catalog by title or author.", input_schema={ "type": "object", "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}}, "required": ["query", "limit"], }, ) async def list_tools(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult: return ListToolsResult(tools=[SEARCH_BOOKS]) async def call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult: args = params.arguments or {} text = f"Found 3 books matching {args['query']!r} (showing up to {args['limit']})." 
return CallToolResult(content=[TextContent(type="text", text=text)]) class ReindexParams(RequestParams): full: bool = False class ReindexResult(BaseModel): indexed: int async def reindex(ctx: ServerRequestContext, params: ReindexParams) -> ReindexResult: return ReindexResult(indexed=3) server = Server("Bookshop", on_list_tools=list_tools, on_call_tool=call_tool) server.add_request_handler("bookshop/reindex", ReindexParams, reindex) ``` * The first argument is the method string. Notifications have a twin, `add_notification_handler`. * `params_type` is the model the incoming `params` are validated against **before** your handler runs, so custom methods *do* get the validation tools don't. Subclass `RequestParams` so the `_meta` field parses like every other method's. * The handler returns a `BaseModel`, a `dict`, or `None`. The SDK serialises it into the JSON-RPC result. One honest caveat: the high-level `Client` only has verbs for the methods MCP defines, so there is no `client.reindex()`. A vendor method is for a peer that already knows it exists: a client you also ship, or another service of yours speaking JSON-RPC. One method you cannot claim: ```text ValueError: 'initialize' is handled by the server runner and cannot be overridden; use Server.middleware to observe or wrap initialization ``` The handshake belongs to the runner. `server/discover`, `ping`, and every other built-in are yours to replace. !!! tip `Server.middleware`, mentioned in that error, wraps **every** inbound message, including `initialize`. If what you want is to observe or rewrite traffic rather than answer a new method, start at **[Middleware](https://py.sdk.modelcontextprotocol.io/v2/advanced/middleware/index.md)**. ## The other handlers Each of these is one idea you now have the vocabulary for; each has its own chapter. 
* `on_call_tool`, `on_get_prompt`, and `on_read_resource` may return an `InputRequiredResult` instead of their normal result to pause the call and ask the client for input; see **[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**. True to this tier, nothing is installed for you: where `MCPServer` seals `requestState` by default, here the `request_state` you set crosses the wire exactly as written until you opt in with `server.middleware.append(RequestStateBoundary(RequestStateSecurity(keys=[...]), default_audience=server.name))`: one line (both names import from `mcp.server.request_state`) for the identical sealing and verification `MCPServer` performs (**[Protecting `requestState`](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md#protecting-requeststate)**). * `on_list_resources`, `on_read_resource`, `on_list_prompts`, `on_get_prompt`, `on_completion` are the same `(ctx, params) -> result` shape for the other primitives. * `server.streamable_http_app()` returns the same Starlette app `MCPServer`'s does; deploy it the way **[Running your server](https://py.sdk.modelcontextprotocol.io/v2/run/index.md)** deploys any other ASGI app. There is no `server.run(transport=...)` down here: `server.run(read_stream, write_stream, server.create_initialization_options())` drives one connection over a pair of streams, and that one line is the whole story. ## Recap * The low-level `Server` takes its handlers as `on_*` **constructor parameters**; every handler is `async (ctx, params) -> result`. * You write the `input_schema` dict and you build the `CallToolResult`. Nothing is derived, wrapped, or validated for you. * An exception in a handler is a `-32603` protocol error. A tool error the model can read is a `CallToolResult` with `is_error=True` that **you** return. * `_meta` on the result is addressed to the client application, not the model. 
* `Server[T]` is generic in what its lifespan yields; `ctx.lifespan_context` is a typed `T`. * `add_request_handler(method, params_type, handler)` serves any method. `initialize` is reserved. * The capabilities a `Server` advertises are derived from which handlers you registered. `Client(server)` treated both servers identically because they *are* the same protocol, which is the whole point. The next layer down isn't a class at all: it's **[Middleware](https://py.sdk.modelcontextprotocol.io/v2/advanced/middleware/index.md)**. # URI templates Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/uri-templates/ This is the reference for the URI-template syntax that [`@mcp.resource`](https://py.sdk.modelcontextprotocol.io/v2/tutorial/resources/index.md) accepts, and for the path-safety policy the SDK applies to extracted values. For an introduction to what resources are and when to use them, start with **[Resources](https://py.sdk.modelcontextprotocol.io/v2/tutorial/resources/index.md)**; this page assumes you're already comfortable declaring a resource and want the full operator set, the security knobs, or the low-level wiring. The template syntax is [RFC 6570](https://datatracker.ietf.org/doc/html/rfc6570). The SDK supports a subset chosen for matching incoming `resources/read` URIs, plus a security layer that rejects values that would resolve outside the directory you intend to serve. For the protocol-level details (message formats, lifecycle, pagination) see the [MCP resources specification](https://modelcontextprotocol.io/specification/latest/server/resources). ## The full operator set **[Resources](https://py.sdk.modelcontextprotocol.io/v2/tutorial/resources/index.md)** showed one placeholder, `{user_id}`. 
There are four more operator forms; here they are on one server so you can see them next to each other: ```python title="server.py" hl_lines="16-17 22-23 28-29 34-35 40-41" # docs_src/uri_templates/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Bookshop") BOOKS = { "978-0441172719": {"title": "Dune", "author": "Frank Herbert"}, "978-0553293357": {"title": "Foundation", "author": "Isaac Asimov"}, } MANUALS = { "printing/setup.md": "# Printer setup\n\nLoad paper, then power on.", "returns.md": "# Returns policy\n\nThirty days with a receipt.", } @mcp.resource("books://{isbn}") def get_book(isbn: str) -> dict[str, str]: """A single book by ISBN.""" return BOOKS[isbn] @mcp.resource("orders://{order_id}") def get_order(order_id: int) -> dict[str, object]: """An order by its numeric id.""" return {"order_id": order_id, "next_order": order_id + 1, "status": "shipped"} @mcp.resource("manuals://{+path}") def read_manual(path: str) -> str: """A staff manual page. The path keeps its slashes.""" return MANUALS[path] @mcp.resource("reviews://{isbn}{?limit,sort}") def list_reviews(isbn: str, limit: int = 10, sort: str = "newest") -> str: """Reviews of a book, optionally limited and sorted.""" return f"{limit} {sort} reviews of {BOOKS[isbn]['title']}" @mcp.resource("shelves://browse{/path*}") def browse_shelf(path: list[str]) -> str: """A shelf in the category tree, addressed by segments.""" return " > ".join(["catalog", *path]) ``` Each highlighted decorator is a different way of carving up the URI. The sections below walk them top to bottom. ### Simple expansion: `{name}` `books://{isbn}` is the form you already know. The placeholder maps to the `isbn` parameter, so a client reading `books://978-0441172719` calls `get_book("978-0441172719")`. A plain `{name}` stops at the first `/`. `books://978/extra` does not match because the slash after `978` ends the capture and `/extra` is left over. 
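The "stops at the first `/`" rule is easy to picture as a regular expression. This is a rough model of the behaviour described above, assuming each `{name}` behaves like the pattern `[^/]+`; it is not the SDK's matcher, and `compile_simple_template` and `match_simple` are hypothetical helpers.

```python
import re
from typing import Optional


def compile_simple_template(template: str) -> "re.Pattern[str]":
    """Turn 'books://{isbn}' into a regex where each {name} captures up to the next '/'."""
    # re.split with one capture group alternates literal text and variable names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if index % 2 == 0 else f"(?P<{part}>[^/]+)"
        for index, part in enumerate(parts)
    )
    return re.compile(pattern)


def match_simple(template: str, uri: str) -> Optional[dict]:
    """Return the captured variables, or None when the whole URI does not match."""
    found = compile_simple_template(template).fullmatch(uri)
    return found.groupdict() if found else None


print(match_simple("books://{isbn}", "books://978-0441172719"))
print(match_simple("books://{isbn}", "books://978/extra"))
```

The first call captures the ISBN; the second returns `None`, because the capture ends at the slash and `fullmatch` refuses to leave `/extra` unconsumed.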
### Type conversion

Extracted values arrive as strings, but you can declare a more specific type and the SDK will convert. `orders://{order_id}` lands in a function whose parameter is `order_id: int`, so reading `orders://12345` calls `get_order(12345)`, not `get_order("12345")`. The handler does arithmetic on it (`order_id + 1`) without a cast.

### Multi-segment paths: `{+name}`

To capture a value that contains slashes, use `{+name}`. With `manuals://{+path}`:

* `manuals://returns.md` gives `path = "returns.md"`
* `manuals://printing/setup.md` gives `path = "printing/setup.md"`

Reach for `{+name}` whenever the value is hierarchical: filesystem paths, nested object keys, URL paths you're proxying.

### Query parameters: `{?a,b,c}`

`reviews://{isbn}{?limit,sort}` puts `limit` and `sort` after the `?`. The path identifies *which* book; the query tunes *how* you read it.

Query params are matched leniently: order doesn't matter, extras are ignored, and omitted params fall through to your function defaults. So `reviews://978-0441172719` uses `limit=10, sort="newest"`, and `reviews://978-0441172719?sort=top` overrides only `sort`.

### Path segments as a list: `{/name*}`

If you want each path segment as a separate list item rather than one string with slashes, use `{/name*}`. With `shelves://browse{/path*}`, a client reading `shelves://browse/fiction/sci-fi` calls `browse_shelf(["fiction", "sci-fi"])`.
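The lenient query matching described under *Query parameters* can be sketched with the standard library alone. This is a simplified model under stated assumptions, not the SDK's code: `lenient_query_args` and `read_reviews` are hypothetical names, the `reviews://` prefix is stripped by hand, and the handler echoes the ISBN instead of looking up a title.

```python
from urllib.parse import parse_qsl


def lenient_query_args(uri: str, allowed: tuple) -> dict:
    """Hypothetical helper: keep only declared query variables, in any order."""
    _, _, query = uri.partition("?")
    supplied = dict(parse_qsl(query))
    # Extras are dropped; omitted names are simply absent from the result.
    return {name: supplied[name] for name in allowed if name in supplied}


def list_reviews(isbn: str, limit: int = 10, sort: str = "newest") -> str:
    return f"{limit} {sort} reviews of {isbn}"


def read_reviews(uri: str) -> str:
    """Route a reviews:// URI to the handler, letting defaults fill the gaps."""
    path, _, _ = uri.partition("?")
    isbn = path.removeprefix("reviews://")
    kwargs = lenient_query_args(uri, ("limit", "sort"))
    if "limit" in kwargs:
        kwargs["limit"] = int(kwargs["limit"])  # mirror the int annotation
    return list_reviews(isbn, **kwargs)


print(read_reviews("reviews://978-0441172719"))
print(read_reviews("reviews://978-0441172719?sort=top&extra=1&limit=3"))
```

Because missing names never reach `kwargs`, the Python defaults apply on their own, which is why every query-bound parameter needs one.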
### Template reference

The most common patterns:

| Pattern      | Example input   | You get                   |
|--------------|-----------------|---------------------------|
| `{name}`     | `alice`         | `"alice"`                 |
| `{name}`     | `docs/intro.md` | *no match* (stops at `/`) |
| `{+path}`    | `docs/intro.md` | `"docs/intro.md"`         |
| `{.ext}`     | `.json`         | `"json"`                  |
| `{/segment}` | `/v2`           | `"v2"`                    |
| `{?key}`     | `?key=value`    | `"value"`                 |
| `{?a,b}`     | `?a=1&b=2`      | `"1"`, `"2"`              |
| `{/path*}`   | `/a/b/c`        | `["a", "b", "c"]`         |

### What the parser rejects

A few template shapes are caught up front rather than failing on the first request. `@mcp.resource` parses the template when the decorator runs, so none of these ever reach a running server. `UriTemplate.parse()` raises `InvalidUriTemplate` for:

* **Two variables with nothing between them.** `manuals://{+path}{ext}` is rejected: matching can't tell where `path` ends and `ext` begins. Put a literal between them (`manuals://{+path}/{ext}`), or use an operator that supplies its own delimiter. `manuals://{+path}{.ext}` is accepted because `{.ext}` contributes the `.` itself.
* **More than one multi-segment variable.** At most one of `{+var}`, `{#var}`, or an exploded variable (`{/var*}`, `{.var*}`, `{;var*}`) per template. Two are inherently ambiguous: there is no principled way to decide which one absorbs an extra segment.
* **The usual syntax errors**: an unclosed brace, a variable name used twice, or an RFC 6570 feature the SDK doesn't support, such as the `{var:3}` prefix modifier or the `{?vars*}` query explode.

On top of that, `@mcp.resource` raises `ValueError` when a handler parameter is bound to a query variable in the template's trailing `{?...}`/`{&...}` run but has no Python default. Those variables are matched leniently (a client may leave any of them out), so a parameter without a default would only surface as an opaque internal error on the first request that omits it.
`reviews://{isbn}{?limit,sort}` in the server above is the well-formed version: `limit` and `sort` both carry defaults.

## Security

Template parameters come from the client. If they flow into filesystem or database operations unchecked, values like `../../etc/passwd` can resolve outside the directory you intended to serve.

### What the SDK checks by default

Before your handler runs, the SDK rejects any parameter that:

* would escape its starting directory via `..` components
* looks like an absolute path (`/etc/passwd`, `C:\Windows`) or a Windows drive-relative one (`C:foo`). A drive-relative value and a namespaced identifier like `x:y` are indistinguishable as strings, so any single-letter-plus-colon value is rejected by default; exempt the parameter if it legitimately receives such values
* contains a null byte (`\x00`)

The `..` check is component-based, not a substring scan. Values like `v1.0..v2.0` or `HEAD~3..HEAD` pass because `..` is not a standalone path segment there.

These checks apply to the decoded value, so they catch traversal regardless of how it was encoded in the URI (`../etc`, `..%2Fetc`, `%2E%2E/etc`, `..%5Cetc`, `%00` all get caught).

!!! check
    Read `manuals://../etc/passwd` from the server above and the request is rejected outright: template matching stops at the first failure, so no later (potentially more permissive) template is tried as a fallback. The client sees the same `-32602` "Unknown resource" error it would for a URI that matches no template at all, and `read_manual` never runs.

### Filesystem handlers: use safe_join

The built-in checks stop the common cases but can't know your sandbox boundary.
For filesystem access, use `safe_join` to resolve the path and verify it stays inside your base directory:

```python title="server.py" hl_lines="4 14"
# docs_src/uri_templates/tutorial002.py
from pathlib import Path

from mcp.server import MCPServer
from mcp.shared.path_security import safe_join

mcp = MCPServer("Bookshop")

DOCS_ROOT = Path("./manuals")


@mcp.resource("manuals://{+path}")
def read_manual(path: str) -> str:
    """A staff manual page, served from a directory on disk."""
    return safe_join(DOCS_ROOT, path).read_text()
```

`safe_join` catches symlink escapes, `..` sequences, and absolute-path tricks that a simple string check would miss. If the resolved path escapes `DOCS_ROOT`, it raises `PathEscapeError`, which surfaces to the client as a `ResourceError`.

### When the defaults get in the way

Sometimes the checks block legitimate values. A catalog-import tool might intentionally receive an absolute path, or a parameter might be a relative reference like `../sibling` that your handler interprets safely without touching the filesystem. Exempt that parameter, or relax the policy for the whole server:

```python title="server.py" hl_lines="9 16-19"
# docs_src/uri_templates/tutorial003.py
from mcp.server import MCPServer
from mcp.server.mcpserver import ResourceSecurity

mcp = MCPServer("Bookshop")


@mcp.resource(
    "imports://preview/{+source}",
    security=ResourceSecurity(exempt_params={"source"}),
)
def preview_import(source: str) -> str:
    """Preview a catalog import. `source` may be an absolute path."""
    return f"Would import from {source}"


relaxed = MCPServer(
    "Bookshop",
    resource_security=ResourceSecurity(reject_path_traversal=False),
)


@relaxed.resource("imports://preview/{+source}")
def preview_import_relaxed(source: str) -> str:
    """The server-wide flag exempts every resource on `relaxed`."""
    return f"Would import from {source}"
```

* `security=ResourceSecurity(exempt_params={"source"})` on the decorator skips the checks for that one parameter on that one resource.
  The rest of the server keeps the default policy.
* `resource_security=` on the `MCPServer` constructor sets the default for every resource. Here `relaxed` turns off the `..` check entirely.

The configurable checks:

| Setting                 | Default | What it does                                                                          |
|-------------------------|---------|---------------------------------------------------------------------------------------|
| `reject_path_traversal` | `True`  | Rejects `..` sequences that escape the starting directory                             |
| `reject_absolute_paths` | `True`  | Rejects `/foo`, `C:\foo`, UNC paths, and drive-relative `C:foo` (also catches `x:y`) |
| `reject_null_bytes`     | `True`  | Rejects values containing `\x00`                                                      |
| `exempt_params`         | empty   | Parameter names to skip checks for                                                    |

These checks are a heuristic pre-filter; for filesystem access, `safe_join` remains the containment boundary.

!!! tip
    If your handler can't fulfil the request (the file doesn't exist, the id is unknown), raise an exception. The SDK turns it into an error response. See **[Handling errors](https://py.sdk.modelcontextprotocol.io/v2/tutorial/handling-errors/index.md)** for the difference between a protocol error and a tool error.

## Resources on the low-level Server

If you're building on the low-level `Server` (see **[The low-level Server](https://py.sdk.modelcontextprotocol.io/v2/advanced/low-level-server/index.md)**), you register handlers for the `resources/list` and `resources/read` protocol methods directly. There's no decorator; you return the protocol types yourself.
### Static resources

For fixed URIs, keep a registry and dispatch on exact match:

```python title="server.py" hl_lines="18 22 28"
# docs_src/uri_templates/tutorial004.py
from mcp_types import (
    ListResourcesResult,
    PaginatedRequestParams,
    ReadResourceRequestParams,
    ReadResourceResult,
    Resource,
    TextResourceContents,
)

from mcp.server import Server, ServerRequestContext

RESOURCES = {
    "config://shop": '{"currency": "USD", "tax_rate": 0.08}',
    "status://health": "ok",
}


async def list_resources(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListResourcesResult:
    return ListResourcesResult(resources=[Resource(name=uri, uri=uri) for uri in RESOURCES])


async def read_resource(ctx: ServerRequestContext, params: ReadResourceRequestParams) -> ReadResourceResult:
    if (text := RESOURCES.get(params.uri)) is not None:
        return ReadResourceResult(contents=[TextResourceContents(uri=params.uri, text=text)])
    raise ValueError(f"Unknown resource: {params.uri}")


server = Server("Bookshop", on_list_resources=list_resources, on_read_resource=read_resource)
```

The list handler tells clients what's available; the read handler serves the content. Check your registry first, fall through to templates (below) if you have any, then raise for anything else.

### Templates

The template engine `MCPServer` uses lives in `mcp.shared.uri_template` and works on its own. You get the same parsing and matching; you wire up the routing and security policy yourself.
```python title="server.py" hl_lines="14-17 23-26 30 34 46"
# docs_src/uri_templates/tutorial005.py
from mcp_types import (
    ListResourceTemplatesResult,
    PaginatedRequestParams,
    ReadResourceRequestParams,
    ReadResourceResult,
    ResourceTemplate,
    TextResourceContents,
)

from mcp.server import Server, ServerRequestContext
from mcp.shared.path_security import contains_path_traversal, is_absolute_path
from mcp.shared.uri_template import UriTemplate

TEMPLATES = {
    "manuals": UriTemplate.parse("manuals://{+path}"),
    "books": UriTemplate.parse("books://{isbn}"),
}

MANUALS = {"printing/setup.md": "# Printer setup", "returns.md": "# Returns policy"}
BOOKS = {"978-0441172719": "Dune by Frank Herbert"}


def read_manual_safely(path: str) -> str:
    if contains_path_traversal(path) or is_absolute_path(path):
        raise ValueError("rejected")
    return MANUALS[path]


async def read_resource(ctx: ServerRequestContext, params: ReadResourceRequestParams) -> ReadResourceResult:
    if (matched := TEMPLATES["manuals"].match(params.uri)) is not None:
        text = read_manual_safely(str(matched["path"]))
        return ReadResourceResult(contents=[TextResourceContents(uri=params.uri, text=text)])

    if (matched := TEMPLATES["books"].match(params.uri)) is not None:
        text = BOOKS[str(matched["isbn"])]
        return ReadResourceResult(contents=[TextResourceContents(uri=params.uri, text=text)])

    raise ValueError(f"Unknown resource: {params.uri}")


async def list_resource_templates(
    ctx: ServerRequestContext, params: PaginatedRequestParams | None
) -> ListResourceTemplatesResult:
    return ListResourceTemplatesResult(
        resource_templates=[
            ResourceTemplate(name=name, uri_template=str(template)) for name, template in TEMPLATES.items()
        ]
    )


server = Server(
    "Bookshop",
    on_read_resource=read_resource,
    on_list_resource_templates=list_resource_templates,
)
```

Three things are happening in the highlighted lines:

* **Parse once, match per request.** `UriTemplate.parse()` builds the template; `template.match(uri)` returns the extracted variables as a
  `dict`, or `None` if the URI doesn't fit. URL decoding happens inside `match()`; the decoded values are returned as-is without path-safety validation. Values come out as strings: convert them yourself (`int(matched["id"])`, `Path(matched["path"])`).
* **Apply the safety checks yourself.** The `..` and absolute-path checks `MCPServer` runs by default live in `mcp.shared.path_security`. `read_manual_safely` calls them before touching `MANUALS`. If a parameter isn't a filesystem path (an ISBN, a search query), skip the checks for that value: you control the policy per handler rather than through a config object.
* **List the templates from the same source.** Clients discover templates through `resources/templates/list`. `str(template)` gives back the original template string, so the listing and the matcher share one source of truth.

## Recap

* `{name}` matches one segment; `{+name}` keeps the slashes; `{?a,b}` pulls from the query string; `{/name*}` splits segments into a list.
* Two variables with nothing between them, or a second multi-segment variable, are rejected at parse time. A parameter bound to a trailing `{?...}`/`{&...}` query variable must declare a Python default.
* Annotate the parameter (`order_id: int`) and the SDK converts.
* The default security policy rejects `..`, absolute paths, and null bytes before your handler runs; override per resource with `security=ResourceSecurity(...)` or server-wide with `resource_security=`.
* For filesystem access, `safe_join` is the containment boundary.
* On the low-level `Server`, parse with `UriTemplate.parse()`, match with `.match()`, and apply `mcp.shared.path_security` yourself.

# Pagination

Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/pagination/

Most servers never need this. `MCPServer` answers every `list_*` request with everything it has, in one page, `next_cursor=None`. For a few dozen tools, resources or prompts that is the right answer and there is nothing to configure.
Pagination is for the server whose resource list is really a database: thousands of rows it refuses to serialize in one response. The protocol's answer is a **cursor**: the server returns a page plus an opaque token, and the client sends that token back to get the next page.

`@mcp.resource()` has no hook for any of that. To page, you write the list handler yourself, on the **[low-level Server](https://py.sdk.modelcontextprotocol.io/v2/advanced/low-level-server/index.md)**.

## A server that pages

```python title="server.py" hl_lines="13 16-17"
# docs_src/pagination/tutorial001.py
from typing import Any

from mcp_types import ListResourcesResult, PaginatedRequestParams, Resource

from mcp.server import Server, ServerRequestContext

BOOKS = [f"book-{n}" for n in range(1, 101)]
PAGE_SIZE = 10


async def list_books(ctx: ServerRequestContext[Any], params: PaginatedRequestParams | None) -> ListResourcesResult:
    start = 0 if params is None or params.cursor is None else int(params.cursor)
    end = start + PAGE_SIZE
    page = [Resource(uri=f"books://catalog/{name}", name=name) for name in BOOKS[start:end]]
    next_cursor = str(end) if end < len(BOOKS) else None
    return ListResourcesResult(resources=page, next_cursor=next_cursor)


server = Server("Bookshop", on_list_resources=list_books)
```

* On a low-level `Server`, handlers are constructor arguments, not decorators. `on_list_resources` answers every `resources/list` request; that's the whole hookup.
* Every paged handler is typed `params: PaginatedRequestParams | None`, and the example accepts both. Over a connection, though, the SDK never hands you `None` (a request with no `params` member reaches the handler as the model with its defaults), so the signal that matters is `params.cursor is None`: **start from the top**.
* You decide what a cursor *is*. Here it's an offset rendered as a string. A timestamp, a primary key, a base64 blob: anything you can mint on the way out and recognise on the way back in.
* `next_cursor=None` is how you say "that was the last page". There is no count, no total, no `has_more`. `None` is the entire signal.

!!! tip
    A `PAGE_SIZE` of 10 makes the example readable. Pick yours per endpoint: a list of one-line resources can afford a page of 500; a list of fat prompt templates cannot. The client has no say in it, and that is by design.

### Try it

`Client(server)` connects to a low-level `Server` in memory exactly as it connects to an `MCPServer`. Call `list_resources()` with no arguments. You get ten resources, `book-1` through `book-10`, and `next_cursor` is the string `"10"`.

Hand it back with `list_resources(cursor="10")` and the first resource is `book-11`, the new `next_cursor` is `"20"`. The tenth page comes back with `next_cursor` set to `None`. Done.

## The client loop

Every `list_*` method on `Client` (`list_tools`, `list_resources`, `list_resource_templates`, `list_prompts`) takes a `cursor=` keyword. Draining a paged list is one `while True`:

```python title="client.py" hl_lines="27-33"
# docs_src/pagination/tutorial002.py
from typing import Any

from mcp_types import ListResourcesResult, PaginatedRequestParams, Resource

from mcp import Client
from mcp.server import Server, ServerRequestContext

BOOKS = [f"book-{n}" for n in range(1, 101)]
PAGE_SIZE = 10


async def list_books(ctx: ServerRequestContext[Any], params: PaginatedRequestParams | None) -> ListResourcesResult:
    start = 0 if params is None or params.cursor is None else int(params.cursor)
    end = start + PAGE_SIZE
    page = [Resource(uri=f"books://catalog/{name}", name=name) for name in BOOKS[start:end]]
    next_cursor = str(end) if end < len(BOOKS) else None
    return ListResourcesResult(resources=page, next_cursor=next_cursor)


server = Server("Bookshop", on_list_resources=list_books)


async def main() -> None:
    async with Client(server) as client:
        resources: list[Resource] = []
        cursor: str | None = None
        while True:
            page = await client.list_resources(cursor=cursor)
            resources.extend(page.resources)
            if page.next_cursor is None:
                break
            cursor = page.next_cursor
        print(f"{len(resources)} resources")
```

* `cursor` starts as `None`, so the first request carries no cursor.
* Extend **before** you look at `next_cursor`: the last page has resources too.
* `next_cursor is None` is the exit. Anything else goes straight back into `cursor=`, untouched.

Run its `main()` and it prints `100 resources`: ten pages of ten, stitched together by a loop that never knew there were ten pages.

This is the same loop **[The Client](https://py.sdk.modelcontextprotocol.io/v2/client/index.md)** chapter showed you, and it costs nothing against a server that doesn't page: `next_cursor` is `None` on the first response and the loop runs once.

## The three rules

**Cursors are opaque.** A client must never parse, build, or guess one. The only legal source of a cursor is the previous page's `next_cursor`, verbatim.

**The server picks the page size.** There is no `limit=` in the protocol. If you need a different page size, you change the server.

**A client that ignores paging still works.** It calls `list_resources()` once, gets the first ten, and never notices the `next_cursor` it threw away. Nothing breaks; it sees less.

!!! check
    Opaque means opaque. Invent a cursor (`list_resources(cursor="page-2")`) and there is nothing the protocol can do for you. This server tries `int("page-2")`, the handler raises, and what comes back to the client is:

    ```text
    MCPError(-32603, 'Internal server error', None)
    ```

    A cursor you didn't get from the server is a bug, not a feature request.

## Recap

* `MCPServer` returns everything in one page. Pagination is opt-in, and you opt in on the low-level `Server`.
* `on_list_resources` (and `on_list_tools`, `on_list_prompts`, `on_list_resource_templates`) receives `PaginatedRequestParams | None`; `params.cursor` is `None` for the first page.
* You return a page plus `next_cursor`: any string you'll recognise later, or `None` when there is nothing left.
* The client loop: pass `cursor=`, accumulate, repeat until `next_cursor is None`.
* Cursors are opaque, the server owns the page size, and a non-paging client still gets page one.

The rest of the hand-written `Server` API (`on_call_tool`, `input_schema` dicts, `_meta`) is **[The low-level Server](https://py.sdk.modelcontextprotocol.io/v2/advanced/low-level-server/index.md)**.

# Caching hints

Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/caching/

Every result a server returns for `tools/list`, `prompts/list`, `resources/list`, `resources/templates/list`, `resources/read` and `server/discover` carries two fields on the 2026-07-28 protocol: `ttlMs`, how many milliseconds a client may treat the result as fresh, and `cacheScope`, whether a cached result may be shared across users (`"public"`) or belongs to one authorization context (`"private"`).

The server doesn't cache anything. The fields are a *declaration*: "this tool list is the same for everyone and won't change for a minute." A client (or a gateway in front of you) may then skip the round trip. Honoring the hints is the client's choice; emitting them is the server's job, and the SDK does it for you.

Out of the box every result says `ttlMs: 0, cacheScope: "private"`: immediately stale, never shared. That is always safe and always conformant.
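To make "immediately stale, never shared" concrete, here is a small sketch of how a client might read the two fields. It is an illustration only, not the SDK's cache: `is_fresh` and `may_share` are hypothetical helpers, and treating an entry as stale at exactly `ttlMs` is an assumption of this sketch.

```python
def is_fresh(stored_at_ms: int, ttl_ms: int, now_ms: int) -> bool:
    """Hypothetical check: an entry is fresh while its age is below ttl_ms."""
    return now_ms - stored_at_ms < ttl_ms


def may_share(cache_scope: str) -> bool:
    """Hypothetical check: only a "public" result may be served to other callers."""
    return cache_scope == "public"


# The out-of-the-box hints: ttlMs 0 and cacheScope "private".
print(is_fresh(stored_at_ms=0, ttl_ms=0, now_ms=0))
print(may_share("private"))

# A one-minute public hint stays fresh for a request 59.999 seconds later.
print(is_fresh(stored_at_ms=0, ttl_ms=60_000, now_ms=59_999))
print(may_share("public"))
```

With a TTL of zero no age is ever below it, so the default hints can never cause a stale or cross-user response.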
If your lists really are stable and identical for all callers, say so at construction:

```python title="server.py" hl_lines="5-8"
# docs_src/caching/tutorial001.py
from mcp.server import CacheHint, MCPServer

mcp = MCPServer(
    "Weather",
    cache_hints={
        "tools/list": CacheHint(ttl_ms=60_000, scope="public"),
        "resources/read": CacheHint(ttl_ms=5_000),
    },
)


@mcp.tool()
def forecast(city: str) -> str:
    return f"Sunny in {city}"


@mcp.resource("config://units")
def units() -> str:
    return "metric"
```

* The map is keyed by **method name**, and the six cacheable methods are the only legal keys. The parameter is typed `Mapping[CacheableMethod, CacheHint]`, so your editor autocompletes the keys and flags a typo before you run; anything that slips past the type checker raises at construction.
* A method you don't mention keeps the defaults. The map is a set of overrides, not a manifest.
* `CacheHint(ttl_ms=5_000)` left `scope` unset, so it stays `"private"`: five seconds of freshness, per caller. Scope and TTL are independent decisions.
* `"server/discover"` is a legal key too, since the handshake result is cacheable like any list.

!!! warning
    `cacheScope: "public"` means *anyone* may be served your cached response. A shared gateway will happily hand one user's result to another, even when the request was authenticated. Mark a result `"public"` only when it is identical for every caller, and never use `cacheScope` as access control: it is a label, not a lock.

## Per-handler override

On the low-level `Server`, handlers build their results by hand, and `ttl_ms` / `cache_scope` are just fields on the result models.
A handler that sets them explicitly always wins over the constructor map, field by field:

```python title="server.py" hl_lines="11 17"
# docs_src/caching/tutorial002.py
from typing import Any

from mcp_types import ListToolsResult, PaginatedRequestParams, Tool

from mcp.server import CacheHint, Server, ServerRequestContext

TOOLS = [Tool(name="forecast", input_schema={"type": "object"})]


async def list_tools(ctx: ServerRequestContext[Any], params: PaginatedRequestParams | None) -> ListToolsResult:
    return ListToolsResult(tools=TOOLS, ttl_ms=1_000)


server = Server(
    "Weather",
    on_list_tools=list_tools,
    cache_hints={"tools/list": CacheHint(ttl_ms=60_000, scope="public")},
)
```

The handler said `ttl_ms=1_000` and nothing about scope. On the wire: `ttlMs: 1000` (the handler's, not the map's `60_000`) and `cacheScope: "public"` (the map's, because the handler left it unset).

Explicit beats configured, and configured beats default. This holds per field, so a handler can pin one field and leave the other to the server-wide policy.

This is also the escape hatch for dynamics the constructor can't know: a handler that filters `resources/read` per user can return `cache_scope="private"` for one URI from an otherwise-public server.

One caveat on paginated lists: the protocol requires the **same `cacheScope` on every page** of one list. The constructor map satisfies that by construction, since it's keyed by method, not by page. But a handler that overrides the scope itself owns that consistency: override it on *every* page, never only when a cursor is present, or page one and page two will disagree.

## What the client sees

On a 2026-07-28 session, `Client` honors the hints for you: it has a built-in response cache, on by default. A result that arrives carrying a `ttlMs` is stored, and an identical call within that TTL is served from the cache with no round trip.
A result that carries *no* hint is not cached: hint-less results get `CacheConfig.default_ttl_ms`, which defaults to `0` (immediately stale), so a server that declares nothing sees exactly the call-for-call traffic it always did.

```python title="client.py" hl_lines="34 36 39"
# docs_src/caching/tutorial003.py
from dataclasses import dataclass
from typing import Any

from mcp_types import ListToolsResult, PaginatedRequestParams, Tool

from mcp import Client
from mcp.client import CacheConfig
from mcp.server import CacheHint, Server, ServerRequestContext


@dataclass
class DemoState:
    fetches: int = 0
    now: float = 1_000_000.0


state = DemoState()


async def list_tools(ctx: ServerRequestContext[Any], params: PaginatedRequestParams | None) -> ListToolsResult:
    state.fetches += 1
    return ListToolsResult(tools=[Tool(name="forecast", input_schema={"type": "object"})])


server = Server(
    "Weather",
    on_list_tools=list_tools,
    cache_hints={"tools/list": CacheHint(ttl_ms=60_000, scope="public")},
)


async def main() -> None:
    start = state.fetches
    async with Client(server, cache=CacheConfig(clock=lambda: state.now)) as client:
        await client.list_tools()  # fetch 1
        await client.list_tools()  # fresh for 60s: served from the cache
        state.now += 60.0
        await client.list_tools()  # the TTL ran out: fetch 2
        await client.list_tools(cache_mode="refresh")  # skip the cache read: fetch 3
    print(f"4 calls, {state.fetches - start} fetches")
```

Four calls, three fetches. The second call found a fresh entry and never reached the server; advancing the (injected) clock past the TTL made the third fetch again; the fourth said `cache_mode="refresh"`. That kwarg exists on the five caching verbs (`list_tools`, `list_prompts`, `list_resources`, `list_resource_templates`, `read_resource`):

* `"use"` (the default) serves a fresh entry if there is one, and stores the fetch if not.
* `"refresh"` never serves: it fetches and stores the result, replacing whatever was cached.
* `"bypass"` makes the round trip without touching the cache at all: no read, no write.

One rule sits above `"use"`: **calls carrying `meta` always reach the server.** A request with `meta` set (a progress token, tracing fields) expects a wire request, so under `cache_mode="use"` it is treated as `"refresh"`: the cache read is skipped, and the fetched result still replaces the cached entry. `"bypass"` and an explicit `"refresh"` behave as they always do.

To turn caching off entirely, construct with `Client(server, cache=False)`: every call is a round trip again, and `cache_mode`, while still accepted, does nothing.

Scope is honored automatically too: `"private"` entries are keyed to the cache's *partition* (below), while `"public"` ones may opt into wider sharing. And **notifications beat TTL** for the exact entries they name: a `list_changed` notification evicts the matching cached listing, and `resources/updated` evicts the cached read stored under exactly its URI, however fresh they were.

One caveat on `resources/updated`: eviction is exact-URI only. The store contract has no enumerate or scan operation (same as the reference TypeScript implementation), so a notification carrying a *sub*-resource URI does not evict a cached read of its parent. If your server signals sub-resources this way, refetch the parent with `cache_mode="refresh"`.

### Configuring it: `CacheConfig`

```python
from mcp import Client
from mcp.client import CacheConfig

client = Client("https://api.example.com/mcp", cache=CacheConfig(default_ttl_ms=5_000))
```

* `store`: where entries live. The default is a fresh in-memory store per client; pass your own `ResponseCacheStore` implementation (Redis-backed, say) to share a cache across clients or processes. The contract types (`ResponseCacheStore`, `CacheKey`, `CacheEntry`, and the default `InMemoryResponseCacheStore`) are importable from `mcp.client`.
  A lookup may issue up to two sequential store `get`s (the private arm, then the public one), so size a remote store's latency expectations accordingly. A custom store **requires** an explicit `partition`.
* `partition`: the authorization-context label that keeps one principal's `"private"` entries from being served to another within a shared store.
* `target_id`: explicit server identity, for custom transports and in-process servers (below).
* `default_ttl_ms`: TTL applied to results that carry no `ttlMs` hint. The default `0` leaves hint-less results uncached.
* `share_public`: serve server-asserted-`"public"` entries across partitions (below). Off by default.
* `clock`: the wall-clock source, in epoch seconds. Inject one, as the example above does, and expiry tests need no sleeping.

!!! warning "Partition = verified principal"
    Derive `partition` from a **verified credential**, such as a validated token's subject. Never derive it from request-supplied data, and never from the server URL (server identity is a separate key axis). The SDK is a library with no authentication of its own: the trust anchor is whoever constructs the `CacheConfig`, which is the deployment, not the tenant. A multi-tenant gateway mints one `CacheConfig` per authenticated principal.

    The partition is also fixed for the `Client`'s lifetime. If the connection's authorization context changes mid-session (a re-authentication as a different principal, say), the cache does not follow; construct a new `Client` for the new principal.

Cache keys also carry the **server's identity**: the URL string you dialed, with any `user:pass@` userinfo stripped and otherwise byte-exact. No case folding, no query reordering, no trailing-slash cleanup. Under-normalizing only costs sharing, while over-normalizing could merge two tenants (`?tenant=a` vs `?tenant=b`), so superficially different URLs simply don't share entries.
When there is no URL (an in-process server, or a `Transport` instance), the client gets a random per-instance identity instead; set `CacheConfig.target_id` to name the server (with a custom store this is required, and construction says so). The identity is sha256-hashed before it enters key material, so a URL carrying secrets in its query string never appears in store keys. Don't log the pre-hash form yourself, either.

!!! warning "`share_public` trusts the server, fleet-wide"
    By default even `"public"` entries stay within their partition. `share_public=True` serves entries the server marked `cacheScope: "public"` to **every** partition using the store, trusting the server's classification on behalf of all of them. A server that stamps `"public"` on per-tenant data (by bug or by malice) then leaks one tenant's response to the others. The flag is deliberately constructor-level only: the per-call `cache_mode` can narrow caching, but nothing per-call can widen sharing.

### What the cache never does

* **Session-tier calls bypass it.** `client.session.list_tools()` and friends always make the round trip; the cache lives on the `Client` verbs.
* **`server/discover` stays out of it.** The discover result is delivered once, at connect, and never enters the response cache, even when it carries a `ttlMs`. If you persist one yourself to skip the reconnect probe ([`prior_discover`](https://py.sdk.modelcontextprotocol.io/v2/client/protocol-versions/index.md#reconnecting-with-prior_discover)), its freshness is your bookkeeping: `DiscoverResult` carries `ttl_ms` and `cache_scope`, already parsed, for exactly that purpose.
* **Continuation pages are never cached.** Only cursor-less calls participate. A continuation page rejected for an expired cursor does *evict* the cached listing, because the listing changed under it.
* **Multi-round-trip reads are never cached.** A `read_resource` seeded with `input_responses`/`request_state`, or one that resolves through input rounds, never enters the cache (a spec MUST).
* **Notification eviction needs notifications.** Eviction is only as good as the transport's delivery, and the modern in-process path (`Client(server)` with the default `mode="auto"`) does not deliver standalone notifications today.
* **Eviction is eventual, not instantaneous.** Wire-path notifications are dispatched from spawned tasks, so a call racing a notification's arrival may be served the pre-eviction entry once more; the window is bounded by dispatch latency, and the eviction still lands.
* **No stale-if-error.** An expired entry is never served because the refetch failed; the error propagates.
* **No early re-fetch.** A stored entry is served until its TTL expires and the next call after that pays the round trip; nothing refreshes in the background.
* **No coalescing.** Two concurrent identical calls are two fetches.
* **No TTL beyond 24 hours.** A larger `ttlMs`, whether server-sent or configured, is clamped down on store (`mcp.client.caching.MAX_TTL_MS`), bounding how long any entry, however generously hinted, can be served.
* On a **shared store**, clients race each other. Each client drops its own write when an eviction overtook the fetch in flight, but a *co-tenant* client can still write back an entry that an eviction it never saw had removed; and that race bookkeeping is itself bounded: past 4096 tracked keys the oldest key's guard is dropped first. Both windows are accepted, and closed by the TTL cap above.
* **No serving across protocol eras.** Entries are scoped to the negotiated protocol version: on a shared persistent store, a session never serves an entry written under a different negotiated version (the same listing genuinely differs by era, since the SDK strips the 2026 fields for older sessions). Eviction likewise touches only the current era's entries; another era's entries simply age out by TTL.

### Reading the hints yourself

The hints are also plain fields on every cacheable result (`result.ttl_ms` and `result.cache_scope`, already parsed), in case you want to layer your own bookkeeping on top of (or instead of) the built-in cache.

Against an **older server** (pre-2026 protocol), the fields are simply absent from the wire, and the models show their conservative defaults: `ttl_ms == 0` and `cache_scope == "private"`, stale and unshared, the right assumption for a server that declared nothing. The cache treats a legacy session the same way: hints are never consulted there (whatever keys appear on the wire), only `default_ttl_ms` applies, and its default of `0` caches nothing, so a pre-2026 connection behaves exactly as it did before the cache existed.

If you need to distinguish "the server said 0" from "the server said nothing", check `"ttl_ms" in result.model_fields_set`: it's only set when the field actually arrived.

## Older clients

Clients on pre-2026 protocol versions never see either field; the SDK strips them at serialization for those connections. Configure your hints once; there is nothing version-specific to write.

## Recap

* Six methods carry `ttlMs`/`cacheScope`; the SDK defaults them to `0`/`"private"`, stale and unshared, always safe.
* `cache_hints={method: CacheHint(...)}` at construction (both `MCPServer` and `Server`) sets server-wide values per method.
* A handler that sets the fields on its result overrides the map, per field.
* `"public"` is a promise that the result is identical for every caller. It is not access control.
* `Client` honors the hints automatically: its response cache is on by default, serves fresh entries instead of refetching, and caches nothing for servers (or sessions) that provide no hints.
* Per call, `cache_mode="refresh"` refetches and `"bypass"` skips the cache; `cache=False` at construction turns it off entirely.

# Middleware

Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/middleware/

A **middleware** is one async function that wraps every message your server receives. You write it as `async (ctx, call_next)` and append it to `server.middleware`. That is the whole API.

!!! warning
    `Server.middleware` is marked **provisional** in the source. The signature and semantics are expected to change before v2 is final. Use it to *observe*: timing, logging, tracing. Do not make it the foundation your server stands on.

This is a **low-level `Server`** feature. `MCPServer` does not expose a middleware list. If `Server(name, on_call_tool=...)` is new to you, read **[The low-level Server](https://py.sdk.modelcontextprotocol.io/v2/advanced/low-level-server/index.md)** first.

## A timing middleware

One server, one tool, one middleware that logs how long each message took:

```python title="server.py" hl_lines="40-46 50"
# docs_src/middleware/tutorial001.py
import logging
import time

from mcp_types import (
    CallToolRequestParams,
    CallToolResult,
    ListToolsResult,
    PaginatedRequestParams,
    TextContent,
    Tool,
)

from mcp.server import Server, ServerRequestContext
from mcp.server.context import CallNext, HandlerResult

logger = logging.getLogger(__name__)


async def on_list_tools(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult:
    return ListToolsResult(
        tools=[
            Tool(
                name="search_books",
                description="Search the catalog by title or author.",
                input_schema={
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            )
        ]
    )


async def on_call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult:
    query = (params.arguments or {})["query"]
    return CallToolResult(content=[TextContent(type="text", text=f"Found 3 books matching {query!r}.")])


async def log_timing(ctx: ServerRequestContext, call_next: CallNext) -> HandlerResult:
    start = time.perf_counter()
    try:
        return await call_next(ctx)
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s took %.1f ms", ctx.method, elapsed_ms)


server = Server("Bookshop", on_list_tools=on_list_tools, on_call_tool=on_call_tool)
server.middleware.append(log_timing)
```

* `ctx` is the same `ServerRequestContext` your handlers receive. `ctx.method` is the raw method string; `ctx.params` are the raw params, **before** any validation.
* `call_next(ctx)` runs the rest of the chain: validation, the handler lookup, your handler. Return what it returned and the response is untouched.
* The `try`/`finally` is deliberate: a handler that raises is still timed, because the failure reaches your middleware as the exception out of `call_next`.
* `server.middleware.append(...)` registers it. The list runs outermost-first, so `middleware[0]` is the one closest to the wire.

### Try it

Connect a client, list the tools, call one. Your log has **three** lines:

```text
server/discover took 18.3 ms
tools/list took 0.1 ms
tools/call took 0.1 ms
```

You made two calls and got three lines. The first is `server/discover`: the request the client sent to set the connection up, before you asked for anything.

That is the point. Middleware wraps **every** inbound message:

* The connection setup: `server/discover`, or `initialize` and `notifications/initialized` on a legacy session.
* Every request and every notification. For a notification, `ctx.request_id is None`, `call_next(ctx)` returns `None`, and whatever you return is discarded.
* Even a method the server has no handler for: `call_next` raises the `MCPError(-32601, "Method not found")` *through* your middleware on its way to the client.

## What you can do inside one

In increasing order of how much you should hesitate:

* **Observe.** Time it, count it, log it. The example above.
* **Refuse.** Raise an `MCPError` *instead of* calling `call_next(ctx)` and that one message is answered with a JSON-RPC error. The connection stays up; the next message goes through.
* **Rewrite.** `ctx` is a dataclass: `await call_next(dataclasses.replace(ctx, params=...))` hands the rest of the chain different params than the client sent. Never do this to `initialize`: the result the client gets back is built from your rewritten params, but the server commits its connection state from the original wire params. The two sides can finish the handshake disagreeing about what they negotiated.

!!! check
    `initialize` is one of the things middleware wraps, and it is the *only* hook you get for it. Try to take it over with `add_request_handler` and the SDK refuses:

    ```text
    ValueError: 'initialize' is handled by the server runner and cannot be overridden; use Server.middleware to observe or wrap initialization
    ```

!!! warning
    `initialize` is handled inline: the server reads no further inbound messages until your middleware chain returns. Awaiting a server-to-client request (`ctx.session.send_request(...)`, an elicitation) while handling `initialize` therefore **deadlocks the connection**: the response you are waiting for can never be read. Fire-and-forget notifications are fine.

## The one middleware that ships on by default

The SDK ships exactly one middleware, and it is already on your server's list: the one that emits an OpenTelemetry span for every message. You don't append it, and most of the time you don't think about it. It is a no-op until you install an exporter, and it has its own page: **[OpenTelemetry](https://py.sdk.modelcontextprotocol.io/v2/advanced/opentelemetry/index.md)**.

!!! info
    If you have written ASGI middleware, you already know this shape. Starlette's `(scope, receive, send)` became `(ctx, call_next)`, and it runs *after* the transport, on the decoded message instead of the raw HTTP request. The two compose: Starlette middleware on `streamable_http_app()` sees HTTP; this sees MCP.

## Recap

* A middleware is `async (ctx, call_next) -> result`, appended to `server.middleware` on the low-level `Server`.
* It wraps **every** inbound message (`server/discover`, `initialize`, requests, notifications, unknown methods) and runs outermost-first.
* `ctx.request_id is None` is how you tell a notification from a request.
* Raise instead of calling `call_next` to refuse one message; the connection survives.
* The SDK's own OpenTelemetry tracing is a middleware too, already on the list. See **[OpenTelemetry](https://py.sdk.modelcontextprotocol.io/v2/advanced/opentelemetry/index.md)**.
* The whole surface is provisional. Observe with it; don't build on it.

That is everything that wraps a request. **[Authorization](https://py.sdk.modelcontextprotocol.io/v2/advanced/authorization/index.md)** is what decides whether the request gets to run at all.

# Extensions

Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/extensions/

An **extension** is an opt-in bundle of MCP behaviour behind one identifier. On a server it can contribute tools, resources, and new request methods, and it can wrap `tools/call`. On a client it can claim extra `tools/call` result shapes and observe vendor notifications. Each side advertises under its own `capabilities.extensions`, and nothing changes for anyone who didn't ask for it.

That is the contract ([SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2133)), and it has one golden rule: **extensions are off by default**.

## Using an extension

Pass instances at construction:

```python title="server.py"
# docs_src/extensions/tutorial001.py
from mcp.server.apps import Apps
from mcp.server.mcpserver import MCPServer

mcp = MCPServer("demo", extensions=[Apps()])
```

Done. The server now advertises `io.modelcontextprotocol/ui` under `capabilities.extensions` and serves everything the extension contributes.
`Apps` is the built-in reference extension, and it gets its own page: **[MCP Apps](https://py.sdk.modelcontextprotocol.io/v2/advanced/apps/index.md)**.

!!! note
    Extensions are fixed at construction. There is no `add_extension` to call later: a server's capability map should not change while clients are connected to it.

The capability map rides `server/discover`, which is a **2026-07-28** path. A legacy `initialize` handshake has nowhere to put it, so a legacy client simply doesn't see the extension. Design for that: an extension *augments* a server, it must not be the only way the server is usable.

## Writing your own

Subclass `Extension` and override only what you need. Every method has a default.

### The identifier

```python
# docs_src/extensions/tutorial002.py
from mcp.server.extension import Extension


class Stamps(Extension):
    identifier = "com.example/stamps"
```

The identifier is a `vendor-prefix/name` string following the spec's `_meta` key grammar: dot-separated labels (each starts with a letter, ends with a letter or digit), a slash, then the name. It is validated **when the class is defined**, so a typo doesn't wait for a server to boot:

```text
TypeError: Stamps.identifier must be a `vendor-prefix/name` string (reverse-DNS prefix required), got 'stamps'
```

Use a domain you control as the prefix. `io.modelcontextprotocol/*` is for extensions specified by the MCP project itself.
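
To get a feel for the grammar before the class-definition check catches a mistake, here is a rough approximation as a regular expression. It is a sketch, not the SDK's validator: `looks_like_identifier` is a hypothetical name, and the interior character classes (hyphens inside labels; hyphens, underscores, and dots inside the name) and the non-empty name are assumptions that may be looser or stricter than the real check.

```python
import re

# Assumed: a label starts with a letter and ends with a letter or digit,
# with letters, digits, or hyphens in between.
_LABEL = r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
# Assumed: the name starts and ends with an alphanumeric character.
_NAME = r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?"
# One or more dot-separated labels, a slash, then the name.
_IDENTIFIER = re.compile(rf"{_LABEL}(?:\.{_LABEL})*/{_NAME}")


def looks_like_identifier(value: str) -> bool:
    """Hypothetical pre-check; the SDK's own validation remains the authority."""
    return _IDENTIFIER.fullmatch(value) is not None


candidates = ["com.example/stamps", "io.modelcontextprotocol/ui", "stamps", "com.-bad/x"]
results = {candidate: looks_like_identifier(candidate) for candidate in candidates}
```

A check like this is only useful for fast feedback in your own tooling; defining the `Extension` subclass is still what decides whether an identifier is accepted.
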

### Contributing tools

The smallest useful extension is one tool and a settings map:

```python title="server.py" hl_lines="17 19-20 22-23 26"
# docs_src/extensions/tutorial003.py
from collections.abc import Sequence
from typing import Any

from mcp import Client
from mcp.server.extension import Extension, ToolBinding
from mcp.server.mcpserver import MCPServer


def stamp(text: str) -> str:
    """Stamp a message with the office seal."""
    return f"[stamped] {text}"


class Stamps(Extension):
    """A purely additive extension: one tool, one capability entry."""

    identifier = "com.example/stamps"

    def settings(self) -> dict[str, Any]:
        return {"sealed": True}

    def tools(self) -> Sequence[ToolBinding]:
        return [ToolBinding(fn=stamp)]


mcp = MCPServer("post-office", extensions=[Stamps()])


async def main() -> None:
    async with Client(mcp) as client:
        print(client.server_capabilities.extensions)
        # {'com.example/stamps': {'sealed': True}}
        result = await client.call_tool("stamp", {"text": "hello"})
        print(result.content)  # [TextContent(text='[stamped] hello')]
```

* `tools()` returns `ToolBinding`s. The server registers each one exactly as if you had called `mcp.add_tool(...)` yourself: same schema generation, same `Context` injection, same everything.
* `settings()` is the value advertised at `capabilities.extensions["com.example/stamps"]`. Return `{}` (the default) to advertise the extension with no settings.
* The extension never receives the server. It declares contributions as data; `MCPServer` consumes them. There is no `self.server` to mutate.

And `main()` is the proof, an in-memory client straight against `mcp`:

```python title="server.py" hl_lines="29-34"
# docs_src/extensions/tutorial003.py
from collections.abc import Sequence
from typing import Any

from mcp import Client
from mcp.server.extension import Extension, ToolBinding
from mcp.server.mcpserver import MCPServer


def stamp(text: str) -> str:
    """Stamp a message with the office seal."""
    return f"[stamped] {text}"


class Stamps(Extension):
    """A purely additive extension: one tool, one capability entry."""

    identifier = "com.example/stamps"

    def settings(self) -> dict[str, Any]:
        return {"sealed": True}

    def tools(self) -> Sequence[ToolBinding]:
        return [ToolBinding(fn=stamp)]


mcp = MCPServer("post-office", extensions=[Stamps()])


async def main() -> None:
    async with Client(mcp) as client:
        print(client.server_capabilities.extensions)
        # {'com.example/stamps': {'sealed': True}}
        result = await client.call_tool("stamp", {"text": "hello"})
        print(result.content)  # [TextContent(text='[stamped] hello')]
```

### Serving your own methods

An extension can register **new request methods**: its own verbs, served next to the spec's:

```python title="server.py" hl_lines="16-22 31 40-48"
# docs_src/extensions/tutorial004.py
from collections.abc import Sequence
from typing import Any, Literal

import mcp_types as types
from pydantic import Field

from mcp import Client
from mcp.client import advertise
from mcp.server.context import ServerRequestContext
from mcp.server.extension import Extension, MethodBinding
from mcp.server.mcpserver import MCPServer, require_client_extension

EXTENSION_ID = "com.example/search"


class SearchParams(types.RequestParams):
    query: str
    limit: int = Field(default=10, ge=1, le=100)


class SearchResult(types.Result):
    items: list[str]


class SearchRequest(types.Request[SearchParams, Literal["com.example/search"]]):
    method: Literal["com.example/search"] = "com.example/search"
    params: SearchParams


async def search(ctx: ServerRequestContext[Any, Any], params: SearchParams) -> SearchResult:
    require_client_extension(ctx, EXTENSION_ID)
    return SearchResult(items=[f"{params.query}-{n}" for n in range(params.limit)])


class Search(Extension):
    """An extension that serves its own request method."""

    identifier = EXTENSION_ID

    def methods(self) -> Sequence[MethodBinding]:
        return [
            MethodBinding(
                "com.example/search",
                SearchParams,
                search,
                protocol_versions=frozenset({"2026-07-28"}),
            )
        ]


mcp = MCPServer("catalog", extensions=[Search()])


async def main() -> None:
    async with Client(mcp, extensions=[advertise(EXTENSION_ID)]) as client:
        request = SearchRequest(params=SearchParams(query="mcp", limit=3))
        result = await client.session.send_request(request, SearchResult)
        print(result.items)  # ['mcp-0', 'mcp-1', 'mcp-2']
```

* `SearchParams` subclasses `RequestParams`, so the 2026 `_meta` envelope parses uniformly and your handler gets validated params, never a raw dict. Bound what the client controls: `Field(ge=1, le=100)` rejects an absurd `limit` before your code allocates anything for it.
* `require_client_extension(ctx, EXTENSION_ID)` is the gate: a client that did not declare the extension gets the `-32021` (missing required client capability) error, with the machine-readable `requiredCapabilities` payload the spec asks for.
* `protocol_versions=frozenset({"2026-07-28"})` pins the method to one wire version. At any other version the client gets `METHOD_NOT_FOUND`, exactly as if the method didn't exist there. For that client, it doesn't.

Methods are **strictly additive**. The SDK enforces this at construction, not at runtime:

* A `MethodBinding` for a spec-defined method (`tools/list`, `completion/complete`, ...) raises `ValueError` when the binding is constructed. Core verbs belong to the server.
* Two extensions binding the same method raise when the second one registers. Last-write-wins is how plugins corrupt each other; we don't do that.
* An empty `protocol_versions` set raises too: a method that can never be served is a bug, not a configuration.

### The client side

The same file's `main()` is the whole client story, both halves of it:

```python title="server.py" hl_lines="54-58"
# docs_src/extensions/tutorial004.py
from collections.abc import Sequence
from typing import Any, Literal

import mcp_types as types
from pydantic import Field

from mcp import Client
from mcp.client import advertise
from mcp.server.context import ServerRequestContext
from mcp.server.extension import Extension, MethodBinding
from mcp.server.mcpserver import MCPServer, require_client_extension

EXTENSION_ID = "com.example/search"


class SearchParams(types.RequestParams):
    query: str
    limit: int = Field(default=10, ge=1, le=100)


class SearchResult(types.Result):
    items: list[str]


class SearchRequest(types.Request[SearchParams, Literal["com.example/search"]]):
    method: Literal["com.example/search"] = "com.example/search"
    params: SearchParams


async def search(ctx: ServerRequestContext[Any, Any], params: SearchParams) -> SearchResult:
    require_client_extension(ctx, EXTENSION_ID)
    return SearchResult(items=[f"{params.query}-{n}" for n in range(params.limit)])


class Search(Extension):
    """An extension that serves its own request method."""

    identifier = EXTENSION_ID

    def methods(self) -> Sequence[MethodBinding]:
        return [
            MethodBinding(
                "com.example/search",
                SearchParams,
                search,
                protocol_versions=frozenset({"2026-07-28"}),
            )
        ]


mcp = MCPServer("catalog", extensions=[Search()])


async def main() -> None:
    async with Client(mcp, extensions=[advertise(EXTENSION_ID)]) as client:
        request = SearchRequest(params=SearchParams(query="mcp", limit=3))
        result = await client.session.send_request(request, SearchResult)
        print(result.items)  # ['mcp-0', 'mcp-1', 'mcp-2']
```

* `Client(..., extensions=[advertise(EXTENSION_ID)])` declares the extension. The declarations become `ClientCapabilities.extensions`: on a 2026-07-28 connection the map travels in the per-request `_meta` envelope, so the server sees it on **every** request; on a legacy connection it rides the `initialize` handshake. Server code doesn't care which: `require_client_extension(ctx, ...)` and `ctx.session.check_client_capability(...)` read the right source on both paths.
* Vendor methods drop one layer to `client.session.send_request(...)`; `Client` only grows first-class methods for spec verbs. `send_request` accepts any `Request` subclass, so the vendor request passes as-is.

### Intercepting `tools/call`

The one interceptive hook. Override `intercept_tool_call` to observe, short-circuit, or veto a tool call:

```python title="server.py" hl_lines="18-25"
# docs_src/extensions/tutorial005.py
import logging
from typing import Any

from mcp_types import CallToolRequestParams

from mcp.server.context import CallNext, HandlerResult, ServerRequestContext
from mcp.server.extension import Extension
from mcp.server.mcpserver import MCPServer

logger = logging.getLogger(__name__)


class AuditLog(Extension):
    """Observe every tools/call without touching its result."""

    identifier = "com.example/audit"

    async def intercept_tool_call(
        self,
        params: CallToolRequestParams,
        ctx: ServerRequestContext[Any, Any],
        call_next: CallNext,
    ) -> HandlerResult:
        logger.info("tool %r called", params.name)
        return await call_next(ctx)


mcp = MCPServer("audited", extensions=[AuditLog()])


@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b
```

* `params` is the validated `CallToolRequestParams`: you get `params.name` and `params.arguments` without touching raw JSON.
* `call_next(ctx)` runs the rest of the chain. Return its result unchanged (observe), return something else (replace), or raise an `MCPError` (refuse).
* With several extensions, interceptors nest in registration order: the first extension in `extensions=[...]` is outermost.
* The default implementation is a pass-through, and a server whose extensions never override this hook installs **no** middleware at all. You don't pay for what you don't use.

The hook wraps `tools/call` and nothing else. For every-message concerns, use [Middleware](https://py.sdk.modelcontextprotocol.io/v2/advanced/middleware/index.md). That is what it is for.

## Using a client extension

A **client extension** is the same contract from the consuming side: a bundle of client-side behaviour behind one identifier. Pass instances to `Client(extensions=[...])` and call tools normally:

```python title="client.py" hl_lines="67-69"
# docs_src/extensions/tutorial006.py
from collections.abc import Sequence
from typing import Any, Literal

import mcp_types as types

from mcp import Client
from mcp.client import ClaimContext, ClientExtension, ResultClaim
from mcp.server.context import CallNext, HandlerResult, ServerRequestContext
from mcp.server.extension import Extension
from mcp.server.mcpserver import MCPServer, require_client_extension

EXTENSION_ID = "com.example/receipts"


class ReceiptResult(types.Result):
    """The claimed result shape; `result_type` pins the wire tag."""

    result_type: Literal["receipt"] = "receipt"
    receipt_token: str


class ReceiptIssuer(Extension):
    """Server half: answers `buy` with a receipt instead of a final result."""

    identifier = EXTENSION_ID

    async def intercept_tool_call(
        self,
        params: types.CallToolRequestParams,
        ctx: ServerRequestContext[Any, Any],
        call_next: CallNext,
    ) -> HandlerResult:
        if params.name != "buy":
            return await call_next(ctx)
        require_client_extension(ctx, EXTENSION_ID)
        return {"resultType": "receipt", "receiptToken": "r-117"}


class Receipts(ClientExtension):
    """Client half: claims the `receipt` shape and supplies the code that finishes it."""

    identifier = EXTENSION_ID

    def claims(self) -> Sequence[ResultClaim[Any]]:
        return [ResultClaim(result_type="receipt", model=ReceiptResult, resolve=self._redeem)]

    async def _redeem(self, claimed: ReceiptResult, ctx: ClaimContext) -> types.CallToolResult:
        return await ctx.session.call_tool("redeem", {"token": claimed.receipt_token})


mcp = MCPServer("shop", extensions=[ReceiptIssuer()])


@mcp.tool()
def buy(item: str) -> types.CallToolResult:
    """Buy an item."""
    raise NotImplementedError  # ReceiptIssuer answers `buy` before the tool runs


@mcp.tool()
def redeem(token: str) -> str:
    """Exchange a receipt token for the goods."""
    return f"goods for {token}"


async def main() -> None:
    async with Client(mcp, extensions=[Receipts()]) as client:
        result = await client.call_tool("buy", {"item": "lamp"})
        print(result.content)  # [TextContent(text='goods for r-117')]
```

`call_tool("buy", ...)` returns a plain `CallToolResult`, like every other call. What the extension changed: the server may now answer `buy` with a `receipt` **result shape** instead of a final result, and `Receipts` finishes it (here by redeeming the receipt with a follow-up call) before `call_tool` returns. Nothing about the call site moves.

Drop the extension and none of this exists: the server's gate refuses a client that did not declare it (error -32021), and a claimed shape from a server that skips the gate fails validation, exactly as the spec requires for an unrecognized `resultType`. Off by default, on both ends of the wire.

To advertise an identifier with **no** client-side behaviour (the server gates on the capability, the client does nothing, as in the search client above), use `advertise()`:

```python
from mcp.client import advertise

client = Client(mcp, extensions=[advertise("com.example/search")])
```

## Writing a client extension

Subclass `ClientExtension` and override only what you need. Three contribution kinds, each with a default: `settings()`, `claims()`, and `notifications()`.

```python title="client.py" hl_lines="18-19 44-45 47-48"
# docs_src/extensions/tutorial006.py
from collections.abc import Sequence
from typing import Any, Literal

import mcp_types as types

from mcp import Client
from mcp.client import ClaimContext, ClientExtension, ResultClaim
from mcp.server.context import CallNext, HandlerResult, ServerRequestContext
from mcp.server.extension import Extension
from mcp.server.mcpserver import MCPServer, require_client_extension

EXTENSION_ID = "com.example/receipts"


class ReceiptResult(types.Result):
    """The claimed result shape; `result_type` pins the wire tag."""

    result_type: Literal["receipt"] = "receipt"
    receipt_token: str


class ReceiptIssuer(Extension):
    """Server half: answers `buy` with a receipt instead of a final result."""

    identifier = EXTENSION_ID

    async def intercept_tool_call(
        self,
        params: types.CallToolRequestParams,
        ctx: ServerRequestContext[Any, Any],
        call_next: CallNext,
    ) -> HandlerResult:
        if params.name != "buy":
            return await call_next(ctx)
        require_client_extension(ctx, EXTENSION_ID)
        return {"resultType": "receipt", "receiptToken": "r-117"}


class Receipts(ClientExtension):
    """Client half: claims the `receipt` shape and supplies the code that finishes it."""

    identifier = EXTENSION_ID

    def claims(self) -> Sequence[ResultClaim[Any]]:
        return [ResultClaim(result_type="receipt", model=ReceiptResult, resolve=self._redeem)]

    async def _redeem(self, claimed: ReceiptResult, ctx: ClaimContext) -> types.CallToolResult:
        return await ctx.session.call_tool("redeem", {"token": claimed.receipt_token})


mcp = MCPServer("shop", extensions=[ReceiptIssuer()])


@mcp.tool()
def buy(item: str) -> types.CallToolResult:
    """Buy an item."""
    raise NotImplementedError  # ReceiptIssuer answers `buy` before the tool runs


@mcp.tool()
def redeem(token: str) -> str:
    """Exchange a receipt token for the goods."""
    return f"goods for {token}"


async def main() -> None:
    async with Client(mcp, extensions=[Receipts()]) as client:
        result = await client.call_tool("buy", {"item": "lamp"})
        print(result.content)  # [TextContent(text='goods for r-117')]
```

* The identifier follows the same grammar as the server's, validated when the class is defined.
* `claims()` returns `ResultClaim`s: a wire tag, the model that parses it, and the resolver that finishes it. The model must pin the tag with `result_type: Literal["receipt"]` and must not subclass the verb's core result types; both are enforced when the claim is constructed. Vendor fields like `receipt_token` ride the wire as-is: a substituted shape reaches the client verbatim.
* The resolver receives the parsed model and a `ClaimContext`; `ctx.session` is the same public handle as `client.session`, so follow-ups are ordinary session calls. It returns the verb's normal `CallToolResult`.
* `settings()` is the value advertised at `ClientCapabilities.extensions[identifier]`, read once at `Client` construction.

`notifications()` declares vendor server notifications to observe:

```python
def notifications(self) -> Sequence[NotificationBinding[Any]]:
    return [NotificationBinding(method="notifications/receipts", params_type=ReceiptEvent, handler=self.on_receipt)]
```

The handler receives validated params one at a time, in dispatch order. It observes; it cannot veto or reply.

Two quiet rules. Claims are active on 2026-07-28 connections only, and the capability ad follows them: on a legacy connection the claims dissolve and the identifier drops out of the ad with them, so the client never advertises an extension whose shapes it would reject. And when you want the claimed shape yourself instead of the resolver, call `client.session.call_tool(..., allow_claimed=True)`; without that flag, a claimed shape reaching a session-tier caller raises `UnexpectedClaimedResult`.

### Extension verbs

An extension's own request methods need no client-side registration.
A vendor request type subclasses `mcp_types.Request` and goes through `client.session.send_request`, as in [Serving your own methods](#serving-your-own-methods).

One addition: when a params key must ride the `Mcp-Name` header (extension specs such as tasks require this for their verbs), the request type declares `name_param`:

```python title="client.py" hl_lines="23-26 47-48"
# docs_src/extensions/tutorial007.py
from collections.abc import Sequence
from typing import Any, Literal

import mcp_types as types

from mcp import Client
from mcp.client import advertise
from mcp.server.context import ServerRequestContext
from mcp.server.extension import Extension, MethodBinding
from mcp.server.mcpserver import MCPServer

EXTENSION_ID = "com.example/jobs"


class JobParams(types.RequestParams):
    job_id: str


class JobStatus(types.Result):
    status: str


class JobStatusRequest(types.Request[JobParams, Literal["com.example/jobs.status"]]):
    method: Literal["com.example/jobs.status"] = "com.example/jobs.status"
    params: JobParams
    name_param = "jobId"  # params["jobId"] rides the Mcp-Name header


async def job_status(ctx: ServerRequestContext[Any, Any], params: JobParams) -> JobStatus:
    return JobStatus(status=f"{params.job_id} is running")


class Jobs(Extension):
    """An extension whose verb names its subject, so the header can route on it."""

    identifier = EXTENSION_ID

    def methods(self) -> Sequence[MethodBinding]:
        return [MethodBinding("com.example/jobs.status", JobParams, job_status)]


mcp = MCPServer("worker", extensions=[Jobs()])


async def main() -> None:
    async with Client(mcp, extensions=[advertise(EXTENSION_ID)]) as client:
        request = JobStatusRequest(params=JobParams(job_id="job-7"))
        result = await client.session.send_request(request, JobStatus)
        print(result.status)  # job-7 is running
```

The session mirrors `params["jobId"]` into `Mcp-Name` on every send path, and a missing value fails loudly rather than silently omitting a required header.
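
The mirroring rule is small enough to sketch in isolation. The following is an illustration only, not the session's code: `mirror_name_header` is a hypothetical helper, and raising `ValueError` for a missing value is an assumption standing in for whatever error the SDK actually raises.

```python
from collections.abc import Mapping
from typing import Any


def mirror_name_header(params: Mapping[str, Any], name_param: str) -> dict[str, str]:
    """Hypothetical helper: copy params[name_param] into an Mcp-Name header."""
    value = params.get(name_param)
    if not isinstance(value, str) or not value:
        # Fail loudly: a request without the value must not go out header-less.
        raise ValueError(f"params[{name_param!r}] is required to populate the Mcp-Name header")
    return {"Mcp-Name": value}


headers = mirror_name_header({"jobId": "job-7"}, "jobId")

try:
    mirror_name_header({}, "jobId")
except ValueError:
    missing_value_rejected = True
else:
    missing_value_rejected = False
```

The point of the design is the second half: a silently omitted header would look like a routing problem somewhere downstream, while an immediate error points at the request that caused it.
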

## What an extension cannot do

The contribution surface is **closed** on purpose. On the server: settings, tools, resources, methods, one `tools/call` interceptor. On the client: settings, result claims, notification bindings. An extension cannot:

* **Reach into the host.** It declares data; it holds no server or client reference.
* **Replace core behaviour.** Spec methods and core result tags are rejected at construction (`initialize` is reserved by the runner outright); a notification binding shadowed by core vocabulary goes quiet with a warning instead.
* **Register late.** After `MCPServer(...)` or `Client(...)` returns, the extension set is what it is.

If you are fighting these walls, you are not writing an extension. You are writing a fork. The walls are the feature: a user reading `extensions=[Apps(), Stamps()]` knows *everything* those two can have touched.

# MCP Apps

Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/apps/

An **MCP App** is a tool with a face: alongside its data, the tool points at an HTML document the host renders as an interactive surface.

Two parts, always two parts:

1. **A tool** that does the work and returns data, like any other tool.
2. **A `ui://` resource** containing the HTML the host shows for it.

The tool carries a `_meta.ui.resourceUri` reference to the resource. The host fetches it with `resources/read`, renders it in a **sandboxed iframe**, and pushes the tool's result into that iframe via `postMessage`. Your server never sends or receives any `ui/*` messages: that traffic is between the host and the iframe. You serve a tool and an HTML document; the host does the theater.

The SDK ships this as the built-in `Apps` extension (`io.modelcontextprotocol/ui`). If [Extensions](https://py.sdk.modelcontextprotocol.io/v2/advanced/extensions/index.md) are new to you, skim that page first. One minute, then come back.
## A clock with a face ```python title="server.py" hl_lines="18 21 29 31" # docs_src/apps/tutorial001.py from mcp import Client from mcp.client import advertise from mcp.server.apps import APP_MIME_TYPE, EXTENSION_ID, Apps, client_supports_apps from mcp.server.mcpserver import MCPServer from mcp.server.mcpserver.context import Context CLOCK_HTML = """\ Clock

...

""" apps = Apps() @apps.tool(resource_uri="ui://clock/app.html", description="The current time.") def get_time(ctx: Context) -> str: now = "2026-06-26T12:00:00Z" if not client_supports_apps(ctx): return f"The time is {now}." return now apps.add_html_resource("ui://clock/app.html", CLOCK_HTML, title="Clock") mcp = MCPServer("clock", extensions=[apps]) async def main() -> None: async with Client(mcp, extensions=[advertise(EXTENSION_ID, {"mimeTypes": [APP_MIME_TYPE]})]) as client: result = await client.call_tool("get_time", {}) print(result.content) # [TextContent(text='2026-06-26T12:00:00Z')] ``` Four moves: * `Apps()`: one instance holds your UI-bound tools and their resources. * `@apps.tool(resource_uri="ui://clock/app.html")`: a regular tool, plus the `_meta.ui.resourceUri` stamp. Everything `@mcp.tool()` accepts (name, title, description, ...) passes through. * `apps.add_html_resource("ui://clock/app.html", CLOCK_HTML)`: the matching resource, served as `text/html;profile=mcp-app`. That exact MIME type is what tells a host "this is an app, render it". * `MCPServer("clock", extensions=[apps])`: opt in. The server now advertises `io.modelcontextprotocol/ui` under `capabilities.extensions`. The HTML itself listens for the host's `postMessage` and shows the result. For real apps, use the official [`@modelcontextprotocol/ext-apps`](https://github.com/modelcontextprotocol/ext-apps) browser SDK inside your HTML. It gives you `ontoolresult`, `callServerTool`, `getHostContext`, and `onhostcontextchanged` instead of raw message events. ## Graceful degradation Not every client renders apps. The spec is blunt about what that means for you: > Tools **MUST** return a meaningful `content` array even when UI is available. The model reads `content`; the iframe is for humans. A UI-capable host still feeds the text result to the model, and a text-only client gets *only* that. So the canonical pattern is one tool, two answers. 
Look at `get_time` again: ```python title="server.py" hl_lines="22-26" # docs_src/apps/tutorial001.py from mcp import Client from mcp.client import advertise from mcp.server.apps import APP_MIME_TYPE, EXTENSION_ID, Apps, client_supports_apps from mcp.server.mcpserver import MCPServer from mcp.server.mcpserver.context import Context CLOCK_HTML = """\ Clock

...

""" apps = Apps() @apps.tool(resource_uri="ui://clock/app.html", description="The current time.") def get_time(ctx: Context) -> str: now = "2026-06-26T12:00:00Z" if not client_supports_apps(ctx): return f"The time is {now}." return now apps.add_html_resource("ui://clock/app.html", CLOCK_HTML, title="Clock") mcp = MCPServer("clock", extensions=[apps]) async def main() -> None: async with Client(mcp, extensions=[advertise(EXTENSION_ID, {"mimeTypes": [APP_MIME_TYPE]})]) as client: result = await client.call_tool("get_time", {}) print(result.content) # [TextContent(text='2026-06-26T12:00:00Z')] ``` `client_supports_apps(ctx)` is `True` only when the client declared the `io.modelcontextprotocol/ui` extension **and** listed `text/html;profile=mcp-app` in its `mimeTypes` settings. The field is required, so a client that omits it does not count. That is exactly what `main()` in the same file declares: the client half of the negotiation, and the rich answer comes back. !!! warning Never return a placeholder like `"[Rendered UI]"` as the only content. If the fallback text is useless, the tool is useless to every text-only client and to the model itself. Write the sentence. 
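The negotiation rule reads naturally as a two-part check. The sketch below is an assumption about the logic, written against a plain dictionary of declared extensions rather than a real `Context`. `supports_apps` is a hypothetical name, and the two constants are local copies of the values named in the text.

```python
EXTENSION_ID = "io.modelcontextprotocol/ui"
APP_MIME_TYPE = "text/html;profile=mcp-app"


def supports_apps(client_extensions: dict[str, dict]) -> bool:
    """True only if the client declared the extension AND the app MIME type."""
    settings = client_extensions.get(EXTENSION_ID)
    if settings is None:
        return False  # extension not declared at all
    # mimeTypes is required: a client that omits it does not count.
    return APP_MIME_TYPE in settings.get("mimeTypes", [])


print(supports_apps({EXTENSION_ID: {"mimeTypes": [APP_MIME_TYPE]}}))  # True
print(supports_apps({EXTENSION_ID: {}}))                              # False
print(supports_apps({}))                                              # False
```

Both conditions must hold before the tool returns the bare value. In every other case the full sentence is the right answer.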
## Locking the iframe down The resource side carries the security metadata: what the iframe may load, which browser permissions it wants, how it would like to be framed: ```python title="server.py" hl_lines="9 19-22" # docs_src/apps/tutorial002.py from mcp.server.apps import Apps, ResourceCsp, ResourcePermissions from mcp.server.mcpserver import MCPServer DASHBOARD_HTML = "Dashboard" apps = Apps() @apps.tool(resource_uri="ui://dashboard/app.html", visibility=["app"]) def refresh_dashboard() -> str: """Refresh the dashboard data.""" return "refreshed" apps.add_html_resource( "ui://dashboard/app.html", DASHBOARD_HTML, title="Dashboard", csp=ResourceCsp(connect_domains=["https://api.example.com"]), permissions=ResourcePermissions(clipboard_write={}), domain="dashboard.example.com", prefers_border=True, ) mcp = MCPServer("dashboard", extensions=[apps]) ``` `csp` and `permissions` are **requests to the host**, not server behaviour. The host builds the iframe's Content-Security-Policy and Permissions-Policy from them, and it may refuse. Feature-detect in your JS rather than assuming a grant. `ResourceCsp`, field by field (Python name, wire key, what the host does with it): | Python | Wire (`_meta.ui.csp`) | Controls | |---|---|---| | `connect_domains` | `connectDomains` | `connect-src`: where `fetch`/XHR may go | | `resource_domains` | `resourceDomains` | `img-src`, `style-src`, ...: static assets | | `frame_domains` | `frameDomains` | `frame-src`: nested iframes | | `base_uri_domains` | `baseUriDomains` | `base-uri`: what `<base>` may point at | `ResourcePermissions`: each field requests a browser permission for the iframe. | Python | Wire (`_meta.ui.permissions`) | |---|---| | `camera` | `camera` | | `microphone` | `microphone` | | `geolocation` | `geolocation` | | `clipboard_write` | `clipboardWrite` | !!! note CSP and permissions live on the **resource**, never on the tool. The spec's tool metadata has no slot for them, and hosts ignore them there. 
The SDK makes the mistake unrepresentable: `@apps.tool()` simply has no `csp` parameter. ### Visibility `visibility=["app"]` on a tool says "this exists for the iframe, not the model": * `"model"`: the model may call it. * `"app"`: the iframe may call it (via `callServerTool`). * Omitted: both, which is the default. Filtering is the **host's** job. Your server lists app-only tools in `tools/list` like any other; the host hides them from the model. Don't filter server-side. ## The rules the SDK enforces All of these fail at startup, not in production: * A `resource_uri` or resource URI that isn't `ui://...` is a `ValueError` at decoration/registration time. * A tool bound to a URI with **no matching registered resource** is a `ValueError` when `MCPServer(extensions=[apps])` consumes the extension. A tool advertising HTML that 404s on `resources/read` is a misconfiguration, so it refuses to construct. * `meta={"ui": ...}` on `@apps.tool()` is a `ValueError`. The decorator owns `_meta["ui"]`; say it with `resource_uri=` and `visibility=`. Other `meta=` keys merge fine alongside. Neither the TypeScript ext-apps SDK nor FastMCP catches any of these today; we'd rather you find out before a host does. ## Beyond inline HTML `add_html_resource` covers the common case: a string of HTML. 
For anything else, HTML on disk or generated content, build the resource yourself and hand it over: ```python title="server.py" hl_lines="12 18" # docs_src/apps/tutorial003.py from pathlib import Path from mcp.server.apps import Apps from mcp.server.mcpserver import MCPServer from mcp.server.mcpserver.resources import FileResource REPORT_HTML = Path(__file__).parent / "report.html" apps = Apps() @apps.tool(resource_uri="ui://report/app.html") def refresh_report() -> str: """Refresh the report data.""" return "report refreshed" apps.add_resource(FileResource(uri="ui://report/app.html", name="report", path=REPORT_HTML)) mcp = MCPServer("report", extensions=[apps]) ``` `add_resource` fills in the `text/html;profile=mcp-app` MIME type when the resource doesn't set one explicitly, and rejects an explicit mismatch: a `ui://` resource under any other MIME type is one no host will render. !!! tip Targeting a pre-GA host that still reads the deprecated flat `_meta["ui/resourceUri"]` key? Merge it yourself: `@apps.tool(resource_uri="ui://x", meta={"ui/resourceUri": "ui://x"})`. The nested `ui` object is the spec shape; the flat key is on its way out. ## See it run The `apps` story in `examples/stories/` is this page as a runnable pair: a server with a UI-bound clock tool and a client that negotiates Apps, reads the tool's `_meta.ui.resourceUri`, fetches the HTML, and calls the tool. ```bash uv run python -m stories.apps.client ``` # OpenTelemetry Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/opentelemetry/ Your server is already traced. You don't have to add anything. Every server you create emits an [OpenTelemetry](https://opentelemetry.io/) span for every message it handles. You didn't write that, and you don't import it. It is there the moment you call `MCPServer(...)`. 
```python title="server.py" # docs_src/opentelemetry/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Bookshop") @mcp.tool() def search_books(query: str) -> str: """Search the catalog by title or author.""" return f"Found 3 books matching {query!r}." ``` That is a complete, traced server. Call `search_books` and a span is created for it. The same is true for the low-level `Server`: the tracing lives on both. ## What you get Every inbound message becomes a `SERVER` span named after the method and its target. So a `tools/call` for `search_books` is the span `tools/call search_books`, and a bare `tools/list` is just `tools/list`. Each span carries a few attributes: * `mcp.method.name` and `mcp.protocol.version`, on every span. * `jsonrpc.request.id`, on a request (a notification has none). * A handler that raises sets the span status to error. So does a tool result with `is_error=True`. And because tracing a tool call is such a common thing to want, `tools/call` spans speak OpenTelemetry's [GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/): * `gen_ai.operation.name`, set to `"execute_tool"`. * `gen_ai.tool.name`, set to the tool being called. A `prompts/get` span gets `gen_ai.prompt.name` in the same spirit. The list methods carry no `gen_ai.*` keys, because there is nothing to name. !!! tip Those GenAI attributes are the reason a tracing UI groups your tool calls the way it groups any other agent's. You get that grouping for free, with no extra code. ## It costs nothing until you want it Here is the part that makes "on by default" a comfortable default. The SDK depends only on `opentelemetry-api`, the lightweight half of OpenTelemetry. With no SDK and no exporter installed, creating a span is a no-op. So the spans your server is emitting right now cost you almost nothing, and nobody is collecting them. 
The day you want to *see* them, you install the other half and point it somewhere: ```console uv add opentelemetry-sdk opentelemetry-exporter-otlp ``` Configure an exporter the usual OpenTelemetry way, and every span the SDK has been quietly creating lights up. Your server code does not change. Not one line. !!! info [Pydantic Logfire](https://logfire.pydantic.dev/) is one such backend, and it does the configuration for you: `pip install logfire`, `logfire.configure()`, and your MCP spans show up in the live view. It is built on OpenTelemetry, so anything below applies to it too. ## Traces that cross the wire A trace is most useful when it follows a request from the client into the server, in one connected picture. When the client and the server both run the SDK, that connection is automatic. The client injects the [W3C trace context](https://www.w3.org/TR/trace-context/) into the request, and the server reads it back out, so the server span nests under the client span in the same trace. This is [SEP-414](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/414), and you get it without asking. If the inbound message carries no trace context, for example a request from a client that is not the SDK, the server span simply parents to whatever span is already current on the server, rather than starting a brand-new orphan trace. ## Turning it off Tracing is a middleware, the first one on your server's list. If you really want a server that emits no spans, take it off: ```python from mcp.server._otel import OpenTelemetryMiddleware mcp._lowlevel_server.middleware[:] = [ m for m in mcp._lowlevel_server.middleware if not isinstance(m, OpenTelemetryMiddleware) ] ``` !!! warning That import has a leading underscore, and that is on purpose. The class is provisional, the same way [`Server.middleware`](https://py.sdk.modelcontextprotocol.io/v2/advanced/middleware/index.md) is provisional, so the import path is something you should expect to change. 
You almost never need this: with no exporter installed the spans are free, so the usual answer is to leave them on and not install an exporter. ## Recap * Every `MCPServer` and every low-level `Server` emits one `SERVER` span per inbound message, out of the box. You write nothing. * Spans carry `mcp.method.name` and `mcp.protocol.version`; `tools/call` and `prompts/get` also carry GenAI attributes so your tool calls group like any other agent's. * It costs nothing until you install an OpenTelemetry SDK and an exporter, and then it lights up with no change to your server. * Client-to-server trace context propagates automatically when both sides run the SDK. Next, the thing that decides whether a request runs at all: **[Authorization](https://py.sdk.modelcontextprotocol.io/v2/advanced/authorization/index.md)**. # Authorization Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/authorization/ Over Streamable HTTP your MCP server is an ordinary web service, and you protect it the way you protect any web service: with OAuth 2.1 bearer tokens. In OAuth terms, your server is a **resource server**. It never signs anyone in and it never issues a token. It does one thing: look at the `Authorization` header on each request and decide whether the token in it is good. ## The three parties * The **authorization server** signs people in and issues access tokens. You don't write this. It's your identity provider (Auth0, Keycloak, Entra, your own). * The **resource server** is your MCP server. It verifies the token on every request. * The **client** discovers which authorization server you trust, gets a token from it, and sends it back to you as `Authorization: Bearer <token>`. That's the whole triangle. Everything on this page is the middle bullet. ## A token verifier The SDK has no opinion about what a valid token looks like. 
You tell it, by implementing **`TokenVerifier`**: ```python title="server.py" hl_lines="12-14 19-24" # docs_src/authorization/tutorial001.py from pydantic import AnyHttpUrl from mcp.server import MCPServer from mcp.server.auth.provider import AccessToken, TokenVerifier from mcp.server.auth.settings import AuthSettings KNOWN_TOKENS = { "alice-token": AccessToken(token="alice-token", client_id="alice", scopes=["notes:read"]), } class StaticTokenVerifier(TokenVerifier): async def verify_token(self, token: str) -> AccessToken | None: return KNOWN_TOKENS.get(token) mcp = MCPServer( "Notes", token_verifier=StaticTokenVerifier(), auth=AuthSettings( issuer_url=AnyHttpUrl("https://auth.example.com"), resource_server_url=AnyHttpUrl("http://127.0.0.1:8000/mcp"), required_scopes=["notes:read"], ), ) @mcp.tool() def list_notes() -> list[str]: """List every note in the notebook.""" return ["Buy milk", "Ship the release"] ``` * `TokenVerifier` is a protocol with one async method. `verify_token` gets the raw token from the `Authorization` header and returns an **`AccessToken`** if it's valid, `None` if it isn't. There is nothing else to implement. * This one looks the token up in a table. A real one verifies a JWT signature or calls the authorization server's token-introspection endpoint. That code is yours; the SDK only calls it. * `token_verifier=` and `auth=` always travel together. Pass one without the other and `MCPServer(...)` raises a `ValueError` before it ever serves a request. `AuthSettings` is the public face of your resource server: * `issuer_url`: the authorization server that issues your tokens. * `resource_server_url`: the public URL of this MCP endpoint. It names *which* resource a token is for, and it's where the discovery document lives. * `required_scopes`: every token must carry all of them. !!! 
tip `examples/servers/simple-auth/` in the SDK repository has an `IntrospectionTokenVerifier` that calls a real authorization server's [RFC 7662](https://datatracker.ietf.org/doc/html/rfc7662) endpoint. It's the shape most production verifiers take. ## What you get over HTTP Authorization lives in HTTP headers, so it exists only on the HTTP transports. Run it on the one you deploy: `mcp.run(transport="streamable-http")` puts it on `http://127.0.0.1:8000/mcp`, and **[Running your server](https://py.sdk.modelcontextprotocol.io/v2/run/index.md)** has the rest. The app now has two routes: ```text /mcp /.well-known/oauth-protected-resource/mcp ``` You registered one tool. The second route is the SDK's. ### Discovery `GET` that well-known path and you get **[RFC 9728](https://datatracker.ietf.org/doc/html/rfc9728) Protected Resource Metadata**, built straight from your `AuthSettings`: ```json { "resource": "http://127.0.0.1:8000/mcp", "authorization_servers": ["https://auth.example.com/"], "scopes_supported": ["notes:read"], "bearer_methods_supported": ["header"] } ``` This document is how a client that has never heard of your server finds its way in: it reads `authorization_servers` and goes there for a token. You wrote none of it. !!! check Call `/mcp` with no token (or with one your verifier returned `None` for) and the request is stopped at the door: ```text HTTP/1.1 401 Unauthorized WWW-Authenticate: Bearer error="invalid_token", error_description="Authentication required", resource_metadata="http://127.0.0.1:8000/.well-known/oauth-protected-resource/mcp" {"error": "invalid_token", "error_description": "Authentication required"} ``` Nothing was parsed and no tool ran. And that `resource_metadata` pointer in `WWW-Authenticate` is what makes discovery automatic: 401 -> metadata document -> authorization server -> token -> retry. !!! warning None of this protects `stdio`. A pipe has no `Authorization` header, so `token_verifier` is never consulted there. 
A `stdio` server's security boundary is the process that launched it. The same goes for the in-memory `Client(mcp)` you use in tests: it connects straight to the server object and skips the HTTP layer, authorization included. ## The caller's identity Inside any handler, **`get_access_token()`** is the `AccessToken` your verifier returned for the current request: ```python title="server.py" hl_lines="4 32-35" # docs_src/authorization/tutorial002.py from pydantic import AnyHttpUrl from mcp.server import MCPServer from mcp.server.auth.middleware.auth_context import get_access_token from mcp.server.auth.provider import AccessToken, TokenVerifier from mcp.server.auth.settings import AuthSettings KNOWN_TOKENS = { "alice-token": AccessToken(token="alice-token", client_id="alice", scopes=["notes:read"]), } class StaticTokenVerifier(TokenVerifier): async def verify_token(self, token: str) -> AccessToken | None: return KNOWN_TOKENS.get(token) mcp = MCPServer( "Notes", token_verifier=StaticTokenVerifier(), auth=AuthSettings( issuer_url=AnyHttpUrl("https://auth.example.com"), resource_server_url=AnyHttpUrl("http://127.0.0.1:8000/mcp"), required_scopes=["notes:read"], ), ) @mcp.tool() def whoami() -> str: """Report which OAuth client is calling.""" token = get_access_token() if token is None: return "anonymous" return f"{token.client_id} (scopes: {', '.join(token.scopes)})" ``` * It works in tools, resources, and prompts, and there is nothing to pass around: the auth middleware stores it in a context variable per request. * You get back the **same object your verifier built**: `client_id`, `scopes`, `subject`, `expires_at`, and any extra `claims` you attached. That's the hook for per-tool rules: read the scopes and refuse. * Outside an authenticated HTTP request it returns `None`. In-memory and over `stdio` it is always `None`. 
Call `whoami` with `Authorization: Bearer alice-token` and the model reads: ```text alice (scopes: notes:read) ``` ## The half the SDK doesn't do The SDK gives you the resource-server half: verify, advertise, refuse. It does not give you a login page, a consent screen, or a token. To watch all three parties move, run `examples/servers/simple-auth/` from the SDK repository (a small authorization server and a resource server set up exactly like this page) and then point `examples/clients/simple-auth-client/` at it for the full discovery-and-token dance. !!! info There is a second constructor argument, `auth_server_provider=`, that embeds a full authorization server inside your MCP server. It predates the AS/RS separation that the MCP authorization spec is built around. New servers should not reach for it. An authorization server can also accept an enterprise identity provider's signed assertion in place of a user clicking through a consent screen, and the SDK supports both sides of that exchange. The grant, and the client that presents it, is **[Identity assertion](https://py.sdk.modelcontextprotocol.io/v2/advanced/identity-assertion/index.md)**. ## Recap * Over Streamable HTTP your server is an OAuth 2.1 **resource server**: it verifies tokens, it never issues them. * `TokenVerifier` is the whole integration surface: one async method, token in, `AccessToken | None` out. * `token_verifier=` and `auth=AuthSettings(issuer_url=..., resource_server_url=..., required_scopes=[...])` always travel together. * The SDK publishes [RFC 9728](https://datatracker.ietf.org/doc/html/rfc9728) Protected Resource Metadata at `/.well-known/oauth-protected-resource/...` and answers unauthenticated requests with a 401 whose `WWW-Authenticate` header points at it. That is the entire discovery story. * `get_access_token()` in any handler is who's calling. * Authorization is an HTTP concern. `stdio` and the in-memory client never see it. 
The other side of the handshake, a client that discovers your authorization server and fetches the token for you, is **[OAuth clients](https://py.sdk.modelcontextprotocol.io/v2/advanced/oauth-clients/index.md)**. # OAuth clients Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/oauth-clients/ Some MCP servers are protected. Send them a request without a token and they answer `401 Unauthorized`. **`OAuthClientProvider`** is how you get the token. It is not an MCP object at all. It is an `httpx.Auth`, the standard httpx hook for "do something to every request". You attach it to an `httpx.AsyncClient`, hand that client to the Streamable HTTP transport, and stop thinking about it. This chapter is the client side. Making your own server demand a token is **[Authorization](https://py.sdk.modelcontextprotocol.io/v2/advanced/authorization/index.md)**. ## The provider ```python title="client.py" hl_lines="44-54" # docs_src/oauth_clients/tutorial001.py from urllib.parse import parse_qs, urlparse import httpx from pydantic import AnyUrl from mcp import Client from mcp.client.auth import AuthorizationCodeResult, OAuthClientProvider from mcp.client.streamable_http import streamable_http_client from mcp.shared.auth import OAuthClientInformationFull, OAuthClientMetadata, OAuthToken class InMemoryTokenStorage: def __init__(self) -> None: self.tokens: OAuthToken | None = None self.client_info: OAuthClientInformationFull | None = None async def get_tokens(self) -> OAuthToken | None: return self.tokens async def set_tokens(self, tokens: OAuthToken) -> None: self.tokens = tokens async def get_client_info(self) -> OAuthClientInformationFull | None: return self.client_info async def set_client_info(self, client_info: OAuthClientInformationFull) -> None: self.client_info = client_info async def open_browser(authorization_url: str) -> None: print(f"Visit: {authorization_url}") async def wait_for_callback() -> AuthorizationCodeResult: redirect_url = input("Paste the URL you were 
redirected to: ") params = parse_qs(urlparse(redirect_url).query) return AuthorizationCodeResult( code=params["code"][0], state=params["state"][0], iss=params["iss"][0] if "iss" in params else None, ) oauth = OAuthClientProvider( server_url="http://localhost:8001/mcp", client_metadata=OAuthClientMetadata( client_name="Bookshop Agent", redirect_uris=[AnyUrl("http://localhost:3030/callback")], scope="user", ), storage=InMemoryTokenStorage(), redirect_handler=open_browser, callback_handler=wait_for_callback, ) async def main() -> None: async with httpx.AsyncClient(auth=oauth, follow_redirects=True) as http_client: transport = streamable_http_client("http://localhost:8001/mcp", http_client=http_client) async with Client(transport) as client: result = await client.list_tools() print([tool.name for tool in result.tools]) ``` You give it four things: * `server_url`: the MCP endpoint you are connecting to. The provider discovers everything else from it. * `client_metadata`: what you would type into an authorization server's "register an application" form. * `storage`: where tokens live between runs. * `redirect_handler` and `callback_handler`: the two moments a human is involved. Nothing else in the file mentions OAuth. `main()` never sees a token. ### Client metadata `OAuthClientMetadata` is the real [RFC 7591](https://datatracker.ietf.org/doc/html/rfc7591) registration document, as a Pydantic model. You set three fields. The defaults fill in the rest: `grant_types` is already `["authorization_code", "refresh_token"]` and `response_types` is already `["code"]`, which is exactly the flow this provider runs. !!! check Because it is a Pydantic model, it validates **before a single byte goes over the network**. 
Leave out `redirect_uris` and construction fails on the spot with a `ValidationError` that names the field: ```text redirect_uris Field required [type=missing, input_value={'client_name': 'Bookshop Agent'}, input_type=dict] ``` No browser opened, no half-finished registration left behind on the authorization server. ### Token storage **`TokenStorage`** is a `Protocol` with four async methods. You don't inherit from anything; write the methods and any class is a token store: * `get_tokens` / `set_tokens` hold the `OAuthToken`: access token, refresh token, expiry, scope. * `get_client_info` / `set_client_info` hold the `OAuthClientInformationFull` the authorization server issued when the provider registered you, including your `client_id`. The in-memory version above works. It also forgets everything when the process exits, so the next run does the whole dance again. Persist it to a file or your platform's keyring and the next run is silent. !!! tip Store `client_info`, not only the tokens. The provider registers dynamically the first time it finds no stored `client_info`. Throw it away and you mint a fresh registration on every run. ### The two handlers The authorization code flow needs a human exactly once: someone has to sign in and click "allow". * **`redirect_handler`** is awaited with the fully-built authorization URL. The `client_id`, the `redirect_uri`, the `state` and the PKCE challenge are already in it. Your only job is to get a browser there. A desktop app calls `webbrowser.open`; this file prints it. * **`callback_handler`** is awaited next. It waits until the user lands back on your `redirect_uri` and returns that redirect's query parameters as an `AuthorizationCodeResult`. A real client runs a small local HTTP server on the redirect URI instead of calling `input()`. The shape is identical: get redirected, hand back `code`, `state`, and `iss`. !!! warning Pass `state` and `iss` through exactly as they arrived. 
The provider compares `state` to the one it generated and `iss` to the issuer it discovered, and refuses a mismatch. They are the CSRF and server-mix-up defences. ### Into the `Client` Look at `main()`. The provider goes on the **httpx client**, the httpx client goes into `streamable_http_client(url, http_client=...)`, and that transport goes into `Client`. `streamable_http_client` has no `auth=` keyword. Anything HTTP-level (auth, headers, timeouts, proxies) belongs on the `httpx.AsyncClient` you bring. That layering is **[Client transports](https://py.sdk.modelcontextprotocol.io/v2/client/transports/index.md)**. ## What the provider does for you The first time `Client` sends a request, the server answers `401`. The provider takes over: 1. **Discovery.** It reads the `WWW-Authenticate` header, fetches the server's Protected Resource Metadata from `/.well-known/oauth-protected-resource`, learns which authorization server protects this resource, and fetches *that* server's metadata. 2. **Registration.** Nothing in storage? It registers you dynamically with your `OAuthClientMetadata` and stores the result. 3. **Authorization.** It generates the PKCE pair and a `state`, builds the authorization URL, awaits your `redirect_handler`, then awaits your `callback_handler` for the code. 4. **Exchange.** It trades the code for an `OAuthToken`, stores it, and replays your original request with `Authorization: Bearer ...`. After that it is quiet. Tokens come out of storage, an expired access token is refreshed with the refresh token, and only when none of that works does it run the flow again. You wrote none of it. Three keyword arguments remain (`timeout`, `client_metadata_url` and `validate_resource_url`), and this file needs none of them. ### Try it Everything else in these docs you have checked with an in-memory `Client(server)`. Not this: the whole point of the flow is an HTTP `401`, and there is no HTTP between an in-memory client and its server. 
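Step 3 in the list above mentions a PKCE pair and a `state`. Here is a minimal sketch of how such values are commonly generated, following the S256 method from RFC 7636. The provider does this for you. `make_pkce_pair` is a hypothetical name, and this is not the SDK's code.

```python
import base64
import hashlib
import secrets


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method."""
    # 64 random bytes become an 86-character URL-safe string (allowed: 43-128).
    verifier = secrets.token_urlsafe(64)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # The challenge is the base64url-encoded SHA-256 digest, without padding.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


verifier, challenge = make_pkce_pair()
state = secrets.token_urlsafe(32)  # opaque value echoed back on the redirect
print(len(verifier), len(challenge))  # 86 43
```

The challenge travels in the authorization URL and the verifier only in the later token exchange, so an intercepted code is useless without it.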
The repository ships the live version. `examples/servers/simple-auth/` runs a standalone authorization server and a protected MCP server; `examples/clients/simple-auth-client/` is this chapter's client grown into a small CLI. Its README has the two commands: start the servers, run the client against them, and you watch the four steps go by. ## Machine to machine A nightly job, a CI step, another service. There is no browser and nobody to click "allow". That is the **client credentials** grant: you already hold a `client_id` and a `client_secret`, and the token endpoint is the whole flow. `ClientCredentialsOAuthProvider` is the same `httpx.Auth`, minus the human: ```python title="client.py" hl_lines="4 27-33" # docs_src/oauth_clients/tutorial002.py import httpx from mcp import Client from mcp.client.auth.extensions.client_credentials import ClientCredentialsOAuthProvider from mcp.client.streamable_http import streamable_http_client from mcp.shared.auth import OAuthClientInformationFull, OAuthToken class InMemoryTokenStorage: def __init__(self) -> None: self.tokens: OAuthToken | None = None self.client_info: OAuthClientInformationFull | None = None async def get_tokens(self) -> OAuthToken | None: return self.tokens async def set_tokens(self, tokens: OAuthToken) -> None: self.tokens = tokens async def get_client_info(self) -> OAuthClientInformationFull | None: return self.client_info async def set_client_info(self, client_info: OAuthClientInformationFull) -> None: self.client_info = client_info oauth = ClientCredentialsOAuthProvider( server_url="http://localhost:8001/mcp", storage=InMemoryTokenStorage(), client_id="reporting-agent", client_secret="...", scopes="user", ) async def main() -> None: async with httpx.AsyncClient(auth=oauth, follow_redirects=True) as http_client: transport = streamable_http_client("http://localhost:8001/mcp", http_client=http_client) async with Client(transport) as client: result = await client.list_tools() print([tool.name for tool in 
result.tools]) ``` What changed: * No `OAuthClientMetadata`, no handlers. You pass `client_id` and `client_secret`; the provider builds a minimal `client_credentials` registration around them and skips dynamic registration entirely. * `scopes` is a space-separated string, the OAuth wire format. * Everything downstream is identical: the same `TokenStorage`, the same `httpx.AsyncClient(auth=...)`, the same `streamable_http_client`. By default the secret travels as HTTP Basic auth on the token request (`client_secret_basic`). Pass `token_endpoint_auth_method="client_secret_post"` to put it in the form body instead. Some authorization servers only accept one of the two. !!! tip Read `client_secret` from the environment or a secret manager, never from source control. !!! info One more provider lives in `mcp.client.auth.extensions.client_credentials`: **`PrivateKeyJWTOAuthProvider`**, for clients that authenticate with a JWT instead of a shared secret (`private_key_jwt`, the key-pair and workload-identity flavour). It follows the same pattern: construct one, put it on `auth=`. The same module ships `SignedJWTParameters` and `static_assertion_provider`, two helpers that build its assertion. There is one more no-human situation: the client belongs to an enterprise whose identity provider, not the user, decides which MCP servers it may reach. That is a different grant with its own trust model and its own chapter, **[Identity assertion](https://py.sdk.modelcontextprotocol.io/v2/advanced/identity-assertion/index.md)**. ## When it fails When the OAuth flow goes wrong, the provider raises an `OAuthFlowError` from `mcp.client.auth`. It has two subclasses. `OAuthRegistrationError` means the authorization server refused to register you. `OAuthTokenError` means the token endpoint said no. One `except OAuthFlowError:` covers discovery, registration, authorization, and exchange. Not everything is a flow error. 
The network can still fail; those are ordinary `httpx` exceptions and pass through untouched. ## Recap * `OAuthClientProvider` is an `httpx.Auth`. Put it on an `httpx.AsyncClient`, pass that to `streamable_http_client(url, http_client=...)`, and `Client` never knows OAuth happened. * You supply four things: the server URL, an `OAuthClientMetadata`, a `TokenStorage`, and the redirect/callback handler pair. * `TokenStorage` is a `Protocol`: four async methods, no base class. Persist `client_info` as well as the tokens. * Discovery, dynamic registration, PKCE, the `state` and `iss` checks, and token refresh are the provider's job, not yours. * `ClientCredentialsOAuthProvider` is the no-human version: `client_id` + `client_secret`, no handlers, no browser. * Every OAuth failure is an `OAuthFlowError`; `OAuthRegistrationError` and `OAuthTokenError` are its subclasses. The other half of this handshake, making your *server* demand the token, is **[Authorization](https://py.sdk.modelcontextprotocol.io/v2/advanced/authorization/index.md)**. # Identity assertion Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/identity-assertion/ Every provider in **[OAuth clients](https://py.sdk.modelcontextprotocol.io/v2/advanced/oauth-clients/index.md)** starts by asking the MCP server a question: *which authorization server do you trust?* It follows the answer wherever it points, and then either a person signs in or a pre-shared secret stands in for one. An enterprise wants neither decided per server. It already runs an identity provider (Okta, Microsoft Entra ID, your own); the user already signed in to it this morning; and it is the one place the security team wants to decide who may reach what. [SEP-990](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/990), the **Enterprise-Managed Authorization** extension, moves the decision there. 
The IdP signs a short-lived JWT, an **Identity Assertion JWT Authorization Grant**, the **ID-JAG**: a statement that *this user*, through *this client*, may reach *this MCP server*. The client trades it for an ordinary access token. No browser, no consent screen, no dynamic registration. This chapter is both ends of that trade. The MCP server itself never changes: it is still the resource server from **[Authorization](https://py.sdk.modelcontextprotocol.io/v2/advanced/authorization/index.md)**, checking whatever token shows up. ## Two token requests Two different authorities are in play, and naming them apart is most of understanding this page. The **enterprise IdP** is your organization's identity provider: it knows who the employee is, it is where policy lives, and it issues the ID-JAG. The SDK never talks to it. The **MCP authorization server** is the same party it was in **[Authorization](https://py.sdk.modelcontextprotocol.io/v2/advanced/authorization/index.md)**: the issuer named in the MCP server's metadata, the thing that mints the tokens that MCP server accepts. In the flows you already know, those two roles are usually one box. Here they are two, and the whole grant is the second agreeing to trust the first. The client makes one token request to each. 1. **To the enterprise IdP.** The client trades the user's sign-in (their OpenID Connect ID token) for the ID-JAG. This is an [RFC 8693](https://datatracker.ietf.org/doc/html/rfc8693) token exchange, it is entirely your IdP's API, and **the SDK does not make it**. You do, inside one async callback. It is also where the policy decision happens: an IdP that says no never issues the ID-JAG, and there is nothing to present. 2. **To the MCP authorization server.** The client presents the ID-JAG under the [RFC 7523](https://datatracker.ietf.org/doc/html/rfc7523) `jwt-bearer` grant (`grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer`, the ID-JAG as `assertion`) and receives the access token. 
**This is the request the SDK makes**, and accepting it is the one thing this page adds to an authorization server. Everything below is the second request: the client that sends it and the authorization server that answers it. ## The client **`IdentityAssertionOAuthProvider`** lives in `mcp.client.auth.extensions.identity_assertion`. Like every provider in **[OAuth clients](https://py.sdk.modelcontextprotocol.io/v2/advanced/oauth-clients/index.md)** it is an `httpx.Auth`: construct one, put it on `auth=`, hand the `httpx.AsyncClient` to the transport. ```python title="client.py" hl_lines="49-50 53-61" # docs_src/identity_assertion/tutorial001.py import time import uuid import httpx import jwt from mcp import Client from mcp.client.auth.extensions.identity_assertion import IdentityAssertionOAuthProvider from mcp.client.streamable_http import streamable_http_client from mcp.shared.auth import OAuthClientInformationFull, OAuthToken IDP_SIGNING_KEY = "the-enterprise-idp-signing-key" class InMemoryTokenStorage: def __init__(self) -> None: self.tokens: OAuthToken | None = None self.client_info: OAuthClientInformationFull | None = None async def get_tokens(self) -> OAuthToken | None: return self.tokens async def set_tokens(self, tokens: OAuthToken) -> None: self.tokens = tokens async def get_client_info(self) -> OAuthClientInformationFull | None: return self.client_info async def set_client_info(self, client_info: OAuthClientInformationFull) -> None: self.client_info = client_info def idp_issue_id_jag(subject: str, audience: str, resource: str) -> str: now = int(time.time()) claims = { "iss": "https://idp.example.com", "sub": subject, "aud": audience, "client_id": "finance-agent", "resource": resource, "scope": "notes:read", "jti": str(uuid.uuid4()), "iat": now, "exp": now + 300, } return jwt.encode(claims, IDP_SIGNING_KEY, algorithm="HS256", headers={"typ": "oauth-id-jag+jwt"}) async def fetch_id_jag(audience: str, resource: str) -> str: return 
idp_issue_id_jag("alice@example.com", audience, resource) oauth = IdentityAssertionOAuthProvider( server_url="http://localhost:8001/mcp", storage=InMemoryTokenStorage(), client_id="finance-agent", client_secret="finance-agent-secret", issuer="https://auth.example.com/", assertion_provider=fetch_id_jag, scope="notes:read", ) async def main() -> None: async with httpx.AsyncClient(auth=oauth, follow_redirects=True) as http_client: transport = streamable_http_client("http://localhost:8001/mcp", http_client=http_client) async with Client(transport) as client: result = await client.list_tools() print([tool.name for tool in result.tools]) ``` Read it from the bottom. * `main()` is the `main()` from **[OAuth clients](https://py.sdk.modelcontextprotocol.io/v2/advanced/oauth-clients/index.md)**, line for line. That is the point: once the provider exists, nothing downstream knows which grant produced the token. * The provider takes what the other providers cannot discover: a `client_id` and `client_secret` somebody **pre-registered** with the authorization server, that authorization server's `issuer`, and `assertion_provider`, an async callback that returns a fresh ID-JAG on demand. * `storage` is the same `TokenStorage` protocol. Only the two token methods are ever called; there is no dynamic registration here, so there is no `client_info` to remember. ### The assertion provider `fetch_id_jag(audience, resource)` is the only code you write. It is awaited once per token exchange, never at construction, and only *after* the authorization server's metadata has been fetched and validated, so a misconfigured issuer never leaks an assertion. Its two arguments are two of the claims the ID-JAG must be minted with: `audience` is the authorization server's issuer (the ID-JAG `aud`) and `resource` is the MCP server's canonical identifier (the ID-JAG `resource`). 
The third is one you already hold: the ID-JAG's `client_id` claim must name the `client_id` you gave the provider, or the authorization server refuses the exchange. `idp_issue_id_jag` above it is **not your code**. It stands in for the identity provider, signing the assertion in-process so the file is complete and you can read every claim an ID-JAG carries. A real `fetch_id_jag` makes the first token request of the previous section instead: an [RFC 8693](https://datatracker.ietf.org/doc/html/rfc8693) token exchange against your IdP, defined by the Identity Assertion JWT Authorization Grant draft that [SEP-990](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/990) profiles. The signed-in user's ID token goes in as the `subject_token`, the `requested_token_type` is the ID-JAG's own URN (`urn:ietf:params:oauth:token-type:id-jag`), `audience` and `resource` pass straight through, and the response carries the ID-JAG. That exchange, under those names, is what to look for in your IdP's documentation. !!! tip A fresh ID-JAG is requested for every exchange, and that is the point: it is a single-use, minutes-lived grant, and the authorization server on this page refuses to accept the same one twice. Do not cache it. The access token it buys you is the thing that gets reused. ### The issuer is configuration Here is the inversion. `OAuthClientProvider` asks the resource server which authorization server to use and follows the answer wherever it points. This provider refuses to: `issuer` is required, the [RFC 8414](https://datatracker.ietf.org/doc/html/rfc8414) metadata is fetched from that issuer's own well-known path, the token endpoint must be on that issuer's origin, and the resource server is never asked anything. The extension does not demand this; it is a deliberately stricter choice. 
This client carries two things worth stealing, a pre-registered secret and an audience-bound assertion, and a client that let a compromised MCP server steer it to an attacker's authorization server would post both to it. Pinning the issuer at construction deletes that conversation. !!! warning The configured `issuer` is compared to the metadata document's `issuer` field by RFC 8414 §3.3 simple string comparison: character for character, trailing slash included, no normalization. Do not guess it. Fetch `/.well-known/oauth-authorization-server` from your authorization server and copy the `issuer` value it returns. For the authorization server on this page that is `https://auth.example.com/`, with the slash, because its issuer was built from a pydantic URL object. A mismatch stops the flow at `OAuthFlowError: Authorization server metadata issuer mismatch` before a single credential or assertion is sent. ### A confidential client `client_secret` is required; the constructor raises `ValueError` without one. The IETF profile underneath [SEP-990](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/990) reserves this grant for confidential clients, SEP-990 requires the client to authenticate, and this SDK enforces both by insisting on a shared secret. `token_endpoint_auth_method` picks where it travels: `client_secret_post` (the default, in the form body) or `client_secret_basic` (an HTTP Basic header). The profile also permits `private_key_jwt`; this provider does not support it. !!! tip Read `client_secret` from the environment or a secret manager, never from source control. ### What the provider does for you The first request goes out unauthenticated, and the server's `401` starts the flow. 1. **Discovery.** It fetches the authorization server metadata from the configured issuer's [RFC 8414](https://datatracker.ietf.org/doc/html/rfc8414) well-known path, checks the document's `issuer` matches, and checks the token endpoint is on the issuer's origin. 2. 
**The assertion.** It awaits your `assertion_provider`. 3. **Exchange.** It POSTs the `jwt-bearer` grant to the token endpoint, stores the `OAuthToken`, and replays your original request with `Authorization: Bearer ...`. A `403` whose `WWW-Authenticate` names `insufficient_scope` runs steps 2 and 3 again with the union of your `scope` and the challenged one. (`scope` is only ever a request; this page's authorization server grants what the ID-JAG says and nothing else.) There is no refresh token anywhere in this: when the access token expires, the next `401` mints a fresh ID-JAG and exchanges again, and *that* is the lever the IdP holds. Failures are the same two exceptions as the rest of **[OAuth clients](https://py.sdk.modelcontextprotocol.io/v2/advanced/oauth-clients/index.md)**: `OAuthFlowError` for discovery and validation, its subclass `OAuthTokenError` when the token endpoint says no. ## The authorization server Most of the time you stop here. The MCP authorization server is somebody else's product, accepting ID-JAGs is its configuration to turn on, and the SDK's half of [SEP-990](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/990) is the client above. The SDK can also *be* the authorization server: `create_auth_routes` returns the authorization server's routes as a list any Starlette app can mount, which is how `examples/servers/simple-auth/` in the repository runs one. 
SEP-990 adds one flag and one method to that surface: ```python title="auth_server.py" hl_lines="48-50 105-107" # docs_src/identity_assertion/tutorial002.py import secrets import time import jwt from pydantic import AnyHttpUrl from starlette.applications import Starlette from mcp.server.auth.provider import ( AccessToken, AuthorizationCode, AuthorizationParams, AuthorizeError, IdentityAssertionParams, OAuthAuthorizationServerProvider, RefreshToken, TokenError, ) from mcp.server.auth.routes import create_auth_routes from mcp.shared.auth import JWT_BEARER_GRANT_TYPE, OAuthClientInformationFull, OAuthToken ISSUER = "https://auth.example.com/" MCP_SERVER = "http://localhost:8001/mcp" IDP_ISSUER = "https://idp.example.com" IDP_SIGNING_KEY = "the-enterprise-idp-signing-key" REGISTERED_CLIENTS = { "finance-agent": OAuthClientInformationFull( client_id="finance-agent", client_secret="finance-agent-secret", redirect_uris=None, grant_types=[JWT_BEARER_GRANT_TYPE], token_endpoint_auth_method="client_secret_post", ) } class EnterpriseAuthorizationServer(OAuthAuthorizationServerProvider[AuthorizationCode, RefreshToken, AccessToken]): def __init__(self) -> None: self.access_tokens: dict[str, AccessToken] = {} self.seen_jtis: set[str] = set() async def get_client(self, client_id: str) -> OAuthClientInformationFull | None: return REGISTERED_CLIENTS.get(client_id) async def load_access_token(self, token: str) -> AccessToken | None: return self.access_tokens.get(token) async def exchange_identity_assertion( self, client: OAuthClientInformationFull, params: IdentityAssertionParams ) -> OAuthToken: try: header = jwt.get_unverified_header(params.assertion) claims = jwt.decode( params.assertion, IDP_SIGNING_KEY, algorithms=["HS256"], issuer=IDP_ISSUER, audience=ISSUER, options={"require": ["iss", "sub", "aud", "exp", "iat", "jti", "client_id", "resource", "scope"]}, ) except jwt.InvalidTokenError as error: raise TokenError("invalid_grant", "the assertion did not verify") from error if 
header.get("typ") != "oauth-id-jag+jwt": raise TokenError("invalid_grant", "the assertion is not an ID-JAG") if claims["client_id"] != client.client_id: raise TokenError("invalid_grant", "the assertion was issued to a different client") if claims["resource"] != MCP_SERVER: raise TokenError("invalid_target", "the assertion is for a resource this server does not serve") if claims["jti"] in self.seen_jtis: raise TokenError("invalid_grant", "the assertion has already been used") self.seen_jtis.add(claims["jti"]) scopes = claims["scope"].split() access_token = f"mcp_{secrets.token_hex(16)}" self.access_tokens[access_token] = AccessToken( token=access_token, client_id=claims["client_id"], scopes=scopes, expires_at=int(time.time()) + 300, resource=claims["resource"], subject=claims["sub"], ) return OAuthToken(access_token=access_token, token_type="Bearer", expires_in=300, scope=" ".join(scopes)) async def authorize(self, client: OAuthClientInformationFull, params: AuthorizationParams) -> str: raise AuthorizeError("unauthorized_client", "this authorization server only accepts ID-JAGs") async def load_authorization_code(self, client: OAuthClientInformationFull, authorization_code: str) -> None: return None async def exchange_authorization_code( self, client: OAuthClientInformationFull, authorization_code: AuthorizationCode ) -> OAuthToken: raise TokenError("invalid_grant", "this authorization server only accepts ID-JAGs") async def load_refresh_token(self, client: OAuthClientInformationFull, refresh_token: str) -> None: return None async def exchange_refresh_token( self, client: OAuthClientInformationFull, refresh_token: RefreshToken, scopes: list[str] ) -> OAuthToken: raise TokenError("invalid_grant", "this authorization server only accepts ID-JAGs") provider = EnterpriseAuthorizationServer() auth_app = Starlette( routes=create_auth_routes(provider, issuer_url=AnyHttpUrl(ISSUER), identity_assertion_enabled=True) ) ``` * `identity_assertion_enabled=True` gates everything. 
Off, which is the default, `/token` answers this grant with `unsupported_grant_type` even if you implemented the hook, and the metadata does not mention it. On, the metadata gains the `jwt-bearer` grant type and lists `urn:ietf:params:oauth:grant-profile:id-jag` in `authorization_grant_profiles_supported`, the field the extension uses to advertise support. (This SDK's client never reads it: it is provisioned for one issuer and simply asks.) * **`exchange_identity_assertion`** is the hook. Before it runs, the SDK has authenticated the client, refused public clients, and refused clients whose registration does not list the grant. You get an `IdentityAssertionParams` (the raw `assertion`, the requested `scopes` and `resource`) and return a plain `OAuthToken`. * Dynamic client registration refuses this grant unconditionally, so `get_client` here serves a hand-provisioned client. An ID-JAG client cannot register itself into existence. * Half the class is refusals. `OAuthAuthorizationServerProvider` is the *whole* authorization server, so it also asks for the authorization-code flow; a server that signs users in as well implements those for real, and this one has exactly one door. !!! warning The SDK never decodes the assertion: only your deployment knows which IdP it trusts and which keys that IdP publishes, so everything inside `exchange_identity_assertion` is load-bearing. Verify the signature against the IdP's published keys (its JWKS; the shared secret here is the demo's), and `iss` and `exp`, per [RFC 7523](https://datatracker.ietf.org/doc/html/rfc7523) §3. Require the JWT header's `typ` to be `oauth-id-jag+jwt`, the profile's guard against some other JWT being replayed as a grant. Require `aud` to be your own issuer. Require the ID-JAG's `client_id` claim to equal the client the handler authenticated, and its `resource` claim to name a resource you actually serve. Track `jti` until the assertion's `exp` so it is accepted once. 
And take the granted scopes and, above all, the issued token's `resource` from the validated ID-JAG, never from the request: `params.resource` is whatever the client typed. The full processing rules are in the [Enterprise-Managed Authorization specification](https://modelcontextprotocol.io/extensions/auth/enterprise-managed-authorization). Reject a bad assertion with `TokenError("invalid_grant", ...)`. The other error code in this flow is `invalid_target`: an ID-JAG that names a resource you do not serve is refused with it, which is what stops this server minting tokens for somebody else's. And the granted scopes come from the ID-JAG's `scope` claim (an assertion without one is refused too); yours might map the user's groups instead. And notice what the returned `OAuthToken` does not carry: a refresh token. The IdP decides how long this user keeps access by deciding whether to issue the next ID-JAG. A refresh token minted here would quietly hand that decision back. !!! info A server that still embeds its authorization server with `auth_server_provider=` reaches the same code through `AuthSettings(identity_assertion_enabled=True)`. **[Authorization](https://py.sdk.modelcontextprotocol.io/v2/advanced/authorization/index.md)** explains why new servers should not start there. !!! check Wire the two files on this page together and the whole grant is one `POST /token`: ```text grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer assertion=eyJhbGciOiJIUzI1NiIsInR5cCI6Im9hdXRoLWlkLWphZytqd3QifQ... client_id=finance-agent resource=http://localhost:8001/mcp scope=notes:read client_secret=finance-agent-secret HTTP/1.1 200 OK {"access_token": "mcp_...", "token_type": "Bearer", "expires_in": 300, "scope": "notes:read"} ``` No `/authorize`, no `/register`, no protected-resource-metadata fetch. The only requests on the wire are the one that drew the `401`, the well-known fetch, this exchange, and then ordinary MCP traffic with the bearer attached. 
And the `sub` your validator read out of the ID-JAG is exactly what `get_access_token().subject` reports inside a tool. ### Try it `examples/stories/identity_assertion/` in the SDK repository is this page running for real: the same `exchange_identity_assertion` validator, an MCP server gated on its tokens, a stand-in IdP, and the client, in one self-checking program. `uv run python -m stories.identity_assertion.client --http` runs the whole exchange and asserts that the user the IdP named is the user the tool sees. ## Recap * [SEP-990](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/990) lets the enterprise identity provider, not the end user, decide which MCP servers a client may reach. The IdP signs that decision into an **ID-JAG**. * Obtaining the ID-JAG is an [RFC 8693](https://datatracker.ietf.org/doc/html/rfc8693) token exchange against *your IdP*, and the SDK does not make it. Presenting it to the MCP authorization server is the [RFC 7523](https://datatracker.ietf.org/doc/html/rfc7523) `jwt-bearer` grant, and the SDK does both sides of that. * `IdentityAssertionOAuthProvider` is another `httpx.Auth`: a pre-registered confidential client, a pinned `issuer`, and one `assertion_provider(audience, resource)` callback. No browser, no registration, no refresh token. * The authorization server is never discovered from the resource server. Configure `issuer` to exactly the string its metadata document serves; the comparison is character for character. * Server side, `identity_assertion_enabled=True` plus `exchange_identity_assertion`. The SDK authenticates the client and gates the grant; validating the ID-JAG is entirely yours, and the issued token is bound to the ID-JAG's `resource`, not the request's. The one party this page never touched is the MCP server. What it does with the token you just minted, it was already doing in **[Authorization](https://py.sdk.modelcontextprotocol.io/v2/advanced/authorization/index.md)**. 
# Session groups Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/session-groups/ A `Client` connects to one server. Real applications often want several (a search server, a database server, an internal API) and end up juggling a connection and a tool list for each. **`ClientSessionGroup`** is one object that holds many connections and merges everything they expose into a single view. ## Two servers Start with two ordinary servers. They have nothing to do with each other, so both naturally called their tool `search`: ```python title="library_server.py" hl_lines="7" # docs_src/session_groups/tutorial001.py from mcp.server import MCPServer mcp = MCPServer("Library") @mcp.tool() def search(query: str) -> str: """Search the library catalog.""" return f"3 books match {query!r}." @mcp.resource("library://hours") def hours() -> str: """When the library is open.""" return "Mon-Fri 09:00-17:00" ``` ```python title="web_server.py" hl_lines="7" # docs_src/session_groups/tutorial002.py from mcp.server import MCPServer mcp = MCPServer("Web") @mcp.tool() def search(query: str) -> str: """Search the web.""" return f"12 pages match {query!r}." 
``` ## One group Create a `ClientSessionGroup` and call **`connect_to_server`** once per server: ```python title="client.py" hl_lines="10-12" # docs_src/session_groups/tutorial003.py import asyncio from mcp import ClientSessionGroup, StdioServerParameters async def main() -> None: library = StdioServerParameters(command="uv", args=["run", "mcp", "run", "library_server.py"]) web = StdioServerParameters(command="uv", args=["run", "mcp", "run", "web_server.py"]) async with ClientSessionGroup() as group: await group.connect_to_server(library) await group.connect_to_server(web) result = await group.call_tool("search", {"query": "model context protocol"}) print(result.structured_content) if __name__ == "__main__": asyncio.run(main()) ``` * `connect_to_server` takes transport parameters, not a server object: `StdioServerParameters` (from `mcp`) to launch a subprocess, or `StreamableHttpParameters` / `SseServerParameters` (from `mcp.client.session_group`) for a server already listening on a URL. * `group.tools` is a `dict[str, Tool]` of every connected server's tools. `group.resources` and `group.prompts` are the same shape. * `group.call_tool(name, arguments)` looks the name up, finds the session that owns it, and forwards the call. You never say which server. !!! check Put `client.py` next to the two servers and run it. The second `connect_to_server` refuses: ```text mcp.shared.exceptions.MCPError: {'search'} already exist in group tools. ``` That is an `MCPError`, raised before anything from the second server is registered. A name must be unique across the **whole** group, and two servers you don't control will collide eventually. ## `component_name_hook` You fix this at the group, not at the servers. 
Pass a function of `(name, server_info)` and the group runs it on every name it registers: ```python title="client.py" hl_lines="8-9 16" # docs_src/session_groups/tutorial004.py import asyncio from mcp_types import Implementation from mcp import ClientSessionGroup, StdioServerParameters def by_server(name: str, server_info: Implementation) -> str: return f"{server_info.name}.{name}" async def main() -> None: library = StdioServerParameters(command="uv", args=["run", "mcp", "run", "library_server.py"]) web = StdioServerParameters(command="uv", args=["run", "mcp", "run", "web_server.py"]) async with ClientSessionGroup(component_name_hook=by_server) as group: await group.connect_to_server(library) await group.connect_to_server(web) print(sorted(group.tools)) result = await group.call_tool("Web.search", {"query": "model context protocol"}) print(result.structured_content) if __name__ == "__main__": asyncio.run(main()) ``` Run it again. `print(sorted(group.tools))` now shows both: ```text ['Library.search', 'Web.search'] ``` * The **key** is yours. `by_server` built it from `server_info.name`, the name each `MCPServer(...)` was constructed with. * The `Tool` inside is untouched: `group.tools["Web.search"].name` is still `"search"`, and that is the name `call_tool` puts on the wire. The prefix never leaves your process. * It is not only tools. The library's `hours` resource is registered as `Library.hours`. !!! tip The hook runs on **every** name from **every** server, not only on conflicts: there is no prefix-on-collision mode. Pick one scheme and let it apply everywhere. ## Adding and removing servers `connect_to_server` returns the `ClientSession` it opened. Keep it if you ever want that server gone: `await group.disconnect_from_server(session)` removes its tools, resources, and prompts from the group. 
If you already hold a connected `ClientSession` (`Client.session` is one), hand it to `await group.connect_with_session(server_info, session)` instead of opening a new transport. It aggregates the same way. The group never closes a session it didn't open.

## The classic handshake

`ClientSessionGroup` is built on `ClientSession`, not on `Client`. Each `connect_to_server` runs the classic `initialize` handshake. It never sends the `server/discover` probe described in **[Protocol versions](https://py.sdk.modelcontextprotocol.io/v2/client/protocol-versions/index.md)**. Every MCP server understands that handshake, so this costs you compatibility with nothing; it only means a group takes the older, slower path to a server that could do better.

## Recap

* `ClientSessionGroup` holds many server connections and merges their tools, resources, and prompts into one `dict` each.
* `connect_to_server(params)` per server. It takes transport parameters, never the server object or URL a `Client` takes.
* `group.call_tool(name, arguments)` routes to the owning server for you.
* Names must be unique across the whole group; two servers with a `search` tool cannot coexist on their own.
* `component_name_hook=` rewrites every registered name. The dict key changes, the wire name does not.
* `connect_with_session` adds a session you already hold; `disconnect_from_server` removes one.

The handshake a group speaks (and the faster one a `Client` prefers) is the subject of **[Protocol versions](https://py.sdk.modelcontextprotocol.io/v2/client/protocol-versions/index.md)**.

# Deprecated features

Source: https://py.sdk.modelcontextprotocol.io/v2/advanced/deprecated/

The 2026-07-28 spec retires five things. The SDK still implements every one of them, and every one of them now carries a **deprecation warning**. The table below names each deprecated feature, why it is going away, and the replacement to build on.
## What is deprecated

| Deprecated | Why | What you do instead |
|---|---|---|
| **Roots**: `ctx.session.list_roots()`, `client.send_roots_list_changed()`, the `list_roots_callback=` you pass to `Client(...)` | [SEP-2577](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2577) retires the capability. | Take the paths as ordinary tool arguments or resource URIs, or embed a `ListRootsRequest` in an `InputRequiredResult` (see **[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**). |
| **Server-initiated sampling**: `ctx.session.create_message()`, the `sampling_callback=` you pass to `Client(...)` | SEP-2577 retires the capability. | Return `InputRequiredResult` and let the client retry the call (see **[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**). |
| **Protocol logging**: `ctx.log()`, `ctx.debug()`, `ctx.info()`, `ctx.warning()`, `ctx.error()`, `ctx.session.send_log_message()`, `client.set_logging_level()` | SEP-2577 retires the capability. Nothing in-protocol replaces it. | Ordinary `import logging` to stderr (see **[Logging](https://py.sdk.modelcontextprotocol.io/v2/tutorial/logging/index.md)**). |
| **`ping`**: `client.send_ping()` | **Removed** from the protocol, not merely deprecated. There is no `ping` method in 2026-07-28. | Nothing. It only works against a `mode="legacy"` connection. |
| **Client->server progress**: `client.send_progress_notification()` | 2026-07-28 makes progress server->client only. | Nothing to send. Your *server* reports progress with `ctx.report_progress()` (see **[Progress](https://py.sdk.modelcontextprotocol.io/v2/tutorial/progress/index.md)**). |

Three things fall out of that table:

* Roots, sampling, and logging go together. One proposal, **SEP-2577**, deprecates all three capabilities at once.
* Sampling and roots share a deeper problem: they are places a **server** sends a **request** to the **client**. That whole direction is what 2026-07-28 replaces with **[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)**. It is the standalone RPC methods (`sampling/createMessage`, `roots/list`, and push-style `elicitation/create`) that are gone; the `CreateMessageRequest` / `ListRootsRequest` / `ElicitRequest` payload types survive, embedded in `InputRequiredResult.input_requests`, and on the client they hit the same callbacks.
* `ping` is the odd one out. The protocol does not deprecate it, it removes it. The SDK method still warns (its message says *removed*, not *deprecated*) and calling it on a modern connection answers with *"Method not found"*.

## Deprecated is advisory

Nothing breaks today. Every method above keeps working against any session that negotiated **2025-11-25 or earlier**. Pin `mode="legacy"` on the client and you get exactly the pre-2026 behaviour. There are no wire changes and capability negotiation is unchanged.

What changes is that you get a visible warning the first time each one runs:

```text
MCPDeprecationWarning: The logging capability is deprecated as of 2026-07-28 (SEP-2577).
```

`MCPDeprecationWarning` subclasses `UserWarning`, **not** `DeprecationWarning`. That is deliberate: Python's default filter only shows `DeprecationWarning` in code run directly as `__main__`, which is how libraries deprecate things and nobody notices for two years. This one shows up everywhere, with no `-W` flag.

!!! warning
    "Advisory" stops at the wire. Sampling and roots are server-to-client *requests*, and a 2026-07-28 session has no channel to carry one. Call `ctx.session.create_message()` inside a tool on a modern connection and the warning still fires, and then the send fails with an error:

    ```text
    Cannot send 'sampling/createMessage': this transport context has no back-channel for server-initiated requests.
    ```

    Two signals, in that order. The `MCPDeprecationWarning` fires the moment you call the method, on any connection. The error is what comes back when the SDK then tries to send. These two only work end-to-end on a `mode="legacy"` connection whose client registered the matching callback.

## Silencing the warning

Don't, in new code. But a server you maintain that genuinely serves pre-2026 clients has every right to a quiet log. Filter the category before the first deprecated call runs:

```python
import warnings

from mcp import MCPDeprecationWarning

warnings.filterwarnings("ignore", category=MCPDeprecationWarning)
```

That is the whole API. There is no per-method switch, and you don't want one: the point of one category is that one line silences it and one line brings it back.

!!! check
    Run the filter the other way and you get a free regression test. Add `"error::mcp.MCPDeprecationWarning"` to the `filterwarnings` setting in your pytest configuration and the deprecated call **raises** instead of warning. A tool named `old_log` that still calls `ctx.info()` stops passing and starts reporting:

    ```text
    Error executing tool old_log: The logging capability is deprecated as of 2026-07-28 (SEP-2577).
    ```

    One line of pytest configuration, and a deprecated call can never sneak back into your codebase without failing a test.

## Recap

* The 2026-07-28 spec deprecates **roots**, server-initiated **sampling**, and protocol **logging** (all [SEP-2577](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2577)), restricts **progress** to server-to-client, and removes **`ping`**.
* The replacement column points you onward: **[Multi-round-trip requests](https://py.sdk.modelcontextprotocol.io/v2/advanced/multi-round-trip/index.md)** for sampling and roots, **[Logging](https://py.sdk.modelcontextprotocol.io/v2/tutorial/logging/index.md)** for logging, **[Progress](https://py.sdk.modelcontextprotocol.io/v2/tutorial/progress/index.md)** for progress. `ping` needs nothing at all.
* Deprecated is advisory: no wire changes, everything keeps working against pre-2026 sessions, and you get a visible `MCPDeprecationWarning` (a `UserWarning`, so it is on by default).
* Sampling and roots additionally need a back-channel that a 2026-07-28 session does not have. On a modern connection they warn and then they raise.
* `warnings.filterwarnings("ignore", category=MCPDeprecationWarning)` silences the whole category; `"error::mcp.MCPDeprecationWarning"` in pytest turns it into a test failure.
* New code should not be built on any of these. Every other page in these docs teaches the current API.
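
If you want to see the filter mechanics from the last two recap points without installing the SDK, the sketch below may help. It uses only the standard-library `warnings` module. `DemoDeprecationWarning` and `old_log()` are hypothetical stand-ins invented for this example: the first mimics how `MCPDeprecationWarning` subclasses `UserWarning`, the second mimics a deprecated call. Neither is part of the SDK.

```python
import warnings


class DemoDeprecationWarning(UserWarning):
    """Hypothetical stand-in for mcp.MCPDeprecationWarning (also a UserWarning)."""


def old_log() -> str:
    """Hypothetical deprecated call: it warns first, then does its work."""
    warnings.warn(
        "The logging capability is deprecated as of 2026-07-28 (SEP-2577).",
        DemoDeprecationWarning,
        stacklevel=2,
    )
    return "logged"


def count_warnings(action: str) -> int:
    """Run old_log() under one filter action and count what gets through."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # start from a known filter state
        warnings.filterwarnings(action, category=DemoDeprecationWarning)
        old_log()
    return len(caught)


def raises_under_error_filter() -> bool:
    """With the "error" action, the same call raises instead of warning."""
    with warnings.catch_warnings():
        warnings.filterwarnings("error", category=DemoDeprecationWarning)
        try:
            old_log()
        except DemoDeprecationWarning:
            return True
    return False


shown = count_warnings("always")      # the warning is emitted
silenced = count_warnings("ignore")   # one filter line silences the category
raised = raises_under_error_filter()  # the same category, turned into an error
print(shown, silenced, raised)
```

Because the stand-in class is a `UserWarning` and not a `DeprecationWarning`, it is governed by the same filter rules this page describes for the real category. Swapping in `from mcp import MCPDeprecationWarning` should behave the same way, assuming the SDK is installed.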