Multi-round-trip requests
Sometimes a tool can't finish in one round trip. It needs something only the user has: a choice, a confirmation, a credential.
Before 2026-07-28 the server got it by calling back: opening its own request to the client (an elicitation, a sampling call) in the middle of handling the original one. The 2026-07-28 spec retires that back-channel.
Instead, the server returns.
Return, don't call back
The server answers tools/call with an InputRequiredResult instead of a CallToolResult. Two of its fields do the work:
- input_requests: what the server still needs, as a dict keyed by names the server chose. Each value is an ElicitRequest, a CreateMessageRequest, or a ListRootsRequest.
- request_state: an opaque token. The client echoes it back verbatim on the retry. Your server is the only thing that reads it.
The client fulfils each request, then calls the same tool again, carrying its answers in input_responses and the token in request_state. The server now has what it was missing and returns a normal CallToolResult.
That's the whole protocol. Every leg is an ordinary request from the client to the server. Nothing ever flows the other way.
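The shape of that exchange can be sketched without the SDK at all. The following is a toy simulation using plain dicts, not the real wire format or the real types: `handle_call`, `call_with_retries`, and the `"kind"` field are hypothetical names invented for this illustration.

```python
from typing import Any, Callable, Optional


def handle_call(
    arguments: dict[str, Any],
    input_responses: Optional[dict[str, Any]] = None,
    request_state: Optional[str] = None,
) -> dict[str, Any]:
    """Toy server: returns a request for input until the answer is present."""
    answer = (input_responses or {}).get("region")
    if answer is None:
        # Stand-in for InputRequiredResult: say what is missing and return.
        return {
            "kind": "input_required",
            "input_requests": {"region": {"message": "Which region?"}},
            "request_state": "provision-v1",
        }
    # Stand-in for CallToolResult.
    return {"kind": "result", "text": f"Provisioned {arguments['name']!r} in {answer}."}


def call_with_retries(
    arguments: dict[str, Any],
    answer_for: Callable[[str, dict[str, Any]], Any],
    max_rounds: int = 10,
) -> dict[str, Any]:
    """Toy client: fulfil every request, then repeat the same call."""
    reply = handle_call(arguments)
    rounds = 0
    while reply["kind"] == "input_required":
        if rounds >= max_rounds:
            raise RuntimeError("server kept asking for input")
        rounds += 1
        # Answers go back under the same keys the server chose.
        responses = {key: answer_for(key, req) for key, req in reply["input_requests"].items()}
        # Same arguments every leg; the token is echoed back unchanged.
        reply = handle_call(arguments, responses, reply["request_state"])
    return reply


final = call_with_retries({"name": "orders"}, lambda key, request: "eu-west-1")
print(final["text"])
```

Every call in the sketch goes from client to server; the server only ever returns.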
The server side
The high-level @mcp.tool() decorator has no sugar for this yet. Today you write it on the low-level Server, whose on_call_tool handler is allowed to return either result type:
from mcp_types import (
    CallToolRequestParams,
    CallToolResult,
    ElicitRequest,
    ElicitRequestFormParams,
    ElicitResult,
    InputRequiredResult,
    ListToolsResult,
    PaginatedRequestParams,
    TextContent,
    Tool,
)

from mcp.server import Server, ServerRequestContext

ASK_REGION = ElicitRequest(
    params=ElicitRequestFormParams(
        message="Which region should the database live in?",
        requested_schema={
            "type": "object",
            "properties": {"region": {"type": "string"}},
            "required": ["region"],
        },
    )
)


async def list_tools(ctx: ServerRequestContext, params: PaginatedRequestParams | None) -> ListToolsResult:
    return ListToolsResult(
        tools=[
            Tool(
                name="provision",
                description="Provision a database. Asks which region to put it in.",
                input_schema={
                    "type": "object",
                    "properties": {"name": {"type": "string"}},
                    "required": ["name"],
                },
            )
        ]
    )


async def call_tool(ctx: ServerRequestContext, params: CallToolRequestParams) -> CallToolResult | InputRequiredResult:
    answer = (params.input_responses or {}).get("region")
    if not isinstance(answer, ElicitResult) or answer.content is None:
        return InputRequiredResult(input_requests={"region": ASK_REGION}, request_state="provision-v1")
    name = (params.arguments or {})["name"]
    text = f"Provisioned {name!r} in {answer.content['region']}."
    return CallToolResult(content=[TextContent(type="text", text=text)])


server = Server("Provisioner", on_list_tools=list_tools, on_call_tool=call_tool)
- on_call_tool is typed -> CallToolResult | InputRequiredResult. Returning the second one is the entire server-side API.
- On the first call params.input_responses is None, so the guard fires and the handler asks instead of answering.
- On the retry, the ElicitResult the client sent is sitting under the same key ("region") that the server used in input_requests.
Everything else in that file (the explicit input_schema, the hand-built CallToolResult) is the ordinary low-level Server, covered in The low-level Server. This page only adds the second return type.
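The example above uses a constant string for request_state, which is enough when the server has nothing to remember between legs. When it does, one possible approach is to pack the state into the token and sign it, so that the server can detect a token it did not issue. This is only a sketch under assumptions: `pack_state`, `unpack_state`, and `SECRET` are hypothetical names, nothing here is part of the SDK, and the source does not say the protocol requires signing.

```python
import base64
import hashlib
import hmac
import json
from typing import Any

# Hypothetical server-side secret; a real deployment would load this from configuration.
SECRET = b"replace-with-a-real-server-secret"


def pack_state(state: dict[str, Any]) -> str:
    """Serialize state and append an HMAC so tampering is detectable."""
    body = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    signature = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{signature}"


def unpack_state(token: str) -> dict[str, Any]:
    """Verify the signature, then decode the state the server stored earlier."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        raise ValueError("request_state was modified or was not issued by this server")
    return json.loads(base64.urlsafe_b64decode(body))


token = pack_state({"tool": "provision", "step": 1})
print(unpack_state(token))
```

The client still treats the token as opaque and echoes it back unchanged; only the server packs and unpacks it.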
The client side
Client runs the loop for you.
Register the callbacks the server might ask for (elicitation_callback, sampling_callback, list_roots_callback) and call the tool. When an InputRequiredResult arrives, Client dispatches each entry in input_requests to the matching callback, retries with the answers and the echoed request_state, and keeps going until a CallToolResult comes back:
from mcp_types import ElicitRequestParams, ElicitResult

from mcp import Client
from mcp.client import ClientRequestContext


async def handle_elicitation(context: ClientRequestContext, params: ElicitRequestParams) -> ElicitResult:
    return ElicitResult(action="accept", content={"region": "eu-west-1"})


async def main() -> None:
    async with Client("http://127.0.0.1:8000/mcp", elicitation_callback=handle_elicitation) as client:
        result = await client.call_tool("provision", {"name": "orders"})
        print(result.content)
- That elicitation_callback is the same one a pre-2026 server's back-channel elicitation/create would have hit. The same is true of sampling_callback for sampling/createMessage and list_roots_callback for roots/list: at 2026-07-28 the standalone server->client RPCs are gone, but the identical ElicitRequest / CreateMessageRequest / ListRootsRequest payloads ride inside input_requests and dispatch to the same three callbacks. One set of callbacks serves both eras.
- call_tool returns a plain CallToolResult. The intermediate rounds are invisible to the caller.
- get_prompt and read_resource drive the same loop.
Check
Leave the callback off and the loop fails on the first round: the SDK's stand-in callback
answers every elicitation with an error, and call_tool raises MCPError with the message
"Elicitation not supported".
The loop is bounded. Client(..., input_required_max_rounds=10) is the default cap; a server that keeps returning InputRequiredResult past it makes call_tool raise. If a round carries only request_state and no input_requests, Client sleeps briefly (50ms doubling to a 250ms ceiling) before retrying, so a server that is just saying "not done yet" isn't busy-polled.
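The delay schedule described above (50ms, doubling, capped at 250ms) can be illustrated with a small generator. This is a sketch of the schedule only, not the SDK's implementation; `backoff_delays` is a hypothetical name.

```python
from collections.abc import Iterator


def backoff_delays(initial: float = 0.05, ceiling: float = 0.25) -> Iterator[float]:
    """Yield sleep durations in seconds: start at `initial`, double, cap at `ceiling`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, ceiling)


delays = backoff_delays()
print([next(delays) for _ in range(5)])  # [0.05, 0.1, 0.2, 0.25, 0.25]
```

A client that owns its own loop could draw one value from such a generator before each retry of a round that carried no input_requests.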
Driving the loop yourself
The auto-loop is enough for a single-process client. Own the loop instead when:
- Your client is distributed: the process that renders the question to the user is not the process that called call_tool, so a different worker issues the retry. request_state is the persistable token you carry across that boundary, through your own storage, and input_responses is what the other side sends back with it.
- You want to inspect each round: log or audit every input_requests entry, refuse certain request kinds, or apply your own backoff between legs.
- You want a wall-clock bound rather than a round-count bound: wrap your own loop in anyio.fail_after(...) instead of relying on input_required_max_rounds.
Drop to the underlying session, where allow_input_required=True hands you the union directly:
from mcp_types import CallToolResult, ElicitRequest, ElicitResult, InputRequest, InputRequiredResult, InputResponse

from mcp import Client


def fulfil(request: InputRequest) -> InputResponse:
    if not isinstance(request, ElicitRequest):
        raise NotImplementedError(f"this client cannot answer a {request.method!r} request")
    return ElicitResult(action="accept", content={"region": "eu-west-1"})


async def provision(client: Client, name: str) -> CallToolResult:
    result = await client.session.call_tool("provision", {"name": name}, allow_input_required=True)
    while isinstance(result, InputRequiredResult):
        responses = {key: fulfil(request) for key, request in (result.input_requests or {}).items()}
        result = await client.session.call_tool(
            "provision",
            {"name": name},
            input_responses=responses,
            request_state=result.request_state,
            allow_input_required=True,
        )
    return result
- client.session.call_tool(..., allow_input_required=True) widens the return type to CallToolResult | InputRequiredResult. The isinstance is what narrows it back.
- request_state is now in your hands. Write it down between legs and the conversation can resume from a fresh process.
- For every entry in input_requests you put an InputResponse under the same key in input_responses. fulfil is where your UI goes; this one hard-codes the answer.
- Same tool name, same arguments, every leg. The retry is the original call carried out again, not a new method.
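"Write it down between legs" could look something like the following, assuming a single JSON file is acceptable storage. The helpers `save_pending` and `take_pending`, the `call_id` key, and the file layout are all hypothetical; a distributed client would more likely use a database or queue, and this sketch makes no attempt at concurrency safety.

```python
import json
from pathlib import Path
from typing import Any, Optional


def save_pending(
    store: Path,
    call_id: str,
    tool: str,
    arguments: dict[str, Any],
    request_state: Optional[str],
) -> None:
    """Record an unfinished call so that another process can issue the retry."""
    pending = json.loads(store.read_text()) if store.exists() else {}
    # The retry needs the same tool name, the same arguments, and the echoed token.
    pending[call_id] = {"tool": tool, "arguments": arguments, "request_state": request_state}
    store.write_text(json.dumps(pending))


def take_pending(store: Path, call_id: str) -> dict[str, Any]:
    """Remove and return a saved call; raises KeyError if call_id is unknown."""
    pending = json.loads(store.read_text())
    entry = pending.pop(call_id)
    store.write_text(json.dumps(pending))
    return entry
```

The worker that collects the user's answer would call `take_pending`, then pass the saved tool name, arguments, and request_state to the retry along with its input_responses.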
A 2026-07-28 result
InputRequiredResult only exists at protocol version 2026-07-28. The in-memory Client(server) negotiates it for you; over the wire, mode="auto" discovers it. After connecting, client.protocol_version tells you what you got.
Warning
A pre-2026 session has nowhere to put an InputRequiredResult. Return one from your handler on a
mode="legacy" connection and the runner cannot serialize it into the negotiated version; the
client gets back a -32603 "Handler returned an invalid result" error. A server that serves
both eras must check ctx.protocol_version before reaching for it.
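A version gate for that check might be sketched as below. It assumes that ctx.protocol_version is a date-formatted string such as "2026-07-28" and that later revisions keep InputRequiredResult; neither assumption is confirmed by this page, and `supports_input_required` is a hypothetical helper, not an SDK function.

```python
from datetime import date

# First protocol revision that defines InputRequiredResult, per this page.
INPUT_REQUIRED_SINCE = date(2026, 7, 28)


def supports_input_required(protocol_version: str) -> bool:
    """Return True if the negotiated version is 2026-07-28 or later (assumed rule)."""
    try:
        negotiated = date.fromisoformat(protocol_version)
    except ValueError:
        # Not a date-formatted version string: take the conservative path.
        return False
    return negotiated >= INPUT_REQUIRED_SINCE


print(supports_input_required("2026-07-28"))  # True
print(supports_input_required("2025-06-18"))  # False
```

A handler serving both eras could branch on this result and fall back to its pre-2026 behaviour when it is False.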
Info
URL-mode elicitation rides this exact mechanism on a 2026 connection. The entry in
input_requests is an ElicitRequest whose params are ElicitRequestURLParams; the user
finishes the out-of-band flow and your client retries the call. Same loop, no new API. The
high-level server half is in Elicitation.
Recap
- At 2026-07-28 a server that needs input mid-call returns an InputRequiredResult. It never opens a request to the client.
- input_requests is what it needs. request_state is an opaque resume token only the server reads.
- Client runs the retry loop for you: register elicitation_callback / sampling_callback / list_roots_callback and call_tool returns a plain CallToolResult. input_required_max_rounds (default 10) bounds it.
- To inspect or persist rounds, use client.session.call_tool(..., allow_input_required=True) and own the while isinstance(result, InputRequiredResult) loop yourself.
- The server side is the low-level Server only; @mcp.tool() has no sugar for this yet.
This is the mechanism that replaces server-initiated sampling and the rest of the push-style back-channel; see Deprecated features.