Media
Text is not the only thing a tool can return.
The SDK ships two helpers for binary results (Image and Audio) and an Icon type for giving your server, tools, resources, and prompts a face in the client's UI.
Returning an image
Annotate the return type as Image and return one:
import base64
from mcp.server import MCPServer
from mcp.server.mcpserver import Image
mcp = MCPServer("Brand kit")
LOGO_PNG = base64.b64decode(
"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGOQ9bsBAAHPAURf8l/aAAAAAElFTkSuQmCC"
)
@mcp.tool()
def logo() -> Image:
    """The brand logo as a PNG."""
    return Image(data=LOGO_PNG, format="png")
- Image takes exactly one of data (raw bytes) or path (a file to read).
- format="png" becomes the MIME type the client sees: image/png.
- The bytes here are a one-pixel placeholder so the file runs on its own. In a real server they come from Pillow, matplotlib, a headless browser, or anything else that hands you bytes.
Image is an SDK convenience, not a protocol type. On the wire your return value becomes an ImageContent block (your bytes base64-encoded, plus the MIME type):
result.content # [ImageContent(type="image", data="iVBORw0KGgoAAAANSUhEUg...", mime_type="image/png")]
result.structured_content # None
Two things to notice:
- data is base64. You returned raw bytes; the SDK did the encoding.
- structured_content is None. An Image is content for the model to look at, not data for the application to parse: there is no output schema. (Contrast Structured Output, where the return annotation is the schema.)
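The encoding step can be reproduced with the standard library alone. The sketch below is not the SDK's code: the dict only mimics the fields of the ImageContent block shown above, and to_wire is a hypothetical name chosen for illustration.

```python
import base64

# The 8-byte signature that opens every PNG file, standing in for real image bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def to_wire(data: bytes, mime_type: str) -> dict:
    """Hypothetical helper: mimic the fields of an image content block."""
    return {
        "type": "image",
        # Base64 turns arbitrary bytes into ASCII text that is safe inside JSON.
        "data": base64.b64encode(data).decode("ascii"),
        "mime_type": mime_type,
    }


block = to_wire(PNG_SIGNATURE, "image/png")
print(block["data"])  # iVBORw0KGgo=

# Decoding on the receiving side recovers the original bytes exactly.
assert base64.b64decode(block["data"]) == PNG_SIGNATURE
```

The printed value shares its iVBORw0KGgo prefix with the data field above, because that prefix is simply the PNG signature in base64.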
Info
ImageContent and AudioContent live in mcp_types, right next to the TextContent
you met in Tools. A tool result is a list of content blocks; Image and Audio are
the shortest way to produce the two binary kinds.
Try it
uv run mcp dev server.py
Open the Tools tab and call logo. The result is not a string: it is an image content block, and the Inspector renders it as a picture. You returned bytes; everything between that and the pixels on screen was the SDK.
Returning audio
Audio is the same shape:
import base64
from mcp.server import MCPServer
from mcp.server.mcpserver import Audio, Image
mcp = MCPServer("Brand kit")
LOGO_PNG = base64.b64decode(
"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGOQ9bsBAAHPAURf8l/aAAAAAElFTkSuQmCC"
)
CHIME_WAV = base64.b64decode("UklGRjQAAABXQVZFZm10IBAAAAABAAEAQB8AAIA+AAACABAAZGF0YRAAAAAAAAAAAAAAAAAAAAAAAAAA")
@mcp.tool()
def logo() -> Image:
    """The brand logo as a PNG."""
    return Image(data=LOGO_PNG, format="png")


@mcp.tool()
def chime() -> Audio:
    """The notification chime as a WAV."""
    return Audio(data=CHIME_WAV, format="wav")
The result is an AudioContent block:
result.content # [AudioContent(type="audio", data="UklGRjQAAABXQVZFZm1...", mime_type="audio/wav")]
result.structured_content # None
Same deal: raw bytes in, base64 and a MIME type out, no output schema.
Bytes or a file
Both helpers also accept path= instead of data=. The file is read when the result is built, and the MIME type is guessed from the suffix:
- Image: .png, .jpg, .jpeg, .gif, .webp.
- Audio: .wav, .mp3, .ogg, .flac, .aac, .m4a.
A suffix it doesn't recognise falls back to application/octet-stream.
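Suffix-based guessing of this kind amounts to a table lookup with a fallback. The sketch below is an assumption-laden illustration, not the SDK's implementation: guess_mime is a hypothetical name, the MIME strings are the standard registrations for each suffix (the SDK's own values may differ), the two suffix lists are merged into one table for brevity, and lower-casing the suffix is a choice made here.

```python
from pathlib import Path

# Assumed suffix-to-MIME table built from standard registrations;
# the SDK's own table may differ in detail.
SUFFIX_TO_MIME = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".ogg": "audio/ogg",
    ".flac": "audio/flac",
    ".aac": "audio/aac",
    ".m4a": "audio/mp4",
}


def guess_mime(path: str) -> str:
    """Hypothetical helper: look up the MIME type for a file suffix."""
    suffix = Path(path).suffix.lower()
    # Anything not in the table gets the generic binary type.
    return SUFFIX_TO_MIME.get(suffix, "application/octet-stream")


print(guess_mime("logo.PNG"))   # image/png
print(guess_mime("chime.mp3"))  # audio/mpeg
print(guess_mime("notes.xyz"))  # application/octet-stream
```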
Check
With data= there is no filename, so there is nothing to guess from. Forget format= and
the SDK falls back to a default: image/png for images, audio/wav for audio. Build an
Audio from MP3 bytes that way and the client is told mime_type="audio/wav", then
faithfully fails to decode it. When you pass data=, pass format=.
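When the bytes come from somewhere that doesn't say what they are, one way to pick format= is to look at the leading magic bytes. This is a minimal sketch under stated assumptions: sniff_format is a hypothetical helper, not part of the SDK; it covers only a few formats; and it recognises MP3 only when the file starts with an ID3v2 tag.

```python
from typing import Optional


def sniff_format(data: bytes) -> Optional[str]:
    """Hypothetical helper: guess a format= value from leading magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    # WAV is a RIFF container whose form type, at bytes 8..11, is "WAVE".
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # Only tagged MP3 files are detected here; untagged ones return None.
    if data.startswith(b"ID3"):
        return "mp3"
    return None


print(sniff_format(b"RIFF\x34\x00\x00\x00WAVEfmt "))  # wav
print(sniff_format(b"ID3\x04\x00"))                   # mp3
print(sniff_format(b"plain text"))                    # None
```

The returned string is meant to be passed as format= alongside data=. A None result means the bytes need a closer look before they are returned.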
Icons
An Icon is metadata, not content. It doesn't carry the image; it points at one with a URI, and a client may fetch it and show it next to your server's name, a tool, a resource, or a prompt.
from mcp_types import Icon
from mcp.server import MCPServer
LOGO = Icon(src="https://example.com/brand-kit.png", mime_type="image/png", sizes=["48x48"])
PALETTE = Icon(src="https://example.com/palette.svg", mime_type="image/svg+xml", sizes=["any"])
mcp = MCPServer("Brand kit", icons=[LOGO])
@mcp.tool(icons=[PALETTE])
def palette() -> list[str]:
    """The brand colour palette as hex codes."""
    return ["#1d4ed8", "#f59e0b", "#10b981"]


@mcp.resource("brand://guidelines", icons=[LOGO])
def guidelines() -> str:
    """How to use the brand assets."""
    return "Use the primary colour for calls to action."
- src is a URI the client can resolve: https:, or a data: URI if you want the icon embedded with no extra fetch.
- mime_type and sizes ("48x48", or "any" for a scalable format) let the client pick the right one when you offer several.
- theme="light" or theme="dark" marks an icon for one colour scheme.
The same icons=[...] keyword is accepted by MCPServer(...), @mcp.tool(), @mcp.resource(), and @mcp.prompt().
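For the embedded case, a data: URI can be assembled from bytes with the standard library. The sketch below assumes the base64 form of the data: scheme; to_data_uri and the tiny SVG are hypothetical, used only for illustration.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Hypothetical helper: embed bytes in a base64 data: URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# A minimal, empty SVG standing in for a real icon.
svg = b'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1 1"/>'
uri = to_data_uri(svg, "image/svg+xml")
print(uri[:26])  # data:image/svg+xml;base64,
```

The resulting string can be used as the src of an Icon in place of an https: URL, at the cost of a larger payload every time the icon is listed.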
Where a client sees them
Icons travel with whatever they decorate. The server's arrive during the handshake, on client.server_info:
client.server_info.icons # [Icon(src="https://example.com/brand-kit.png", mime_type="image/png", sizes=["48x48"])]
A tool's icons are on the Tool object from tools/list, a resource's on the Resource from resources/list, a prompt's on the Prompt from prompts/list. The field is always called icons.
Recap
- Return an Image or Audio from a tool and the client receives an ImageContent / AudioContent block: your bytes base64-encoded, with a MIME type.
- Build one from in-memory data= plus an explicit format=, or from a path= and let the suffix decide.
- Media results carry no structured_content and no output schema.
- An Icon is a pointer: a src URI plus optional mime_type, sizes, and theme.
- icons=[...] works on the server, on tools, on resources, and on prompts, and clients find them on the matching objects.
That is everything a tool can put into a result. Helping the user fill in a prompt's or a resource template's arguments before anything runs is Completions.