64 changes: 64 additions & 0 deletions docs/advanced/caching.md
@@ -0,0 +1,64 @@
# Caching hints

Every result a server returns for `tools/list`, `prompts/list`, `resources/list`, `resources/templates/list`, `resources/read` and `server/discover` carries two fields on the 2026-07-28 protocol: `ttlMs`, how many milliseconds a client may treat the result as fresh, and `cacheScope`, whether a cached result may be shared across users (`"public"`) or belongs to one authorization context (`"private"`).

The server doesn't cache anything. The fields are a *declaration*: "this tool list is the same for everyone and won't change for a minute." A client (or a gateway in front of you) may then skip the round trip. Honoring the hints is the client's choice; emitting them is the server's job, and the SDK does it for you.

Out of the box every result says `ttlMs: 0, cacheScope: "private"` — immediately stale, never shared. That is always safe and always conformant. If your lists really are stable and identical for all callers, say so at construction:

```python title="server.py" hl_lines="5-8"
--8<-- "docs_src/caching/tutorial001.py"
```

* The map is keyed by **method name** — the six cacheable methods are the only legal keys. The parameter is typed `Mapping[CacheableMethod, CacheHint]`, so your editor autocompletes the keys and flags a typo before you run; anything that slips past the type checker raises at construction.
* A method you don't mention keeps the defaults. The map is a set of overrides, not a manifest.
* `CacheHint(ttl_ms=5_000)` left `scope` unset, so it stays `"private"`: five seconds of freshness, per caller. Scope and TTL are independent decisions.
* `"server/discover"` is a legal key too — the handshake result is cacheable like any list.

!!! warning
    `cacheScope: "public"` means *anyone* may be served your cached response: a shared
    gateway will happily hand one user's result to another, even when the request was
    authenticated. Mark a result `"public"` only when it is identical for every caller, and
    never use `cacheScope` as access control: it is a label, not a lock.

## Per-handler override

On the low-level `Server`, handlers build their results by hand — and `ttl_ms` / `cache_scope` are just fields on the result models. A handler that sets them explicitly always wins over the constructor map, field by field:

```python title="server.py" hl_lines="11 17"
--8<-- "docs_src/caching/tutorial002.py"
```

The handler said `ttl_ms=1_000` and nothing about scope. On the wire: `ttlMs: 1000` (the handler's, not the map's `60_000`) and `cacheScope: "public"` (the map's — the handler left it unset). Explicit beats configured, configured beats default — per field, so a handler can pin one field and leave the other to the server-wide policy.

This is also the escape hatch for dynamics the constructor can't know: a handler that filters `resources/read` per user can return `cache_scope="private"` for one URI from an otherwise-public server.

One caveat on paginated lists: the protocol requires the **same `cacheScope` on every page** of one list. The constructor map satisfies that by construction — it's keyed by method, not by page. But a handler that overrides the scope itself owns that consistency: override it on *every* page, never only when a cursor is present, or page one and page two will disagree.
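
One way to catch a scope mismatch in a test is to collect the scope each page reported and check that there is exactly one. The sketch below works on plain dicts shaped like wire results; `pages_share_scope` is a hypothetical helper for illustration, not part of the SDK.

```python
from __future__ import annotations

from collections.abc import Iterable, Mapping
from typing import Any


def pages_share_scope(pages: Iterable[Mapping[str, Any]]) -> bool:
    """Return True when every page of one list reports the same cacheScope."""
    scopes = {page["cacheScope"] for page in pages}
    return len(scopes) <= 1


# Scope overridden on every page: consistent.
consistent = [
    {"tools": ["a"], "nextCursor": "p2", "cacheScope": "public"},
    {"tools": ["b"], "cacheScope": "public"},
]

# Scope overridden only on the request that carried a cursor: page one kept
# the default while page two said "public", so the pages disagree.
inconsistent = [
    {"tools": ["a"], "nextCursor": "p2", "cacheScope": "private"},
    {"tools": ["b"], "cacheScope": "public"},
]
```

Running the check over both lists flags only the second one.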

## What the client sees

On the client, the hints arrive as plain fields on every cacheable result — `ttl_ms` and `cache_scope`, already parsed:

```python title="client.py" hl_lines="15"
--8<-- "docs_src/caching/tutorial003.py"
```

The SDK parses; it does not (yet) act. There is no built-in response cache: calling `list_tools()` twice makes two round trips, whatever the TTL said. The spec makes honoring optional — a client that ignores the hints entirely is fully conformant — so until the SDK grows a response cache, the supported path is to read the fields and do your own bookkeeping:

* **Freshness** is `now < t_received + ttl_ms / 1000`, with both clock readings in seconds: record the clock when the response arrives, and treat the result as reusable until the TTL runs out. A monotonic clock avoids surprises from wall-clock adjustments. `ttl_ms == 0` means *immediately stale*, so don't reuse it at all.
* **Scope is a sharing rule, not a suggestion.** A `"private"` result may be reused only within the same authorization context — same access token, same cache. Never put `"private"` results in a cache shared across users.
* **Notifications beat TTL.** If the server sends `list_changed` while your copy is still fresh, the copy is stale now — re-fetch.
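
The three rules above fit in a few lines of bookkeeping. The sketch below is one way to do it, not an SDK feature: `HintCache`, its method names, and the `auth_context` string are all hypothetical, and the clock is injectable so the freshness rule can be tested without sleeping.

```python
from __future__ import annotations

import time
from collections.abc import Callable
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class _Entry:
    result: Any
    received_at: float  # clock reading, in seconds, when the response arrived
    ttl_ms: int


class HintCache:
    """Hypothetical client-side cache that honors ttl_ms and cache_scope."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # Key: (owner, method). Owner is None for "public" results, which any
        # authorization context may reuse; otherwise it names the one context.
        self._entries: dict[tuple[str | None, str], _Entry] = {}

    def store(self, auth_context: str, method: str, result: Any, ttl_ms: int, cache_scope: str) -> None:
        if ttl_ms <= 0:
            return  # immediately stale: nothing worth keeping
        owner = None if cache_scope == "public" else auth_context
        self._entries[(owner, method)] = _Entry(result, self._clock(), ttl_ms)

    def lookup(self, auth_context: str, method: str) -> Any | None:
        # Try the caller's private copy first, then the shared public one.
        for owner in (auth_context, None):
            entry = self._entries.get((owner, method))
            if entry is not None and self._clock() < entry.received_at + entry.ttl_ms / 1000:
                return entry.result
        return None

    def invalidate(self, method: str) -> None:
        """Drop every copy of `method`, for example after a list_changed notification."""
        for key in [key for key in self._entries if key[1] == method]:
            del self._entries[key]
```

A caller would `store()` each result with the `ttl_ms` and `cache_scope` it carried, try `lookup()` before making a request, and call `invalidate()` from its notification handler.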

Against an **older server** (pre-2026 protocol), the fields are simply absent from the wire, and the models show their conservative defaults: `ttl_ms == 0`, `cache_scope == "private"` — stale and unshared, the right assumption for a server that declared nothing. If you need to distinguish "the server said 0" from "the server said nothing", check `"ttl_ms" in result.model_fields_set`: it's only set when the field actually arrived.

## Older clients

Clients on pre-2026 protocol versions never see either field — the SDK strips them at serialization for those connections. Configure your hints once; there is nothing version-specific to write.

## Recap

* Six methods carry `ttlMs`/`cacheScope`; the SDK defaults them to `0`/`"private"` — stale and unshared, always safe.
* `cache_hints={method: CacheHint(...)}` at construction (both `MCPServer` and `Server`) sets server-wide values per method.
* A handler that sets the fields on its result overrides the map, per field.
* `"public"` is a promise that the result is identical for every caller. It is not access control.
* Clients read the hints as `result.ttl_ms` / `result.cache_scope` and own the caching decision themselves — the SDK has no built-in response cache yet.
2 changes: 1 addition & 1 deletion docs/migration.md
@@ -1598,7 +1598,7 @@ The implementation is responsible for validating the assertion per RFC 7523 §3

### 2025-11-25 and 2026-07-28 protocol fields modeled

`mcp_types` models the 2025-11-25 and 2026-07-28 protocol fields (e.g. `resultType`, `ttlMs`/`cacheScope` on cacheable results, `inputResponses`/`requestState` on retried requests), so inbound payloads carrying these keys parse into typed fields and round-trip. `ttlMs`/`cacheScope` default to `0`/`"private"` (immediately stale, not shared-cacheable); `resultType` defaults to `"complete"` on concrete results (`None` on `EmptyResult`); the server strips all of them from the wire at pre-2026 versions.
`mcp_types` models the 2025-11-25 and 2026-07-28 protocol fields (e.g. `resultType`, `ttlMs`/`cacheScope` on cacheable results, `inputResponses`/`requestState` on retried requests), so inbound payloads carrying these keys parse into typed fields and round-trip. `ttlMs`/`cacheScope` default to `0`/`"private"` (immediately stale, not shared-cacheable); `resultType` defaults to `"complete"` on concrete results (`None` on `EmptyResult`); the server strips all of them from the wire at pre-2026 versions. Servers set per-method values with `cache_hints={method: CacheHint(...)}` on the `Server`/`MCPServer` constructor — see [Caching hints](advanced/caching.md).

### `streamable_http_app()` available on lowlevel Server

Empty file added docs_src/caching/__init__.py
Empty file.
19 changes: 19 additions & 0 deletions docs_src/caching/tutorial001.py
@@ -0,0 +1,19 @@
from mcp.server import CacheHint, MCPServer

mcp = MCPServer(
    "Weather",
    cache_hints={
        "tools/list": CacheHint(ttl_ms=60_000, scope="public"),
        "resources/read": CacheHint(ttl_ms=5_000),
    },
)


@mcp.tool()
def forecast(city: str) -> str:
    return f"Sunny in {city}"


@mcp.resource("config://units")
def units() -> str:
    return "metric"
18 changes: 18 additions & 0 deletions docs_src/caching/tutorial002.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
from typing import Any

from mcp_types import ListToolsResult, PaginatedRequestParams, Tool

from mcp.server import CacheHint, Server, ServerRequestContext

TOOLS = [Tool(name="forecast", input_schema={"type": "object"})]


async def list_tools(ctx: ServerRequestContext[Any], params: PaginatedRequestParams | None) -> ListToolsResult:
    return ListToolsResult(tools=TOOLS, ttl_ms=1_000)


server = Server(
    "Weather",
    on_list_tools=list_tools,
    cache_hints={"tools/list": CacheHint(ttl_ms=60_000, scope="public")},
)
15 changes: 15 additions & 0 deletions docs_src/caching/tutorial003.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
from mcp import Client
from mcp.server import CacheHint, MCPServer

mcp = MCPServer("Weather", cache_hints={"tools/list": CacheHint(ttl_ms=60_000, scope="public")})


@mcp.tool()
def forecast(city: str) -> str:
    return f"Sunny in {city}"


async def main() -> None:
    async with Client(mcp) as client:
        tools = await client.list_tools()
        print(f"{len(tools.tools)} tools, fresh for {tools.ttl_ms / 1000:.0f}s, scope={tools.cache_scope}")
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -43,6 +43,7 @@ nav:
- The low-level Server: advanced/low-level-server.md
- URI templates: advanced/uri-templates.md
- Pagination: advanced/pagination.md
- Caching hints: advanced/caching.md
- Middleware: advanced/middleware.md
- Extensions: advanced/extensions.md
- MCP Apps: advanced/apps.md
3 changes: 2 additions & 1 deletion src/mcp/server/__init__.py
@@ -1,6 +1,7 @@
from .caching import CacheHint
from .context import ServerRequestContext
from .lowlevel import NotificationOptions, Server
from .mcpserver import MCPServer
from .models import InitializationOptions

-__all__ = ["Server", "ServerRequestContext", "MCPServer", "NotificationOptions", "InitializationOptions"]
+__all__ = ["CacheHint", "Server", "ServerRequestContext", "MCPServer", "NotificationOptions", "InitializationOptions"]
98 changes: 98 additions & 0 deletions src/mcp/server/caching.py
@@ -0,0 +1,98 @@
"""Server-side caching hints (SEP-2549, protocol revision 2026-07-28).

Results for the cacheable methods carry `ttlMs`/`cacheScope` freshness hints.
A handler sets them by returning a result with explicit `ttl_ms`/`cache_scope`
values; `Server(cache_hints={method: CacheHint(...)})` fills them for handlers
that don't. Fields the handler set win, per field, so a server-wide hint never
overrides a handler's explicit choice.
"""

from __future__ import annotations

from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any, Final, Literal, TypeVar, get_args

import mcp_types as types

__all__ = ["CACHEABLE_METHODS", "CacheHint", "CacheableMethod", "apply_cache_hint", "validate_cache_hints"]

CacheableMethod = Literal[
    "prompts/list",
    "resources/list",
    "resources/read",
    "resources/templates/list",
    "server/discover",
    "tools/list",
]
"""The methods whose results carry `ttlMs`/`cacheScope`. Closed set: the spec
defines caching hints on exactly these six (tests pin it to which result models
mix in `CacheableResult`)."""

CACHEABLE_METHODS: Final[frozenset[str]] = frozenset(get_args(CacheableMethod))
"""Runtime mirror of `CacheableMethod`, for callers the type checker can't see."""


@dataclass(frozen=True, slots=True)
class CacheHint:
    """Freshness hint for one cacheable method's results.

    `ttl_ms` is how long, in milliseconds, a client may consider the result
    fresh (`0` means immediately stale). `scope` is whether a cached result may
    be shared across authorization contexts (`"public"`) or only reused within
    the one that produced it (`"private"`).
    """

    ttl_ms: int = 0
    scope: Literal["public", "private"] = "private"

    def __post_init__(self) -> None:
        if self.ttl_ms < 0:
            raise ValueError(f"ttl_ms must be >= 0, got {self.ttl_ms}")
        if self.scope not in ("public", "private"):
            raise ValueError(f"scope must be 'public' or 'private', got {self.scope!r}")


CacheableResultT = TypeVar("CacheableResultT", bound=types.CacheableResult)


def apply_cache_hint(result: CacheableResultT, hint: CacheHint) -> CacheableResultT:
    """Fill `ttl_ms`/`cache_scope` on `result` from `hint`.

    Per-field: a field the handler set explicitly - even to its default value,
    tracked via `model_fields_set` - is left alone; only unset fields take the

Check warning on line 63 in src/mcp/server/caching.py


Claude / Claude Code Review

apply_cache_hint docstring misstates model_construct behavior

Comment on lines +56 to +63


🟡 The docstring caveat "A handler constructing results with model_construct bypasses that tracking and is treated as having set nothing" is factually wrong: pydantic's model_construct defaults __pydantic_fields_set__ to the keys actually passed, so an explicit ttl_ms/cache_scope passed via model_construct still wins over the configured hint, exactly like normal construction. The runtime behavior is correct — only the sentence needs correcting or dropping (it only holds when a handler passes _fields_set=set() explicitly).

Extended reasoning...

What the docstring claims vs. what pydantic does. apply_cache_hint's docstring (src/mcp/server/caching.py:56-63) says a handler that builds its result with model_construct "bypasses that tracking and is treated as having set nothing." In pydantic v2, however, model_construct only treats the result as having set nothing when _fields_set is explicitly passed (e.g. _fields_set=set()). When _fields_set is omitted — the common case — pydantic computes __pydantic_fields_set__ from the keyword arguments actually passed, the same as normal validation-based construction.

Concrete proof (verified against the installed pydantic 2.12.5):

from mcp_types import ListToolsResult
from mcp.server.caching import CacheHint, apply_cache_hint

r = ListToolsResult.model_construct(tools=[], ttl_ms=10)
print(r.model_fields_set)        # {'tools', 'ttl_ms'}

filled = apply_cache_hint(r, CacheHint(ttl_ms=60_000, scope="public"))
print(filled.ttl_ms)             # 10  — the handler's explicit value survives
print(filled.cache_scope)        # "public" — only the unset field takes the hint

Step by step: (1) model_construct(tools=[], ttl_ms=10) is called with no _fields_set; (2) pydantic populates __pydantic_fields_set__ = {'tools', 'ttl_ms'} from the passed keys; (3) apply_cache_hint checks "ttl_ms" not in result.model_fields_set, which is False, so it leaves ttl_ms=10 alone and only fills cache_scope. That is identical to the behavior for a normally-constructed ListToolsResult(tools=[], ttl_ms=10) — there is no "bypass".

Why it matters. The runtime code is correct and consistent with the documented precedence model (explicit beats configured, per field), so this is not a behavioral bug. But the caveat could mislead lowlevel-handler authors into believing that using model_construct causes their explicitly-set ttl_ms/cache_scope to be clobbered by the server-wide cache_hints map, prompting unnecessary workarounds (or avoidance of model_construct altogether). The only scenario the sentence accurately describes is a handler that passes _fields_set=set() explicitly, or one that omits the fields entirely — and the latter is true of normal construction too.

How to fix. Either drop the sentence, or reword it to describe the actual edge case, e.g.: "A handler that calls model_construct with an explicit _fields_set that omits these fields is treated as having set nothing for them; by default model_construct tracks the fields actually passed, so explicit values still win."

All three verifiers independently confirmed the factual claim empirically against pydantic 2.12.5 in this repo; none refuted it. Docstring-only, non-blocking.

    hint. A handler constructing results with `model_construct` bypasses that
    tracking and is treated as having set nothing.
    """
    update: dict[str, int | str] = {}
    if "ttl_ms" not in result.model_fields_set:
        update["ttl_ms"] = hint.ttl_ms
    if "cache_scope" not in result.model_fields_set:
        update["cache_scope"] = hint.scope
    return result.model_copy(update=update) if update else result


def validate_cache_hints(cache_hints: Mapping[Any, Any] | None) -> dict[str, CacheHint]:
    """Validate a `cache_hints` constructor argument into a plain dict.

    The `Server`/`MCPServer` signatures already close the key set and value
    type for type-checked callers; this runtime gate is deliberately loose in
    its parameter so it covers everyone else (e.g. a map deserialized from
    config) - a bad entry fails at construction, not on the first request to
    that method.

    Raises:
        ValueError: If a key is not a cacheable method.
        TypeError: If a value is not a `CacheHint`.
    """
    if cache_hints is None:
        return {}
    unknown = sorted(method for method in cache_hints if method not in CACHEABLE_METHODS)

@cubic-dev-ai cubic-dev-ai Bot Jun 29, 2026


P2: Unknown cache_hints keys are formatted as sortable strings, so non-string keys can raise TypeError before the intended ValueError validation error.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp/server/caching.py, line 90:

<comment>Unknown `cache_hints` keys are formatted as sortable strings, so non-string keys can raise `TypeError` before the intended `ValueError` validation error.</comment>

<file context>
@@ -0,0 +1,98 @@
+    """
+    if cache_hints is None:
+        return {}
+    unknown = sorted(method for method in cache_hints if method not in CACHEABLE_METHODS)
+    if unknown:
+        raise ValueError(f"cache_hints keys must be cacheable methods (see CacheableMethod); got: {', '.join(unknown)}")
</file context>
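
A possible shape for the fix, as a stdlib-only sketch rather than a patch to the PR: format each unknown key with `repr()` before sorting and joining, so keys such as `1` or `None` still end in the intended `ValueError` text. `unknown_keys_message` is a hypothetical helper, and `CACHEABLE_METHODS` is re-declared here only to keep the example self-contained.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any

# Re-declared for the example; the PR derives this set from CacheableMethod.
CACHEABLE_METHODS = frozenset(
    {
        "prompts/list",
        "resources/list",
        "resources/read",
        "resources/templates/list",
        "server/discover",
        "tools/list",
    }
)


def unknown_keys_message(cache_hints: Mapping[Any, Any]) -> str | None:
    """Return the error text for unknown keys, or None if every key is legal."""
    # repr() turns every key into a string first, so neither sorted() nor
    # str.join() can raise TypeError on non-string keys.
    unknown = sorted(repr(method) for method in cache_hints if method not in CACHEABLE_METHODS)
    if not unknown:
        return None
    return f"cache_hints keys must be cacheable methods (see CacheableMethod); got: {', '.join(unknown)}"
```

The caller would raise `ValueError` with the returned text whenever it is not `None`. Using `repr()` also quotes string keys in the message, which makes a stray space or wrong case easier to spot.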

    if unknown:
        raise ValueError(f"cache_hints keys must be cacheable methods (see CacheableMethod); got: {', '.join(unknown)}")
    validated: dict[str, CacheHint] = {}
    for method, hint in cache_hints.items():
        if not isinstance(hint, CacheHint):
            raise TypeError(f"cache_hints[{method!r}] must be a CacheHint, got {type(hint).__name__}")
        validated[method] = hint
    return validated
9 changes: 8 additions & 1 deletion src/mcp/server/lowlevel/server.py
@@ -38,7 +38,7 @@ async def main():

import logging
import warnings
-from collections.abc import AsyncIterator, Awaitable, Callable
+from collections.abc import AsyncIterator, Awaitable, Callable, Mapping
from contextlib import AbstractAsyncContextManager, asynccontextmanager
from dataclasses import dataclass
from importlib.metadata import version as importlib_version
@@ -59,6 +59,7 @@ async def main():
from mcp.server.auth.provider import OAuthAuthorizationServerProvider, TokenVerifier
from mcp.server.auth.routes import build_resource_metadata_url, create_auth_routes, create_protected_resource_routes
from mcp.server.auth.settings import AuthSettings
from mcp.server.caching import CacheableMethod, CacheHint, validate_cache_hints
from mcp.server.context import HandlerResult, ServerMiddleware, ServerRequestContext
from mcp.server.models import InitializationOptions
from mcp.server.runner import serve_loop
@@ -140,6 +141,7 @@ def __init__(
instructions: str | None = None,
website_url: str | None = None,
icons: list[types.Icon] | None = None,
cache_hints: Mapping[CacheableMethod, CacheHint] | None = None,
lifespan: Callable[
[Server[LifespanResultT]],
AbstractAsyncContextManager[LifespanResultT],
@@ -222,6 +224,7 @@ def __init__(
instructions: str | None = None,
website_url: str | None = None,
icons: list[types.Icon] | None = None,
cache_hints: Mapping[CacheableMethod, CacheHint] | None = None,
lifespan: Callable[
[Server[LifespanResultT]],
AbstractAsyncContextManager[LifespanResultT],
@@ -313,6 +316,7 @@ def __init__(
instructions: str | None = None,
website_url: str | None = None,
icons: list[types.Icon] | None = None,
cache_hints: Mapping[CacheableMethod, CacheHint] | None = None,
lifespan: Callable[
[Server[LifespanResultT]],
AbstractAsyncContextManager[LifespanResultT],
@@ -420,6 +424,9 @@ def __init__(
self.instructions = instructions
self.website_url = website_url
self.icons = icons
# Per-method `ttl_ms`/`cache_scope` fills, applied by `ServerRunner`
# after the handler returns; fields the handler set explicitly win.
self.cache_hints: dict[str, CacheHint] = validate_cache_hints(cache_hints)
self.lifespan = lifespan
self._request_handlers: dict[str, HandlerEntry[LifespanResultT]] = {}
self._notification_handlers: dict[str, HandlerEntry[LifespanResultT]] = {}
5 changes: 4 additions & 1 deletion src/mcp/server/mcpserver/server.py
@@ -4,7 +4,7 @@

import base64
import inspect
-from collections.abc import AsyncIterator, Awaitable, Callable, Iterable, Sequence
+from collections.abc import AsyncIterator, Awaitable, Callable, Iterable, Mapping, Sequence
from contextlib import AbstractAsyncContextManager, asynccontextmanager
from typing import Any, Generic, Literal, TypeVar, overload

@@ -58,6 +58,7 @@
from mcp.server.auth.middleware.bearer_auth import BearerAuthBackend, RequireAuthMiddleware
from mcp.server.auth.provider import OAuthAuthorizationServerProvider, ProviderTokenVerifier, TokenVerifier
from mcp.server.auth.settings import AuthSettings
from mcp.server.caching import CacheableMethod, CacheHint
from mcp.server.context import HandlerResult, ServerRequestContext
from mcp.server.extension import (
Extension,
@@ -169,6 +170,7 @@ def __init__(
lifespan: Callable[[MCPServer[LifespanResultT]], AbstractAsyncContextManager[LifespanResultT]] | None = None,
auth: AuthSettings | None = None,
resource_security: ResourceSecurity = DEFAULT_RESOURCE_SECURITY,
cache_hints: Mapping[CacheableMethod, CacheHint] | None = None,
):
self._resource_security = resource_security
self.settings = Settings(
@@ -196,6 +198,7 @@ def __init__(
website_url=website_url,
icons=icons,
version=version,
cache_hints=cache_hints,
on_list_tools=self._handle_list_tools,
on_call_tool=self._handle_call_tool,
on_list_resources=self._handle_list_resources,
8 changes: 8 additions & 0 deletions src/mcp/server/runner.py
@@ -28,6 +28,7 @@
INVALID_PARAMS,
METHOD_NOT_FOUND,
PROTOCOL_VERSION_META_KEY,
CacheableResult,
ErrorData,
Implementation,
InitializeRequestParams,
@@ -40,6 +41,7 @@
from pydantic import BaseModel, ValidationError
from typing_extensions import TypeVar

from mcp.server.caching import apply_cache_hint
from mcp.server.connection import Connection
from mcp.server.context import CallNext, HandlerResult, ServerMiddleware, ServerRequestContext
from mcp.server.models import InitializationOptions
@@ -196,6 +198,12 @@ async def _inner(ctx: ServerRequestContext[LifespanT, Any]) -> HandlerResult:
if isinstance(result, ErrorData):
    # Raise inside the chain so middleware observes the failure.
    raise MCPError.from_error_data(result)
# Fill cache hints on the typed result, before the serialize sieve
# decides whether the negotiated version carries the fields at all.
# `input_required` interim results are not `CacheableResult` models,
# so the MRTR carve-out (no hints on them) holds by shape.
if isinstance(result, CacheableResult) and (hint := self.server.cache_hints.get(method)) is not None:
    result = apply_cache_hint(result, hint)
Comment on lines +201 to +206


🟡 When a lowlevel handler (registered via add_request_handler, or a middleware short-circuit) returns a plain dict for a cacheable method like tools/list, the new hint-stamping step is skipped because it only runs for isinstance(result, CacheableResult), so the configured cache_hints entry is silently dropped and the 2026 wire emits the schema defaults ttlMs: 0 / cacheScope: "private" instead. Consider also filling absent keys on the dict path, or documenting the limitation in docs/advanced/caching.md.

Extended reasoning...

What the bug is. HandlerResult (src/mcp/server/context.py:114) is BaseModel | dict[str, Any] | None, so a handler registered with add_request_handler (and a server-tier middleware that short-circuits) may legitimately return a plain dict for one of the six cacheable methods. The PR's stamping step in src/mcp/server/runner.py:205 is gated on isinstance(result, CacheableResult), which a dict never satisfies, so apply_cache_hint is never invoked for dict results. The dict then flows through _serialize into serialize_server_result, where the 2026-07-28 surface model fills its non-None defaults (ttlMs=0, cacheScope="private"), and exclude_none=True keeps them, so the wire actively asserts "immediately stale, never shared" despite the operator's configured CacheHint.

Code path. Concrete walk-through:

1. Server("srv", cache_hints={"tools/list": CacheHint(ttl_ms=60_000, scope="public")}).
2. server.add_request_handler("tools/list", PaginatedRequestParams, handler) where handler returns {"tools": [{"name": "t", "inputSchema": {"type": "object"}}]}, exactly the shape exercised by existing tests such as test_runner_handler_returning_malformed_dict_for_spec_method_is_internal_error (well-formed variant) and the _dump_result dict branch.
3. A 2026-07-28 client calls tools/list. In _inner, result is a dict, so isinstance(result, CacheableResult) is False and the apply_cache_hint line at runner.py:205-206 is skipped.
4. _serialize calls serialize_server_result("tools/list", "2026-07-28", dumped); the result model validates the dict, fills ttlMs=0 / cacheScope="private" from its defaults, and dumps them onto the wire.
5. The client sees ttlMs: 0, cacheScope: "private": the operator's 60s/public hint never applied, with no warning or error.

Why existing code doesn't prevent it. The precedence mechanism (model_fields_set in apply_cache_hint) only exists on the typed-result path; a dict has no model_fields_set, and the serialize sieve has no knowledge of cache_hints. The PR's tests all return typed result models, so this path is untested. This contradicts the documented model in docs/advanced/caching.md ("explicit beats configured, configured beats default", "emitting them is the server's job, and the SDK does it for you"): a dict that omits both keys set neither field explicitly, yet still gets the defaults rather than the configured hint.

Impact. Narrow but real: it only bites the combination of cache_hints + a dict-returning handler for one of the six cacheable methods at a 2026 negotiation. All typed on_* constructor handlers and the entire MCPServer tier return result models and are unaffected, and the failure mode is conservative (defaults mean no caching, never an unsafe over-share). The authors clearly considered nearby edge cases (the model_construct caveat in apply_cache_hint's docstring), but this one is neither stamped nor documented.

How to fix. Either (a) extend the runner so dict results for cacheable methods also get the configured hint, filling only keys absent from the dict (e.g. check "ttlMs" not in result / "cacheScope" not in result before inserting the hint values), preserving the per-field precedence model; or (b) add a sentence to docs/advanced/caching.md noting that cache_hints only applies to typed result models, so handlers returning raw dicts (and short-circuiting middleware) own the ttlMs/cacheScope keys themselves.
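
Option (a) could look roughly like the sketch below. It is an illustration, not the runner's code: `fill_dict_hints` is a hypothetical helper, it takes the hint's two values as plain arguments to stay self-contained, and it assumes dict results use the wire key names `ttlMs` and `cacheScope`, as the check suggested above does.

```python
from __future__ import annotations

from typing import Any


def fill_dict_hints(result: dict[str, Any], ttl_ms: int, scope: str) -> dict[str, Any]:
    """Return a copy of `result` with absent ttlMs/cacheScope keys filled in.

    Keys the handler already wrote are left alone, mirroring the per-field
    precedence of the typed-result path.
    """
    filled = dict(result)  # shallow copy, so the handler's dict is not mutated
    filled.setdefault("ttlMs", ttl_ms)
    filled.setdefault("cacheScope", scope)
    return filled
```

With a configured hint of 60 000 ms and `"public"`, a handler dict that sets only `ttlMs` keeps its own TTL and picks up the configured scope, which matches what the typed path does through `model_fields_set`.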

# Dump and serialize inside the chain so the OpenTelemetry span (the
# outermost middleware) records a failing handler return shape too.
return self._serialize(method, version, result)