Add cache_hints constructor map for SEP-2549 caching hints #3015
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,64 @@ | ||
| # Caching hints | ||
|
|
||
| Every result a server returns for `tools/list`, `prompts/list`, `resources/list`, `resources/templates/list`, `resources/read` and `server/discover` carries two fields on the 2026-07-28 protocol: `ttlMs`, how many milliseconds a client may treat the result as fresh, and `cacheScope`, whether a cached result may be shared across users (`"public"`) or belongs to one authorization context (`"private"`). | ||
|
|
||
| The server doesn't cache anything. The fields are a *declaration*: "this tool list is the same for everyone and won't change for a minute." A client (or a gateway in front of you) may then skip the round trip. Honoring the hints is the client's choice; emitting them is the server's job, and the SDK does it for you. | ||
|
|
||
| Out of the box every result says `ttlMs: 0, cacheScope: "private"` — immediately stale, never shared. That is always safe and always conformant. If your lists really are stable and identical for all callers, say so at construction: | ||
|
|
||
| ```python title="server.py" hl_lines="5-8" | ||
| --8<-- "docs_src/caching/tutorial001.py" | ||
| ``` | ||
|
|
||
| * The map is keyed by **method name** — the six cacheable methods are the only legal keys. The parameter is typed `Mapping[CacheableMethod, CacheHint]`, so your editor autocompletes the keys and flags a typo before you run; anything that slips past the type checker raises at construction. | ||
| * A method you don't mention keeps the defaults. The map is a set of overrides, not a manifest. | ||
| * `CacheHint(ttl_ms=5_000)` left `scope` unset, so it stays `"private"`: five seconds of freshness, per caller. Scope and TTL are independent decisions. | ||
| * `"server/discover"` is a legal key too — the handshake result is cacheable like any list. | ||
|
|
||
| !!! warning | ||
| `cacheScope: "public"` means *anyone* may be served your cached response — a shared | ||
| gateway will happily hand one user's result to another, even when the request was | ||
| authenticated. Mark a result `"public"` only when it is identical for every caller, and | ||
| never use `cacheScope` as access control: it is a label, not a lock. | ||
|
|
||
| ## Per-handler override | ||
|
|
||
| On the low-level `Server`, handlers build their results by hand — and `ttl_ms` / `cache_scope` are just fields on the result models. A handler that sets them explicitly always wins over the constructor map, field by field: | ||
|
|
||
| ```python title="server.py" hl_lines="11 17" | ||
| --8<-- "docs_src/caching/tutorial002.py" | ||
| ``` | ||
|
|
||
| The handler said `ttl_ms=1_000` and nothing about scope. On the wire: `ttlMs: 1000` (the handler's, not the map's `60_000`) and `cacheScope: "public"` (the map's — the handler left it unset). Explicit beats configured, configured beats default — per field, so a handler can pin one field and leave the other to the server-wide policy. | ||
|
|
||
| This is also the escape hatch for dynamics the constructor can't know: a handler that filters `resources/read` per user can return `cache_scope="private"` for one URI from an otherwise-public server. | ||
|
|
||
| One caveat on paginated lists: the protocol requires the **same `cacheScope` on every page** of one list. The constructor map satisfies that by construction — it's keyed by method, not by page. But a handler that overrides the scope itself owns that consistency: override it on *every* page, never only when a cursor is present, or page one and page two will disagree. | ||
|
|
||
| ## What the client sees | ||
|
|
||
| On the client, the hints arrive as plain fields on every cacheable result — `ttl_ms` and `cache_scope`, already parsed: | ||
|
|
||
| ```python title="client.py" hl_lines="15" | ||
| --8<-- "docs_src/caching/tutorial003.py" | ||
| ``` | ||
|
|
||
| The SDK parses; it does not (yet) act. There is no built-in response cache: calling `list_tools()` twice makes two round trips, whatever the TTL said. The spec makes honoring optional — a client that ignores the hints entirely is fully conformant — so until the SDK grows a response cache, the supported path is to read the fields and do your own bookkeeping: | ||
|
|
||
| * **Freshness** is `now < t_received + ttl_ms / 1000`: record the clock when the response arrives, and treat the result as reusable until the TTL runs out. `ttl_ms == 0` means *immediately stale* — don't reuse it at all. | ||
| * **Scope is a sharing rule, not a suggestion.** A `"private"` result may be reused only within the same authorization context — same access token, same cache. Never put `"private"` results in a cache shared across users. | ||
| * **Notifications beat TTL.** If the server sends `list_changed` while your copy is still fresh, the copy is stale now — re-fetch. | ||
|
|
||
| Against an **older server** (pre-2026 protocol), the fields are simply absent from the wire, and the models show their conservative defaults: `ttl_ms == 0`, `cache_scope == "private"` — stale and unshared, the right assumption for a server that declared nothing. If you need to distinguish "the server said 0" from "the server said nothing", check `"ttl_ms" in result.model_fields_set`: it's only set when the field actually arrived. | ||
|
|
||
| ## Older clients | ||
|
|
||
| Clients on pre-2026 protocol versions never see either field — the SDK strips them at serialization for those connections. Configure your hints once; there is nothing version-specific to write. | ||
|
|
||
| ## Recap | ||
|
|
||
| * Six methods carry `ttlMs`/`cacheScope`; the SDK defaults them to `0`/`"private"` — stale and unshared, always safe. | ||
| * `cache_hints={method: CacheHint(...)}` at construction (both `MCPServer` and `Server`) sets server-wide values per method. | ||
| * A handler that sets the fields on its result overrides the map, per field. | ||
| * `"public"` is a promise that the result is identical for every caller. It is not access control. | ||
| * Clients read the hints as `result.ttl_ms` / `result.cache_scope` and own the caching decision themselves — the SDK has no built-in response cache yet. |
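The client-side bookkeeping this page describes (freshness, scope, and notifications beating TTL) could be sketched roughly as below. This is a standard-library illustration only, not something the SDK ships: `HintCache`, its method names, the `auth_context` string, and the injectable `clock` are all hypothetical, and the caller is assumed to pass in `ttl_ms` and `cache_scope` as read from a result.

```python
from __future__ import annotations

import time
from typing import Any, Callable


class HintCache:
    """Hypothetical client-side response cache driven by ttl_ms / cache_scope."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # Key is (method, scope, owner); owner is "" for public entries.
        self._entries: dict[tuple[str, str, str], tuple[float, Any]] = {}

    def put(self, method: str, auth_context: str, value: Any, ttl_ms: int, cache_scope: str) -> None:
        if ttl_ms <= 0:
            return  # ttl_ms == 0 means immediately stale: never store it
        # Anything that is not explicitly "public" is treated as private.
        scope = "public" if cache_scope == "public" else "private"
        owner = "" if scope == "public" else auth_context
        expires_at = self._clock() + ttl_ms / 1000  # t_received + ttl_ms / 1000
        self._entries[(method, scope, owner)] = (expires_at, value)

    def get(self, method: str, auth_context: str) -> Any | None:
        now = self._clock()
        # A private entry is visible only to the context that stored it;
        # a public entry is visible to every caller.
        for key in ((method, "private", auth_context), (method, "public", "")):
            entry = self._entries.get(key)
            if entry is not None and now < entry[0]:
                return entry[1]
        return None

    def invalidate(self, method: str) -> None:
        """Drop every cached copy for a method, e.g. on a list_changed notification."""
        for key in [k for k in self._entries if k[0] == method]:
            del self._entries[key]
```

A caller would `put` after each fetch, `get` before the next request, and `invalidate` when the server signals a change. Keying private entries by authorization context is one way to keep them out of any shared slot; a real client might prefer a separate cache object per session.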
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| from mcp.server import CacheHint, MCPServer | ||
|
|
||
| mcp = MCPServer( | ||
|     "Weather", | ||
|     cache_hints={ | ||
|         "tools/list": CacheHint(ttl_ms=60_000, scope="public"), | ||
|         "resources/read": CacheHint(ttl_ms=5_000), | ||
|     }, | ||
| ) | ||
|
|
||
|
|
||
| @mcp.tool() | ||
| def forecast(city: str) -> str: | ||
|     return f"Sunny in {city}" | ||
|
|
||
|
|
||
| @mcp.resource("config://units") | ||
| def units() -> str: | ||
|     return "metric" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| from typing import Any | ||
|
|
||
| from mcp_types import ListToolsResult, PaginatedRequestParams, Tool | ||
|
|
||
| from mcp.server import CacheHint, Server, ServerRequestContext | ||
|
|
||
| TOOLS = [Tool(name="forecast", input_schema={"type": "object"})] | ||
|
|
||
|
|
||
| async def list_tools(ctx: ServerRequestContext[Any], params: PaginatedRequestParams | None) -> ListToolsResult: | ||
|     return ListToolsResult(tools=TOOLS, ttl_ms=1_000) | ||
|
|
||
|
|
||
| server = Server( | ||
|     "Weather", | ||
|     on_list_tools=list_tools, | ||
|     cache_hints={"tools/list": CacheHint(ttl_ms=60_000, scope="public")}, | ||
| ) |
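The precedence this example exercises (explicit beats configured, configured beats default, decided per field) can be illustrated without the SDK. The sketch below is a stand-in rather than the SDK's implementation: `resolve_hints` and `to_wire` are hypothetical helpers, and the handler's explicitly set fields are modelled as a plain dict.

```python
from __future__ import annotations

# SDK defaults as described in the docs: immediately stale, never shared.
DEFAULTS: dict[str, object] = {"ttl_ms": 0, "cache_scope": "private"}


def resolve_hints(explicit: dict[str, object], configured: dict[str, object] | None) -> dict[str, object]:
    """Merge hint sources per field: explicit > configured > default."""
    resolved = dict(DEFAULTS)
    if configured is not None:
        resolved.update(configured)  # server-wide map overrides the defaults
    resolved.update(explicit)  # fields the handler set override the map
    return resolved


def to_wire(hints: dict[str, object]) -> dict[str, object]:
    """Rename the Python field names to their wire spellings."""
    return {"ttlMs": hints["ttl_ms"], "cacheScope": hints["cache_scope"]}


# The handler above set only ttl_ms; the constructor map set both fields.
handler_fields = {"ttl_ms": 1_000}
server_map = {"ttl_ms": 60_000, "cache_scope": "public"}
wire = to_wire(resolve_hints(handler_fields, server_map))
```

Under these assumptions `wire` holds `ttlMs: 1000` from the handler and `cacheScope: "public"` from the map, which matches the wire result the docs describe for this file.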
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| from mcp import Client | ||
| from mcp.server import CacheHint, MCPServer | ||
|
|
||
| mcp = MCPServer("Weather", cache_hints={"tools/list": CacheHint(ttl_ms=60_000, scope="public")}) | ||
|
|
||
|
|
||
| @mcp.tool() | ||
| def forecast(city: str) -> str: | ||
|     return f"Sunny in {city}" | ||
|
|
||
|
|
||
| async def main() -> None: | ||
|     async with Client(mcp) as client: | ||
|         tools = await client.list_tools() | ||
|         print(f"{len(tools.tools)} tools, fresh for {tools.ttl_ms / 1000:.0f}s, scope={tools.cache_scope}") |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,6 +1,7 @@ | ||
| + from .caching import CacheHint | ||
| from .context import ServerRequestContext | ||
| from .lowlevel import NotificationOptions, Server | ||
| from .mcpserver import MCPServer | ||
| from .models import InitializationOptions | ||
|
|
||
| - __all__ = ["Server", "ServerRequestContext", "MCPServer", "NotificationOptions", "InitializationOptions"] | ||
| + __all__ = ["CacheHint", "Server", "ServerRequestContext", "MCPServer", "NotificationOptions", "InitializationOptions"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,98 @@ | ||
| """Server-side caching hints (SEP-2549, protocol revision 2026-07-28). | ||
|
|
||
| Results for the cacheable methods carry `ttlMs`/`cacheScope` freshness hints. | ||
| A handler sets them by returning a result with explicit `ttl_ms`/`cache_scope` | ||
| values; `Server(cache_hints={method: CacheHint(...)})` fills them for handlers | ||
| that don't. Fields the handler set win, per field, so a server-wide hint never | ||
| overrides a handler's explicit choice. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| from collections.abc import Mapping | ||
| from dataclasses import dataclass | ||
| from typing import Any, Final, Literal, TypeVar, get_args | ||
|
|
||
| import mcp_types as types | ||
|
|
||
| __all__ = ["CACHEABLE_METHODS", "CacheHint", "CacheableMethod", "apply_cache_hint", "validate_cache_hints"] | ||
|
|
||
| CacheableMethod = Literal[ | ||
|     "prompts/list", | ||
|     "resources/list", | ||
|     "resources/read", | ||
|     "resources/templates/list", | ||
|     "server/discover", | ||
|     "tools/list", | ||
| ] | ||
| """The methods whose results carry `ttlMs`/`cacheScope`. Closed set: the spec | ||
| defines caching hints on exactly these six (tests pin it to which result models | ||
| mix in `CacheableResult`).""" | ||
|
|
||
| CACHEABLE_METHODS: Final[frozenset[str]] = frozenset(get_args(CacheableMethod)) | ||
| """Runtime mirror of `CacheableMethod`, for callers the type checker can't see.""" | ||
|
|
||
|
|
||
| @dataclass(frozen=True, slots=True) | ||
| class CacheHint: | ||
|     """Freshness hint for one cacheable method's results. | ||
|
|
||
|     `ttl_ms` is how long, in milliseconds, a client may consider the result | ||
|     fresh (`0` means immediately stale). `scope` is whether a cached result may | ||
|     be shared across authorization contexts (`"public"`) or only reused within | ||
|     the one that produced it (`"private"`). | ||
|     """ | ||
|
|
||
|     ttl_ms: int = 0 | ||
|     scope: Literal["public", "private"] = "private" | ||
|
|
||
|     def __post_init__(self) -> None: | ||
|         if self.ttl_ms < 0: | ||
|             raise ValueError(f"ttl_ms must be >= 0, got {self.ttl_ms}") | ||
|         if self.scope not in ("public", "private"): | ||
|             raise ValueError(f"scope must be 'public' or 'private', got {self.scope!r}") | ||
|
|
||
|
|
||
| CacheableResultT = TypeVar("CacheableResultT", bound=types.CacheableResult) | ||
|
|
||
|
|
||
| def apply_cache_hint(result: CacheableResultT, hint: CacheHint) -> CacheableResultT: | ||
|     """Fill `ttl_ms`/`cache_scope` on `result` from `hint`. | ||
|
|
||
|     Per-field: a field the handler set explicitly - even to its default value, | ||
|     tracked via `model_fields_set` - is left alone; only unset fields take the | ||
|     hint. A handler constructing results with `model_construct` bypasses that | ||
|     tracking and is treated as having set nothing. | ||
|     """ | ||
|     update: dict[str, int | str] = {} | ||
|     if "ttl_ms" not in result.model_fields_set: | ||
|         update["ttl_ms"] = hint.ttl_ms | ||
|     if "cache_scope" not in result.model_fields_set: | ||
|         update["cache_scope"] = hint.scope | ||
|     return result.model_copy(update=update) if update else result | ||
|
|
||
|
|
||
| def validate_cache_hints(cache_hints: Mapping[Any, Any] | None) -> dict[str, CacheHint]: | ||
|     """Validate a `cache_hints` constructor argument into a plain dict. | ||
|
|
||
|     The `Server`/`MCPServer` signatures already close the key set and value | ||
|     type for type-checked callers; this runtime gate is deliberately loose in | ||
|     its parameter so it covers everyone else (e.g. a map deserialized from | ||
|     config) - a bad entry fails at construction, not on the first request to | ||
|     that method. | ||
|
|
||
|     Raises: | ||
|         ValueError: If a key is not a cacheable method. | ||
|         TypeError: If a value is not a `CacheHint`. | ||
|     """ | ||
|     if cache_hints is None: | ||
|         return {} | ||
|     unknown = sorted(method for method in cache_hints if method not in CACHEABLE_METHODS) | ||
|
P2: Unknown
||
|     if unknown: | ||
|         raise ValueError(f"cache_hints keys must be cacheable methods (see CacheableMethod); got: {', '.join(unknown)}") | ||
|     validated: dict[str, CacheHint] = {} | ||
|     for method, hint in cache_hints.items(): | ||
|         if not isinstance(hint, CacheHint): | ||
|             raise TypeError(f"cache_hints[{method!r}] must be a CacheHint, got {type(hint).__name__}") | ||
|         validated[method] = hint | ||
|     return validated | ||
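The `validate_cache_hints` docstring mentions a map deserialized from config as the case the runtime gate exists for. A rough sketch of that path follows, under stated assumptions: `hints_from_json` is a hypothetical loader, the JSON layout is invented for illustration, and `CacheHint` here is a local stand-in that mirrors the dataclass in this file so the snippet runs without the SDK.

```python
from __future__ import annotations

import json
from dataclasses import dataclass

# The six cacheable methods, copied from CacheableMethod in this file.
CACHEABLE_METHODS = frozenset({
    "prompts/list",
    "resources/list",
    "resources/read",
    "resources/templates/list",
    "server/discover",
    "tools/list",
})


@dataclass(frozen=True)
class CacheHint:
    """Local stand-in mirroring the CacheHint dataclass in this diff."""

    ttl_ms: int = 0
    scope: str = "private"

    def __post_init__(self) -> None:
        if self.ttl_ms < 0:
            raise ValueError(f"ttl_ms must be >= 0, got {self.ttl_ms}")
        if self.scope not in ("public", "private"):
            raise ValueError(f"scope must be 'public' or 'private', got {self.scope!r}")


def hints_from_json(text: str) -> dict[str, CacheHint]:
    """Hypothetical loader: parse a JSON object of method -> hint fields."""
    raw = json.loads(text)
    # Reject unknown methods up front, as the constructor-time gate does.
    unknown = sorted(method for method in raw if method not in CACHEABLE_METHODS)
    if unknown:
        raise ValueError(f"unknown cacheable methods: {', '.join(unknown)}")
    return {method: CacheHint(**fields) for method, fields in raw.items()}
```

With the real SDK the resulting dict would presumably be passed as `cache_hints=` to the constructor, where the validation shown in the diff would run again; failing in the loader simply reports the bad entry closer to the config file.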
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -28,6 +28,7 @@ | |
| INVALID_PARAMS, | ||
| METHOD_NOT_FOUND, | ||
| PROTOCOL_VERSION_META_KEY, | ||
| + CacheableResult, | ||
| ErrorData, | ||
| Implementation, | ||
| InitializeRequestParams, | ||
|
|
@@ -40,6 +41,7 @@ | |
| from pydantic import BaseModel, ValidationError | ||
| from typing_extensions import TypeVar | ||
|
|
||
| + from mcp.server.caching import apply_cache_hint | ||
| from mcp.server.connection import Connection | ||
| from mcp.server.context import CallNext, HandlerResult, ServerMiddleware, ServerRequestContext | ||
| from mcp.server.models import InitializationOptions | ||
|
|
@@ -196,6 +198,12 @@ async def _inner(ctx: ServerRequestContext[LifespanT, Any]) -> HandlerResult: | |
| if isinstance(result, ErrorData): | ||
| # Raise inside the chain so middleware observes the failure. | ||
| raise MCPError.from_error_data(result) | ||
| + # Fill cache hints on the typed result, before the serialize sieve | ||
| + # decides whether the negotiated version carries the fields at all. | ||
| + # `input_required` interim results are not `CacheableResult` models, | ||
| + # so the MRTR carve-out (no hints on them) holds by shape. | ||
| + if isinstance(result, CacheableResult) and (hint := self.server.cache_hints.get(method)) is not None: | ||
| + result = apply_cache_hint(result, hint) | ||
|
Comment on lines +201 to +206 (Contributor)
🟡 When a lowlevel handler (registered via
Extended reasoning...
What the bug is.
||
| # Dump and serialize inside the chain so the OpenTelemetry span (the | ||
| # outermost middleware) records a failing handler return shape too. | ||
| return self._serialize(method, version, result) | ||
|
|
||
🟡 The docstring caveat "A handler constructing results with `model_construct` bypasses that tracking and is treated as having set nothing" is factually wrong: pydantic's `model_construct` defaults `__pydantic_fields_set__` to the keys actually passed, so an explicit `ttl_ms`/`cache_scope` passed via `model_construct` still wins over the configured hint, exactly like normal construction. The runtime behavior is correct; only the sentence needs correcting or dropping (it only holds when a handler passes `_fields_set=set()` explicitly).

Extended reasoning...

**What the docstring claims vs. what pydantic does.** `apply_cache_hint`'s docstring (`src/mcp/server/caching.py:56-63`) says a handler that builds its result with `model_construct` "bypasses that tracking and is treated as having set nothing." In pydantic v2, however, `model_construct` only treats the result as having set nothing when `_fields_set` is explicitly passed (e.g. `_fields_set=set()`). When `_fields_set` is omitted, which is the common case, pydantic computes `__pydantic_fields_set__` from the keyword arguments actually passed, the same as normal validation-based construction.

**Concrete proof** (verified against the installed pydantic 2.12.5), step by step: (1) `model_construct(tools=[], ttl_ms=10)` is called with no `_fields_set`; (2) pydantic populates `__pydantic_fields_set__ = {'tools', 'ttl_ms'}` from the passed keys; (3) `apply_cache_hint` checks `"ttl_ms" not in result.model_fields_set`, which is False, so it leaves `ttl_ms=10` alone and only fills `cache_scope`. That is identical to the behavior for a normally-constructed `ListToolsResult(tools=[], ttl_ms=10)`: there is no "bypass".

**Why it matters.** The runtime code is correct and consistent with the documented precedence model (explicit beats configured, per field), so this is not a behavioral bug. But the caveat could mislead lowlevel-handler authors into believing that using `model_construct` causes their explicitly-set `ttl_ms`/`cache_scope` to be clobbered by the server-wide `cache_hints` map, prompting unnecessary workarounds (or avoidance of `model_construct` altogether). The only scenario the sentence accurately describes is a handler that passes `_fields_set=set()` explicitly, or one that omits the fields entirely, and the latter is true of normal construction too.

**How to fix.** Either drop the sentence, or reword it to describe the actual edge case, e.g.: "A handler that calls `model_construct` with an explicit `_fields_set` that omits these fields is treated as having set nothing for them; by default `model_construct` tracks the fields actually passed, so explicit values still win."

All three verifiers independently confirmed the factual claim empirically against pydantic 2.12.5 in this repo; none refuted it. Docstring-only, non-blocking.
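The proof snippet this comment refers to is not reproduced on the page. A minimal sketch along the same lines is shown below, assuming pydantic v2 is installed; `FakeListResult` and `fill` are hypothetical stand-ins for `ListToolsResult` and `apply_cache_hint`, so this illustrates the field-tracking behavior rather than the SDK's actual models.

```python
from pydantic import BaseModel


class FakeListResult(BaseModel):
    """Hypothetical stand-in for a cacheable result model."""

    tools: list[str] = []
    ttl_ms: int = 0
    cache_scope: str = "private"


def fill(result: FakeListResult, ttl_ms: int, scope: str) -> FakeListResult:
    """Stand-in for apply_cache_hint: fill only the fields that were never set."""
    update: dict[str, object] = {}
    if "ttl_ms" not in result.model_fields_set:
        update["ttl_ms"] = ttl_ms
    if "cache_scope" not in result.model_fields_set:
        update["cache_scope"] = scope
    return result.model_copy(update=update) if update else result


# Normal construction and model_construct track the same explicitly passed keys.
normal = FakeListResult(tools=[], ttl_ms=10)
constructed = FakeListResult.model_construct(tools=[], ttl_ms=10)

# Only an explicit, empty _fields_set makes the result look untouched.
emptied = FakeListResult.model_construct(_fields_set=set(), tools=[], ttl_ms=10)

filled_constructed = fill(constructed, ttl_ms=60_000, scope="public")
filled_emptied = fill(emptied, ttl_ms=60_000, scope="public")
```

In this sketch `filled_constructed` keeps the handler's `ttl_ms` of 10 and picks up only the scope, while `filled_emptied` takes both values from the hint, which is the one edge case the docstring sentence actually describes.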