Work through the spec-conformance gaps recorded by the interaction suite#2999
Work through the spec-conformance gaps recorded by the interaction suite#2999maxisbey wants to merge 14 commits into
Conversation
5b9582e to
abf4ce4
Compare
There was a problem hiding this comment.
2 issues found across 29 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/mcp/shared/auth_utils.py">
<violation number="1" location="src/mcp/shared/auth_utils.py:32">
P2: Audience canonicalization can throw `ValueError` on malformed resource ports, breaking auth flow with a server error.</violation>
<violation number="2" location="src/mcp/shared/auth_utils.py:83">
P2: Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like `/api` and `/api/` are treated as equivalent.</violation>
</file>
Reply with feedback, questions, or to request a fix.
Fix all with cubic | Re-trigger cubic
| origin is NOT for this server. Contrast check_resource_allowed, which is the | ||
| client-side hierarchical question and intentionally more permissive. | ||
| """ | ||
| return resource_url_from_server_url(token_resource).rstrip("/") == resource_url_from_server_url( |
There was a problem hiding this comment.
P2: Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like /api and /api/ are treated as equivalent.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp/shared/auth_utils.py, line 83:
<comment>Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like `/api` and `/api/` are treated as equivalent.</comment>
<file context>
@@ -65,6 +72,19 @@ def check_resource_allowed(requested_resource: str, configured_resource: str) ->
+ origin is NOT for this server. Contrast check_resource_allowed, which is the
+ client-side hierarchical question and intentionally more permissive.
+ """
+ return resource_url_from_server_url(token_resource).rstrip("/") == resource_url_from_server_url(
+ server_resource
+ ).rstrip("/")
</file context>
There was a problem hiding this comment.
You are right at the RFC 3986 layer — strict equivalence distinguishes /api/ from /api. But RFC 8707 defines no comparison algorithm, and its own examples use trailing-slash resource URIs. Two things make the normalization load-bearing: pydantic's AnyHttpUrl renders a root resource as https://h/ while the spec's example token request sends the slashless form — strict comparison would 401 every conformant client of a root-path deployment. And the compared value is the verifier-asserted audience from the validated token, not raw client input; exploiting the leniency would need two distinct resource servers at one origin differing only by a trailing slash, which a Starlette mount cannot host. Keeping the rstrip; c53aefd adds a comment plus tests pinning the equivalence and the sibling/child-path rejections.
There was a problem hiding this comment.
The parent comment was wrong here: this audience comparison is intentionally load-bearing for RFC 8707 resource validation. The root-path and verifier-asserted-audience cases make the trailing-slash normalization necessary, so keeping the rstrip('/') is correct.
There was a problem hiding this comment.
1 issue found across 36 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/mcp/shared/auth_utils.py">
<violation number="1" location="src/mcp/shared/auth_utils.py:83">
P2: Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like `/api` and `/api/` are treated as equivalent.</violation>
</file>
Reply with feedback, questions, or to request a fix.
Fix all with cubic | Re-trigger cubic
8c6b78b to
0dec196
Compare
…haviour - resources:annotations: drop the stale lastModified divergence; the model now carries the field and it round-trips. - lifecycle:capability:client-not-declared: narrow to the one remaining gap (the deprecated send_roots_list_changed path); the handler half is correct by construction. - lifecycle:pre-initialization-ordering: mark removed_in=2026-07-28; the initialize handshake is gone in the new spec. - client-transport:http:session-404-reinitialize: reword as an intentional cross-SDK choice (404 surfaced to caller), not a missed MUST. - test_resources: drop the now-redundant lastModified xfail.
BearerAuthBackend / RequireAuthMiddleware now produce spec-conformant challenges and reject tokens issued for a different resource server. - A request with no credentials gets a bare `Bearer` challenge (with scope/resource_metadata only), not error="invalid_token" -- RFC 6750 Section 3.1 says the error attribute SHOULD NOT appear when no authentication information was presented. - A malformed/unknown token, an expired token, or a token whose audience does not match the configured resource_server_url is answered 401 invalid_token with a specific error_description, carried via a new InvalidTokenUser marker so the middleware can distinguish it from no-credentials. - All challenges (401 and the 403 insufficient_scope path) now advertise the required scopes in a `scope=` parameter, which the SDK client already reads to drive step-up. - New check_token_audience() helper canonicalises default ports before comparing, and is wired through both the lowlevel and MCPServer Starlette stacks via the auth settings' resource_server_url. Docs and migration guide updated; the corresponding interaction-suite divergence entries are now closed.
… to -32603 A request cancelled via notifications/cancelled now gets no response: when the handler scope's cancel is caught, JSONRPCDispatcher returns instead of writing an error. The sender retired its own waiter when it cancelled, so no reply is needed to unblock it. An unhandled exception in a request handler now produces JSON-RPC error -32603 (INTERNAL_ERROR) with the opaque message "Internal server error" instead of code 0 carrying str(exc). The exception is still logged server-side. To send a specific code/message, raise MCPError; pydantic ValidationError still maps to INVALID_PARAMS. handler_exception_to_error_data is now total: it returns MappedError(error: ErrorData, unexpected: bool) for every Exception. Callers gate logger.exception / raise_handler_exceptions on the unexpected flag rather than re-deriving the rung set, so a handler that deliberately raises MCPError(code=INTERNAL_ERROR) is not treated as a crash. JSONRPCDispatcher, DirectDispatcher, and the modern HTTP entry's _to_jsonrpc_response all call the one helper; DirectDispatcher no longer hand-rolls its own ladder. The interaction suite's protocol:cancel:in-flight and protocol:error:internal-error requirements drop their divergence entries and modern-error-surface arm exclusions; the two prompt-validation divergence notes are reworded for the new -32603 surface. docs/migration.md gains a section covering both behaviour changes.
When `AuthSettings.resource_server_url` is configured, `BearerAuthBackend`
previously ran the RFC 8707 audience comparison only for tokens whose
verifier populated `AccessToken.resource`: a token carrying no resource
indicator at all was accepted. The MCP authorization spec requires a
resource server to only accept tokens issued specifically for it, so the
gate now fails closed: a verified token with no `resource` is answered
`401 invalid_token` ("The access token carries no audience claim").
`resource_server_url=None` still means there is no audience to enforce.
For verifiers that validate the audience themselves and cannot surface
the claim (for example a JWT decoder configured with the expected
audience), the new `AuthSettings.verifier_validates_audience=True` opts
the gate out. The `AuthSettings.enforced_audience` property derives the
single value both server wirings pass to `BearerAuthBackend`, whose
signature is unchanged.
`RefreshToken` gains an optional `resource` field so an authorization
server provider can carry the original grant's audience binding through
`exchange_refresh_token`; without it every refreshed access token would
be audience-unbound and rejected by the hardened gate.
The docs tutorials and example servers now populate
`AccessToken.resource` (and the client-credentials demo token endpoint
honors the RFC 8707 `resource` parameter) so they pass the check they
teach. The migration guide entry for audience validation is rewritten
for the fail-closed behavior.
A client constructed without an elicitation_callback or list_roots_callback still answers the server's request, via a default callback that returns a JSON-RPC error. Both defaults used -32600 (invalid request). The spec assigns a specific code to each case: - elicitation/create: -32602 (invalid params). A client with no callback declares no elicitation modes, so every incoming request names an undeclared mode, which clients MUST answer with -32602. - roots/list: -32601 (method not found), the code clients SHOULD use when they do not support roots. The default sampling callback keeps -32600: the spec assigns no code to a client that does not support sampling. Error messages are unchanged. Update the affected tests and the interaction-requirement entries that recorded the old codes, fix the code named in the client callbacks doc page, and add a migration note.
Two fixes to the optional bundled OAuth authorization server (the
`auth_server_provider=` path).
The registration endpoint accepted any well-formed URL as a
`redirect_uris` entry: cleartext `http://` on a non-loopback host,
`javascript:`, `data:`, and URIs carrying a fragment all registered
successfully. The MCP authorization specification's Communication
Security section requires every redirect URI to be either localhost or
HTTPS, and OAuth 2.1 section 2.3 forbids a fragment component. Such an
entry is now rejected with `400 invalid_client_metadata`. Loopback is
exactly the three forms OAuth 2.1 section 8.4.2 names (`localhost`,
`127.0.0.1`, `[::1]`), on any port; query strings remain permitted.
This also rejects RFC 8252 private-use schemes such as
`com.example.app:/callback`: MCP restricts redirect URIs to HTTPS or
loopback, with no carve-out for native apps.
The rule lives on the request model: `RegistrationRequest`, until now a
dead alias of `OAuthClientMetadata`, becomes a real subclass with a
`redirect_uris` field validator, so a forbidden URI fails parsing and
takes the handler's existing `invalid_client_metadata` arm rather than
needing a post-parse check. `OAuthClientMetadata` itself is unchanged:
the client also serializes it when registering against third-party
authorization servers whose redirect-URI policies the SDK does not own.
The loopback host set moves to a single `LOOPBACK_HOSTS` constant in
`mcp.server.auth.provider`, shared with `validate_issuer_url`, which
previously inlined the same tuple.
Separately, the token endpoint now answers an authorization-code
exchange whose `redirect_uri` does not match the one used at
`/authorize` with `error=invalid_grant` instead of `invalid_request`.
RFC 6749 section 5.2 assigns this case to `invalid_grant` ("does not
match the redirection URI used in the authorization request"), and the
handler's other authorization-code failures already use it. The
exchange was already rejected with HTTP 400; only the `error` field
changes.
Update the affected tests and the interaction-requirement entries, and
add a migration note.
Closes #2629
Two changes to how the OAuth client uses discovered authorization-server metadata, both required by the MCP authorization specification. Verify PKCE support before the authorization-code grant. The spec's Authorization Code Protection section requires clients to verify PKCE support from the authorization server's metadata and to refuse to proceed when code_challenge_methods_supported is absent. The new validate_pkce_support() also refuses a method list that omits S256, since that is the only method this client sends. The check sits at the top of _perform_authorization_code_grant, so it covers both the initial 401 flow and the 403 insufficient_scope step-up, while grants that never issue an authorization code (client credentials, private key JWT) are unaffected. When no metadata document was discovered at all the flow proceeds as before: absence of a document is not evidence of non-support. Stop reading scopes_supported from authorization-server metadata when selecting a scope. The spec's scope-selection chain is the WWW-Authenticate scope parameter, then the protected-resource metadata's scopes_supported, otherwise omit the scope parameter. The SDK inserted an extra fallback to the authorization server's scopes_supported, which over-requests (an authorization server may serve many resource servers, so its list is a superset of any one resource's) and causes access_denied failures against servers that reject unknown scopes. With the fallback removed, clients that relied on it should pass an explicit scope on their OAuthClientMetadata. Closes #1307
A server that had already completed the initialize handshake on a connection answered a repeated initialize request on that same connection as a fresh handshake, silently overwriting the session's recorded client_params and negotiated protocol version. A check_capability call made after that point then answered against the second client's declared capabilities. The handshake now commits at most once per connection: a repeated initialize is answered with JSON-RPC error -32600 (INVALID_REQUEST, "Session already initialized") and the established session keeps serving. The check lives at the runner's request-dispatch boundary, right next to its mirror (the request-before-initialize gate), so every transport gets it without any transport-level body sniffing. The discriminator is client_params rather than initialize_accepted: the legacy stateless path builds a per-request connection that is born past the gate but has no peer info yet, and its one initialize must still be accepted. No compliant client is affected. The spec makes initialization the first interaction of a session, and ClientSession.initialize() is already idempotent (a repeat call returns the first result without sending anything). This only applies to the legacy (2025-11-25 and earlier) handshake; the 2026-07-28 protocol removes initialize entirely. Closes #2605
The never-called prompt body in the wrong-type-argument test proves it never ran by appending to a closure-captured list, but a plain append on an unreachable line would fail the 100% coverage gate. The append therefore rides on a `raise NotImplementedError` line, which coverage's `exclude_also` strips. That shape read as an accident; add a comment spelling out why it is written that way.
Client(..., strict_capabilities=True) rejects, before any request reaches the transport, a call to a method whose required server capability the connected server did not advertise -- for example list_resources() against a server that only advertised tools, or subscribe_resource() when the server's resources capability does not set subscribe. The rejection is an MCPError with code -32601 (METHOD_NOT_FOUND) and data set to the method, the same shape a compliant server returns for an unadvertised capability, so opting in changes where the rejection happens, not what callers catch. The default is False and unchanged: every request is sent and the server's answer is surfaced. This mirrors the TypeScript SDK's enforceStrictCapabilities option (also default-off). The same keyword-only parameter exists on ClientSession for low-level users; Client forwards it. The method-to-capability table lives in mcp_types.methods.SERVER_CAPABILITY_REQUIREMENTS next to the other per-method maps, with missing_server_capability() as its only evaluator, so the check in ClientSession.send_request is a single data-driven gate rather than a per-method condition, and the relationship is the same at every protocol version. Because the gate reads server_capabilities, a bare version pin (mode="2026-07-28" with no prior_discover=) would reject every gated method; that combination is refused at Client construction with a ValueError that names the fix. The interaction-requirements entry for the lifecycle capability rule is no longer marked untested: the new tests pin both the opt-in pre-wire rejection and the default send-and-surface behaviour.
`ServerSession.send_progress_notification` takes an explicit progress token
decoupled from the request it belongs to, so it can keep emitting progress
for a request that has already completed -- which the spec forbids
("Progress notifications MUST stop after completion"). The request-scoped
`report_progress` (and `Context.report_progress`) is the supported path: it
reports against the inbound request's own token, no-ops when the caller did
not ask for progress, and stops when the request completes. The deprecated
method keeps working and emits `MCPDeprecationWarning`.
The warning message deliberately departs from the "<X> is deprecated as of
<version>" pattern used by the spec-driven deprecations: this one is an SDK
API decision, not a spec retirement (2026-07-28 does not retire
server-to-client progress).
For "stops when the request completes" to hold on every dispatcher,
`_DirectDispatchContext` now closes with its request the way
`_JSONRPCDispatchContext` already did: `close()` runs in the dispatch
handler's `finally`, after which `progress`/`notify` deliver nothing,
`can_send_request` is False, and `send_raw_request` raises
`NoBackChannelError` -- the closed state the `DispatchContext` protocol
documents. Two pre-existing tests that asserted `can_send_request` on a
context captured after its handler returned now sample it in-handler, and
the closed-state contract tests are parametrized over both dispatchers.
The interaction test that covered both the server and client side of late
progress is split in two. The server-side property is proved positively on
the wire through `report_progress`; it no longer relies on a session-bound
standalone stream, so it also runs on the stateless streamable-http arm.
The client-side late-drop test keeps using the deprecated explicit-token
method -- the only API that can still produce a late notification -- under
`pytest.warns`, and its arms are unchanged. The migration guide stops
recommending the deprecated method anywhere and documents the replacement.
ElicitRequestedSchema was a TypeAlias for dict[str, Any]; it is now a
Pydantic model of the spec's restricted requested-schema subset, backed
by a new PrimitiveSchemaDefinition union (StringSchema, NumberSchema,
BooleanSchema, and the enum schemas). ServerSession.elicit_form (and the
deprecated elicit alias) and ClientPeer.elicit_form accept only this
model, so a nested-object property, an array-of-objects property, or an
anyOf union is unconstructible at the only place a server author
supplies a schema, rather than silently forwarded to the client.
The spec restricts form-mode requested schemas to flat objects with
primitive-typed properties only ("complex nested structures, arrays of
objects ... are intentionally not supported"). The high-level
Context.elicit / elicit_with_validation path is unchanged in behaviour:
it converts the rendered JSON Schema into the typed model, keeping its
existing per-field TypeError contract and producing value-identical wire
output.
Inbound is deliberately untouched. The wire field
ElicitRequestFormParams.requested_schema stays a plain dict[str, Any],
so older servers that emit anyOf for Optional form fields still reach
the client's elicitation callback. The typed-model-to-wire-dict
conversion lives in one place, ElicitRequestedSchema.to_wire(), which
both send sites call.
The schema family is extra="allow": keys schema.ts does not name (a
top-level title, pattern, exclusiveMinimum, json_schema_extra keys)
still round-trip, because the primitives-only restriction is carried by
the union members' required type literals, not by extra-key rejection.
Two interaction-suite divergence notes described their gaps as unenforced spec MUSTs without saying whether closing them was planned. Both are deliberate, and the notes now say so. transport:stdio:stream-purity -- stdio_server does not redirect sys.stdout, so a handler print() corrupts the protocol stream. No MCP SDK redirects stdout, and a redirect would only catch print(), not os.write(1, ...) or C extensions writing to file descriptor 1, so it would be a partial guard rather than a structural fix. docs/tutorial/logging.md already tells server authors to log to stderr and never print() in a stdio server, so no docs change was needed. protocol:progress:monotonic -- no MCP SDK validates sender-side progress monotonicity; the spec MUST is a contract on the handler author, not on the transport, and the test pins the unvalidated pass-through. No code or test behaviour changes.
0dec196 to
c2b3e8e
Compare
Review-feedback round on the conformance burn-down: - Cancelled requests no longer leave the legacy streamable-HTTP POST hanging. The dispatcher emits a RequestSettled marker when a handler is cancelled without producing a response; the transport consumes it by closing the per-request stream, so the POST's SSE stream terminates without a response frame and JSON-response mode completes with 204 No Content (the client treats 202/204 alike). Per-request streams are released instead of leaking until session teardown, and a handler that survives the cancellation still delivers its normal response. The marker is type-visible on the dispatcher write stream and is stripped by every serializing transport, so it can never appear on a wire. - A bearer token whose audience cannot be canonicalized (out-of-range or non-numeric port) is now rejected with the standard 401 invalid_token instead of raising through the auth middleware as a 500. - The bundled authorization server's /register now accepts only https redirect URIs or http on a loopback host; other schemes on loopback hosts (ftp, ws, javascript, custom) are rejected. - OAuth client scope selection falls back to the caller-configured OAuthClientMetadata.scope when neither the WWW-Authenticate challenge nor protected-resource metadata names scopes, matching the TypeScript SDK, so the documented migration path works as written. - The cross-dispatcher contract that handler-raised MCPError subclasses surface to callers as plain MCPError is now pinned by an explicit test and documented; rehydrate with from_error when the subclass matters. - Docs: migration notes for the bearer-challenge wire-shape changes and the cancellation wire spellings; story READMEs updated to the landed error contract; strict-capabilities doc corrected to state that resources/unsubscribe is gated by the base resources capability only.
There was a problem hiding this comment.
1 issue found across 46 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/mcp/shared/direct_dispatcher.py">
<violation number="1" location="src/mcp/shared/direct_dispatcher.py:111">
P3: This overstates the DirectDispatcher contract: notification handlers are not flattened to plain MCPError. Narrow the doc to request dispatch paths, or update notification handling separately.</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
Fix all with cubic | Re-trigger cubic
| handler. Notifications are fire-and-forget in both directions: after close | ||
| they are silently dropped. | ||
|
|
||
| Handler-raised `MCPError` subclasses flatten to plain `MCPError` with equal |
There was a problem hiding this comment.
P3: This overstates the DirectDispatcher contract: notification handlers are not flattened to plain MCPError. Narrow the doc to request dispatch paths, or update notification handling separately.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp/shared/direct_dispatcher.py, line 111:
<comment>This overstates the DirectDispatcher contract: notification handlers are not flattened to plain MCPError. Narrow the doc to request dispatch paths, or update notification handling separately.</comment>
<file context>
@@ -107,6 +107,11 @@ class DirectDispatcher:
handler. Notifications are fire-and-forget in both directions: after close
they are silently dropped.
+
+ Handler-raised `MCPError` subclasses flatten to plain `MCPError` with equal
+ `ErrorData` on every dispatch path, matching the wire, where subclass
+ identity cannot survive; callers needing the subclass rehydrate it from
</file context>
| if response.status_code in (202, 204): | ||
| # 202: notification/response accepted. 204: the server settled the | ||
| # request with no reply (peer-cancelled); nothing to deliver — the | ||
| # waiter was retired at cancel time. | ||
| logger.debug(f"Received {response.status_code}") | ||
| return | ||
|
|
There was a problem hiding this comment.
🔴 The client now silently swallows any 204 on a request POST (src/mcp/client/streamable_http.py:268-274) on the assumption the waiter was already retired at cancel time, but the server's JSON-response mode also sends 204 when the per-request stream closes for non-cancel reasons (session teardown mid-request, idle-timeout eviction, operator DELETE), and a non-SDK server may answer a request POST with 204 outright — in those cases the client's waiter is still pending and nothing is ever delivered, so with the default read_timeout_seconds=None the caller hangs forever where it previously got a fail-fast error. Consider only swallowing the 204 when the request id is no longer pending, and otherwise retiring the waiter with a synthesized error (a synthesized error for an already-retired cancel waiter is dropped harmlessly).
Extended reasoning...
The bug. _handle_post_request in src/mcp/client/streamable_http.py (lines 268–274) now treats 202 and 204 identically: it logs and returns, with the comment that for a 204 "the waiter was retired at cancel time." That justification only holds when this client initiated the cancellation (its send_raw_request finally retires the pending waiter before the courtesy notifications/cancelled goes out). A 204 has other producers for which the waiter is very much still pending, and for those the silent return means nothing — no response, no error — is ever delivered to the dispatcher for that request id.
The non-cancel 204 producers. On the SDK server side, the JSON-response-mode wait loop in src/mcp/server/streamable_http.py (~lines 564–587) sends 204 whenever the per-request stream closes without a JSONRPCResponse/JSONRPCError frame — and its own comment names two causes: "peer-cancelled … or the session tore down mid-request." terminate() (and connect()'s cleanup) closes the send side of every in-flight per-request stream without routing any reply, so a server-initiated session termination — StreamableHTTPSessionManager's session_idle_timeout evicting a session whose handler outlives the idle deadline, an operator/external DELETE on the session id, or session-manager shutdown — while a client request is in flight produces exactly this 204. Separately, a non-SDK server may (non-compliantly) answer a request POST with 204 directly. In none of these cases did the client cancel anything.
Why nothing else saves the caller. The dispatcher only fails pending waiters when the ClientSession itself closes; the client's read stream stays open after the swallowed 204, and session death is only discovered by later requests. Client and ClientSession both default read_timeout_seconds=None, so the original caller's send_request blocks indefinitely.
Pre-PR vs post-PR (the regression). Before this PR, the same server condition produced a 500 ("No response message received before stream closed"), which the client's ≥400 path mapped to an INTERNAL_ERROR JSONRPCError delivered to the waiter — fail fast. A non-SDK 204 to a request POST hit the unexpected-content-type/parse path and likewise failed the waiter. Both paths now regress from "caller gets an error" to "caller hangs forever." The new test coverage only exercises the cancel flow (where the waiter is already retired), so the regression is not pinned by the suite.
Step-by-step proof. (1) Server runs streamable HTTP with json_response=True and an idle timeout; client issues tools/call (id 1) whose handler runs longer than the idle deadline. (2) The session manager's idle deadline fires (it is only pushed forward by inbound requests), cancels the serve loop, and calls terminate(), which closes every per-request stream; the parked JSON wait loop exits with response_message=None and the POST completes 204. (3) The client's _handle_post_request sees 204, logs "Received 204", and returns — the waiter for id 1 was never retired because the client never cancelled. (4) Nothing else ever resolves that waiter; with the default read_timeout_seconds=None the caller's await client.call_tool(...) blocks forever. Pre-PR, step (2) yielded a 500 and step (3) delivered an INTERNAL_ERROR to the waiter immediately.
Mitigating factors. json_response=True is non-default; the default SSE mode already had an analogous silent-hang shape pre-PR (stream ends with no event id → task returns); and the mainline cancel flow handled by this PR is correct and requirement-pinned (client-transport:http:204-settled-exchange). So this is a narrow trigger — but it is a hang-vs-error regression introduced by this PR, and an indefinite hang is strictly worse than the prior error surface.
How to fix. Only swallow the 204 when the request id is no longer pending in the dispatcher (i.e. this client cancelled it), and otherwise retire the waiter by writing a synthesized JSONRPCError for message.id to read_stream_writer (e.g. INTERNAL_ERROR, "Request settled by server with no response"). In the legitimate cancel case the synthesized error is harmless — the waiter is already retired and late responses for unknown ids are dropped — so unconditionally synthesizing on 204 for request POSTs would also restore fail-fast without breaking the cancel flow.
| async def message_router(): | ||
| try: | ||
| async for session_message in write_stream_reader: # pragma: no branch | ||
| if isinstance(session_message, RequestSettled): | ||
| # The dispatcher ended this request with no reply (peer-cancelled). | ||
| # Close the per-POST stream's send side: the SSE writer / JSON wait | ||
| # loop finish their iteration and clean up in their own finally. | ||
| # Nothing is stored or sent on the wire. Keyed by str(id), matching | ||
| # the registration at POST time. | ||
| if (streams := self._request_streams.get(str(session_message.request_id))) is not None: | ||
| streams[0].close() | ||
| continue |
There was a problem hiding this comment.
🔴 With resumability (an event_store) enabled, a cancelled or timed-out request re-creates the hang/leak the new RequestSettled marker was added to fix: the priming event gives the client a last_event_id, so when the settled POST stream ends with no response it reconnects with Last-Event-ID, and _replay_events (having nothing to replay and nothing recorded for the settled id) re-registers a fresh _request_streams entry and parks forever — leaving one permanently open GET, a dead per-request stream on the server, a parked client transport task, and an extra priming row in the event store per cancellation. Pre-PR the stored code-0 'Request cancelled' error terminated both sides even with resumability, so this is a regression on that arm; a fix is to record a terminator (or drop the stream) in the event store for a settled id, have _replay_events end the GET instead of re-registering an id with no live request, and/or have the client skip reconnection when the only event seen was the priming event for an already-retired waiter.
Extended reasoning...
The bug. The new RequestSettled marker (emitted by the dispatcher when a peer cancellation interrupts a handler) is consumed in message_router (src/mcp/server/streamable_http.py:978-989): it closes the per-POST stream's send side and deliberately stores nothing in the event store ('Nothing is stored or sent on the wire'). That cleanly ends the exchange when no event store is configured — which is exactly the case the client's new no-reconnect early-out covers (src/mcp/client/streamable_http.py:383-388, whose own comment says 'No priming event (no event store) means no id'). With an event store configured, the same cancellation re-creates the hang the marker was added to fix, plus a per-cancellation server-side leak.
The code path. (1) A 2025-11-25+ client POSTs tools/call (id=N) to a server constructed with an event_store. _handle_post_request mints a priming event with an SSE id (_mint_priming_event, src/mcp/server/streamable_http.py:285-301) and _run_sse_writer sends it first; the client records it as last_event_id (_handle_sse_response, src/mcp/client/streamable_http.py:362-363; _handle_sse_event treats the empty-data priming event as non-complete). (2) The caller times out or scope-cancels; the dispatcher sends the courtesy notifications/cancelled, the handler is interrupted, and the dispatcher emits RequestSettled. message_router closes _request_streams[N][0], the SSE writer ends, its finally removes the _request_streams entry, and the POST exchange closes — no response frame, nothing written to the event store. (3) On the client, _handle_sse_response sees the stream end without a response, but last_event_id is not None, so the early-out does not apply: it calls _handle_reconnection, which re-GETs /mcp with Last-Event-ID. (4) On the server, _handle_get_request routes the header to _replay_events (src/mcp/server/streamable_http.py:854-917): replay_events_after finds nothing after the priming id, returns the stream id, which is no longer in _request_streams, so replay_sender re-registers it, mints and stores another priming event row, and parks forever in async for event_message in msg_reader — the dispatcher already settled that id, will never write a response for it, and message_router only closes per-request streams on a RequestSettled, which will not recur for that id. (5) The client's transport task likewise parks in the reconnected GET; if that connection ever drops on a normal stream end, _handle_reconnection recurses with attempt=0, so MAX_RECONNECTION_ATTEMPTS does not bound it and it polls indefinitely.
Why nothing prevents it. The client-side early-out is explicitly scoped to the no-event-store case; the RequestSettled marker is intentionally never persisted, so a resuming client has no way to learn the request settled; and _replay_events interprets 'stream id not in _request_streams' as 'resume a live stream' rather than 'this request is gone'. The PR's new cancellation tests (hosting:http:cancel-ends-post-sse-stream, cancel-json-mode-204, and the client-side 204 test) all run without an event store, so the resumability arm is untested.
Step-by-step proof. Server: StreamableHTTPSessionManager(..., event_store=InMemoryEventStore()), one tool that blocks on an anyio.Event. Client: streamable_http_client + Client, call the blocking tool inside a cancel scope and cancel it after the handler starts (the same shape as the PR's new cancel tests, just with an event store). Sequence: priming event id=E1 arrives → client sets last_event_id=E1 → cancel → courtesy notifications/cancelled → dispatcher emits RequestSettled(N) → message_router closes the per-POST stream, exchange ends → client sees no response but last_event_id=E1, so it GETs with Last-Event-ID: E1 → replay_events_after(E1) replays nothing, returns stream id N → not in _request_streams → re-register, store priming event E2, park in msg_reader forever. Inspecting the transport afterwards shows a re-registered _request_streams[N] entry and an open GET that never closes; the client task never returns. Repeat per cancellation/timeout — connections, tasks and event-store rows accumulate until whole-session teardown. Before this PR, step 'dispatcher emits RequestSettled' was instead 'dispatcher writes JSONRPCError(code=0, "Request cancelled")', which message_router stored in the event store and routed to the stream, terminating both sides cleanly even with resumability.
Impact. Per cancelled or timed-out request on a resumability-enabled server: one permanently open GET SSE connection, a re-registered dead _request_streams entry server-side (released only at session teardown), one permanently parked client transport task, and an extra priming-event row written to the event store on each reconnect. Ordinary client request timeouts trigger the courtesy cancel, so this is hit by routine timeout handling, not just explicit user cancellation, whenever the documented resumability feature is on.
How to fix. Any of: have message_router (or the dispatcher path) also drop/terminate the event-store stream for a settled request id so a resume has something definitive to replay; have _replay_events end the GET (instead of re-registering and parking) when the replayed stream id has no live request stream and no pending request; and/or have the client not reconnect when the only event ever received was the priming event for a request whose waiter has already been retired. Adding a resumability-enabled variant of the new cancellation interaction tests would lock the behaviour in.
The
tests/interaction/suite is requirement-tagged against the MCP specification. Wherethe SDK's behaviour fell short of the spec, the suite did not skip or xfail: it pinned the
actual behaviour and recorded the gap on the requirement entry. This PR is a burn-down of
those recorded gaps. It fixes fourteen of them in code (and deletes a fifteenth whose
behaviour had already been fixed but whose record had gone stale), narrows three more to
a precisely-stated residual, records two as deliberate, ecosystem-consistent choices, and
closes three long-standing upstream issues along the way. It spans the server's bearer-token
gate, the bundled OAuth authorization server, the OAuth client's use of discovered metadata,
the JSON-RPC error surface for cancellation / handler crashes / unhandled client callbacks,
the
initializehandshake, an opt-in client-side capability pre-check, the explicit-tokenprogress API, and the elicitation requested-schema type. For every behaviour change, the
existing requirement-tagged test flips from pinning the gap to pinning the spec behaviour, so
the suite stays the source of truth for what conforms and what does not.
Commits
68e4eb28Tighten requirements-catalog divergence notes to match current SDK behaviourbad42e28Emit RFC 6750 scope= in WWW-Authenticate and validate token audience401/403challenges now advertise the required scopes viascope=(which the SDK client already reads to drive step-up); a request with no credentials gets a bareBearerchallenge instead oferror="invalid_token"(RFC 6750 §3.1 says theerrorattribute SHOULD NOT appear when nothing was presented); and a token whose RFC 8707 audience names a different server is now rejected401 invalid_token.47639a2bStop replying to cancelled requests; map unhandled handler exceptions to -32603notifications/cancellednow gets no response, per the spec's SHOULD (it was answered withcode: 0). An unhandled exception in a request handler now produces JSON-RPC-32603(INTERNAL_ERROR) with an opaque message instead ofcode: 0carryingstr(exc)— handler internals no longer leak to the peer; the exception is still logged server-side.raise MCPError(...)for a specific code is unchanged.a1734460Reject bearer tokens that carry no audience claimresource_server_urlis configured the gate now fails closed (401 invalid_token).AuthSettings(verifier_validates_audience=True)opts out for verifiers that enforceaudthemselves and cannot surface the claim;RefreshToken.resourceis added so the audience binding survives a refresh. The SDK's own tutorials and example servers were teaching the bug and are fixed in the same commit.884badc9Use spec error codes for unhandled elicitation/create and roots/listelicitation_callback/list_roots_callback, the SDK still answers the server on its behalf. Those answers now use the codes the spec assigns —-32602forelicitation/create(a client with no callback declared no modes, and a request for an undeclared mode MUST be answered-32602) and-32601forroots/list— instead of-32600for both. Messages unchanged; sampling stays at-32600(the spec assigns no code there).24061fe9Reject non-HTTPS, non-loopback redirect URIs at client registration/registeraccepted any well-formed URL as aredirect_urisentry — cleartexthttp://on a non-loopback host,javascript:,data:, fragment-carrying URIs. It now rejects anything that is not HTTPS or a loopback host (localhost,127.0.0.1,[::1]) with400 invalid_client_metadata, and rejects fragments (OAuth 2.1 §2.3). Separately,/tokennow answers an authorization-code exchange whoseredirect_uridoes not match the one used at/authorizewitherror=invalid_grant(RFC 6749 §5.2) instead ofinvalid_request. Closes #2629.3eb352c8Align OAuth client with spec on PKCE verification and scope selectioncode_challenge_methods_supportedis absent or omitsS256, the flow raisesOAuthFlowErrorinstead of redirecting. (b) The client stops reading the authorization server'sscopes_supportedas a scope fallback — that list is a superset of any one resource's, so the SDK over-requested and triggered realaccess_deniedfailures. The spec's chain isWWW-Authenticate scope=→ PRMscopes_supported→ omit. Closes #1307.1e2bd9beReject a second initialize on an already-initialized sessioninitializeon a live connection was answered as a fresh handshake, silently overwriting the session's recordedclient_paramsand negotiated protocol version — so a latercheck_capabilityanswered against the second client's capabilities. It is now rejected-32600("Session already initialized") and the established session keeps serving. No compliant client is affected (ClientSession.initialize()is already idempotent). Legacy-only: 2026-07-28 removesinitialize. Closes #2605.8f60ebe5Make the prompt test's opaque coverage workaround explicit28f500c7Add an opt-in strict_capabilities flag to Client and ClientSessionClient(..., strict_capabilities=True)rejects, before the request reaches the transport, a call to a method whose required server capability the connected server did not advertise — e.g.list_resources()against a server that only advertisedtools, orsubscribe_resource()when the server'sresourcescapability does not setsubscribe. The rejection is anMCPErrorwith-32601(METHOD_NOT_FOUND), the same code a compliant server returns for an unadvertised capability, so opting in changes where the rejection happens, not what callers catch. Mirrors the TypeScript SDK'senforceStrictCapabilities(also default-off); the same keyword-only parameter exists onClientSession. Because the gate reads the negotiated server capabilities, combining it with a bare version pin (where the client never learns them) is refused at construction with aValueErrorthat names the fix. The method-to-capability table is exported asmcp_types.methods.SERVER_CAPABILITY_REQUIREMENTS, and the lifecycle capability-rule requirement entry is no longer marked untested.eddfa29dDeprecate ServerSession.send_progress_notificationsend_progress_notificationtakes an explicit progress token decoupled from any request's lifetime, so it can keep emitting progress for a request that has already completed — which the spec forbids ("Progress notifications MUST stop after completion"). It is deprecated in favour of the request-scopedContext.report_progress()/ServerSession.report_progress(), which reports against the inbound request's own token, no-ops when the caller did not ask for progress, and is closed with the request. The deprecated method keeps working and emitsMCPDeprecationWarning. To make "stops when the request completes" hold on every dispatcher, the in-process direct dispatcher now closes its context with the request the way the JSON-RPC one already did.3605ec09Type the elicitation requested schema on the send sideElicitRequestedSchemawas aTypeAliasfordict[str, Any]; it is now a Pydantic model of the spec's restricted requested-schema subset, backed by a newPrimitiveSchemaDefinitionunion.ServerSession.elicit_form()(and the deprecatedelicit()alias) andClientPeer.elicit_form()accept only this model, so a nested-object property, an array-of-objects property, or ananyOfunion is unconstructible at the only place a server author supplies a schema, rather than silently forwarded to the client — the spec restricts form-mode requested schemas to flat objects with primitive-typed properties ("complex nested structures, arrays of objects … are intentionally not supported"). The high-levelContext.elicit()/elicit_with_validation()path is unchanged. Inbound is deliberately untouched:ElicitRequestFormParams.requested_schemastays a plaindict[str, Any]on the wire, so older servers that emitanyOfforOptionalform fields still reach the client's elicitation callback.c2b3e8eeRecord two divergences as intentional, ecosystem-consistent choicesstdio_serverdoes not redirectsys.stdout, so a handlerprint()corrupts the protocol stream — no MCP SDK redirects stdout, and a redirect would only catchprint(), notos.write(1, ...)or C extensions writing to fd 1, so it would be a partial guard rather than a structural fix; the logging tutorial already tells stdio server authors to log to stderr. (2) No MCP SDK validates sender-side progress monotonicity; that MUST is a contract on the handler author, not on the transport, and the test pins the unvalidated pass-through.Motivation
These are all places where the SDK was provably out of step with normative spec text (or an
RFC the spec defers to). Three of them are reported user-facing bugs (#2629, #1307, #2605);
the auth ones matter most — a resource server that accepts audience-unbound tokens, or a
built-in authorization server that registers
javascript:redirect URIs, is the kind ofdefault a user has no way to notice. Working through them as one burn-down keeps the
requirement manifest honest: a gap that is fixed loses its record, a gap that is partially
closed gets a record stating exactly the residual, and a gap that is a deliberate choice
says so — so what remains recorded is real and current.
Tested
./scripts/test(the full suite under the 100% line+branch coverage gate plusstrict-no-cover) is green at HEAD: 4303 passed, 7 skipped, 1 xfailed (the onexfail is pre-existing and unrelated — a Pydantic
AnyUrltrailing-slash quirk intests/client/test_auth.py), coverage 100.00%.pyrightreports 0 errors;ruff check/ruff formatare clean; the new-suppression audit(
git diff ... | grep -E '^\+.*(pragma|type: ignore|noqa)') is empty. Every behaviourchange flips an existing requirement-tagged interaction test rather than only adding a new
one, so the old behaviour cannot silently come back. New tests cover the fail-closed
audience gate (including the
verifier_validates_audienceopt-out and the refresh chain),the loopback/fragment matrix at
/register, the PKCE-refusal arms (absent field, listwithout
S256, no-metadata leniency), the double-initializerejection on stdio, statefulHTTP, and stateless HTTP, the
strict_capabilitiespre-wire rejection alongside thedefault send-and-surface behaviour, the request-scoped progress close on both dispatchers,
and the typed
ElicitRequestedSchemaconstruction/rejection matrix and itsto_wire()round-trip.
Breaking changes
All documented in
docs/migration.md(one section each):that carry no resource indicator at all. Populate
AccessToken.resourcein yourTokenVerifier(recommended; the examples now show it), or setAuthSettings(verifier_validates_audience=True)if your verifier already enforcesaudand cannot surface it.
resource_server_url=Nonestill disables the check./registerrejectsnon-HTTPS, non-loopback, and fragment-carrying redirect URIs with
invalid_client_metadata(this also rejects RFC 8252 private-use schemes such as
com.example.app:/callback; MCPallows only HTTPS or loopback);
/tokenanswers a redirect-URI mismatch withinvalid_grantinstead ofinvalid_request.OAuthFlowErrorinstead of proceeding. No SDK-side opt-out: an authorization server thatsupports S256 but omits the field needs its published metadata fixed (RFC 8414 §2).
scopes_supportedfrom authorization-server metadata tochoose a scope — if that was your only scope source, the client now omits
scopeentirely; pass an explicit
scopeonOAuthClientMetadatato keep the old request.-32603; cancelled requests get no reply — codethat matched on the previous
code: 0responses must update; raiseMCPErrorfor aspecific code, and note the exception text is now logged, not sent.
elicitation/createreturns-32602; unhandledroots/listreturns-32601— server code that branched onerror.code == -32600to detect a clientwithout these should switch codes, or (better) check the client's declared capabilities
before sending.
initializeon an already-initialized session is rejected with-32600("Session already initialized"). A peer that needs a fresh handshake opens a new
connection (on streamable HTTP, a
POSTwithoutMcp-Session-Id).ServerSession.elicit_form()(and the deprecatedelicit()alias) andClientPeer.elicit_form()now take a typedmcp_types.ElicitRequestedSchema, not anarbitrary
dict[str, Any]—ElicitRequestedSchemais the model now, no longer aTypeAliasfordict[str, Any]. Build it directly(
ElicitRequestedSchema(properties={"x": StringSchema(type="string")}, required=["x"]))or validate an existing JSON Schema dict with
ElicitRequestedSchema.model_validate(...).Schemas with nested-object properties, array-of-objects properties, or
anyOfunions arerejected at construction instead of being forwarded. The high-level
Context.elicit()/elicit_with_validation()path and the inbound/wire representation are unchanged.ServerSession.send_progress_notification()is deprecated in favour of therequest-scoped
Context.report_progress()/ServerSession.report_progress(). Themethod still works, but every call now emits
mcp.MCPDeprecationWarning— so test suitesthat run with
filterwarnings = ["error"](or-W error) will start failing on it.report_progressuses the inbound request's own token and cannot emit progress after thatrequest has completed, which is what the spec requires.
Not a breaking change, but called out for completeness:
Client(..., strict_capabilities=)(and the same keyword on
ClientSession) is additive and opt-in. The default isFalseand the default behaviour is byte-for-byte unchanged — every request is still sent and the
server's answer is still surfaced.
Closes
Closes #2629
Closes #1307
Closes #2605
AI Disclaimer