Work through the spec-conformance gaps recorded by the interaction suite by maxisbey · Pull Request #2999 · modelcontextprotocol/python-sdk

maxisbey · 2026-06-26T14:26:05Z

The tests/interaction/ suite is requirement-tagged against the MCP specification. Where
the SDK's behaviour fell short of the spec, the suite did not skip or xfail: it pinned the
actual behaviour and recorded the gap on the requirement entry. This PR is a burn-down of
those recorded gaps. It fixes fourteen of them in code (and deletes a fifteenth whose
behaviour had already been fixed but whose record had gone stale), narrows three more to
a precisely-stated residual, records two as deliberate, ecosystem-consistent choices, and
closes three long-standing upstream issues along the way. It spans the server's bearer-token
gate, the bundled OAuth authorization server, the OAuth client's use of discovered metadata,
the JSON-RPC error surface for cancellation / handler crashes / unhandled client callbacks,
the initialize handshake, an opt-in client-side capability pre-check, the explicit-token
progress API, and the elicitation requested-schema type. For every behaviour change, the
existing requirement-tagged test flips from pinning the gap to pinning the spec behaviour, so
the suite stays the source of truth for what conforms and what does not.

Commits

Commit / subject	What it closes, by behaviour
`68e4eb28` Tighten requirements-catalog divergence notes to match current SDK behaviour	Bookkeeping only — no runtime change. Corrects four requirement entries whose recorded-gap notes had gone stale (one behaviour was already fixed; three were re-scoped or re-classified as intentional, cross-SDK-consistent choices).
`bad42e28` Emit RFC 6750 scope= in WWW-Authenticate and validate token audience	The resource server's `401`/`403` challenges now advertise the required scopes via `scope=` (which the SDK client already reads to drive step-up); a request with no credentials gets a bare `Bearer` challenge instead of `error="invalid_token"` (RFC 6750 §3.1 says the `error` attribute SHOULD NOT appear when nothing was presented); and a token whose RFC 8707 audience names a different server is now rejected `401 invalid_token`.
`47639a2b` Stop replying to cancelled requests; map unhandled handler exceptions to -32603	A request cancelled via `notifications/cancelled` now gets no response, per the spec's SHOULD (it was answered with `code: 0`). An unhandled exception in a request handler now produces JSON-RPC `-32603` (`INTERNAL_ERROR`) with an opaque message instead of `code: 0` carrying `str(exc)` — handler internals no longer leak to the peer; the exception is still logged server-side. `raise MCPError(...)` for a specific code is unchanged.
`a1734460` Reject bearer tokens that carry no audience claim	Closes the hole the previous commit left: a verified token with no resource indicator at all was still accepted. When `resource_server_url` is configured the gate now fails closed (`401 invalid_token`). `AuthSettings(verifier_validates_audience=True)` opts out for verifiers that enforce `aud` themselves and cannot surface the claim; `RefreshToken.resource` is added so the audience binding survives a refresh. The SDK's own tutorials and example servers were teaching the bug and are fixed in the same commit.
`884badc9` Use spec error codes for unhandled elicitation/create and roots/list	When a client has no `elicitation_callback` / `list_roots_callback`, the SDK still answers the server on its behalf. Those answers now use the codes the spec assigns — `-32602` for `elicitation/create` (a client with no callback declared no modes, and a request for an undeclared mode MUST be answered `-32602`) and `-32601` for `roots/list` — instead of `-32600` for both. Messages unchanged; sampling stays at `-32600` (the spec assigns no code there).
`24061fe9` Reject non-HTTPS, non-loopback redirect URIs at client registration	The bundled authorization server's `/register` accepted any well-formed URL as a `redirect_uris` entry — cleartext `http://` on a non-loopback host, `javascript:`, `data:`, fragment-carrying URIs. It now rejects anything that is not HTTPS or a loopback host (`localhost`, `127.0.0.1`, `[::1]`) with `400 invalid_client_metadata`, and rejects fragments (OAuth 2.1 §2.3). Separately, `/token` now answers an authorization-code exchange whose `redirect_uri` does not match the one used at `/authorize` with `error=invalid_grant` (RFC 6749 §5.2) instead of `invalid_request`. Closes #2629.
`3eb352c8` Align OAuth client with spec on PKCE verification and scope selection	(a) The client now verifies PKCE support before starting an authorization-code grant, as the spec's Authorization Code Protection section requires: if AS metadata was discovered and its `code_challenge_methods_supported` is absent or omits `S256`, the flow raises `OAuthFlowError` instead of redirecting. (b) The client stops reading the authorization server's `scopes_supported` as a scope fallback — that list is a superset of any one resource's, so the SDK over-requested and triggered real `access_denied` failures. The spec's chain is `WWW-Authenticate scope=` → PRM `scopes_supported` → omit. Closes #1307.
`1e2bd9be` Reject a second initialize on an already-initialized session	A repeated `initialize` on a live connection was answered as a fresh handshake, silently overwriting the session's recorded `client_params` and negotiated protocol version — so a later `check_capability` answered against the second client's capabilities. It is now rejected `-32600` ("Session already initialized") and the established session keeps serving. No compliant client is affected (`ClientSession.initialize()` is already idempotent). Legacy-only: 2026-07-28 removes `initialize`. Closes #2605.
`8f60ebe5` Make the prompt test's opaque coverage workaround explicit	Test-only: a two-line comment explaining a deliberately non-obvious shape in one prompt test, so the next reader does not "fix" it into a coverage violation.
`28f500c7` Add an opt-in strict_capabilities flag to Client and ClientSession	Additive, default-off. `Client(..., strict_capabilities=True)` rejects, before the request reaches the transport, a call to a method whose required server capability the connected server did not advertise — e.g. `list_resources()` against a server that only advertised `tools`, or `subscribe_resource()` when the server's `resources` capability does not set `subscribe`. The rejection is an `MCPError` with `-32601` (`METHOD_NOT_FOUND`), the same code a compliant server returns for an unadvertised capability, so opting in changes where the rejection happens, not what callers catch. Mirrors the TypeScript SDK's `enforceStrictCapabilities` (also default-off); the same keyword-only parameter exists on `ClientSession`. Because the gate reads the negotiated server capabilities, combining it with a bare version pin (where the client never learns them) is refused at construction with a `ValueError` that names the fix. The method-to-capability table is exported as `mcp_types.methods.SERVER_CAPABILITY_REQUIREMENTS`, and the lifecycle capability-rule requirement entry is no longer marked untested.
`eddfa29d` Deprecate ServerSession.send_progress_notification	`send_progress_notification` takes an explicit progress token decoupled from any request's lifetime, so it can keep emitting progress for a request that has already completed — which the spec forbids ("Progress notifications MUST stop after completion"). It is deprecated in favour of the request-scoped `Context.report_progress()` / `ServerSession.report_progress()`, which reports against the inbound request's own token, no-ops when the caller did not ask for progress, and is closed with the request. The deprecated method keeps working and emits `MCPDeprecationWarning`. To make "stops when the request completes" hold on every dispatcher, the in-process direct dispatcher now closes its context with the request the way the JSON-RPC one already did.
`3605ec09` Type the elicitation requested schema on the send side	`ElicitRequestedSchema` was a `TypeAlias` for `dict[str, Any]`; it is now a Pydantic model of the spec's restricted requested-schema subset, backed by a new `PrimitiveSchemaDefinition` union. `ServerSession.elicit_form()` (and the deprecated `elicit()` alias) and `ClientPeer.elicit_form()` accept only this model, so a nested-object property, an array-of-objects property, or an `anyOf` union is unconstructible at the only place a server author supplies a schema, rather than silently forwarded to the client — the spec restricts form-mode requested schemas to flat objects with primitive-typed properties ("complex nested structures, arrays of objects … are intentionally not supported"). The high-level `Context.elicit()` / `elicit_with_validation()` path is unchanged. Inbound is deliberately untouched: `ElicitRequestFormParams.requested_schema` stays a plain `dict[str, Any]` on the wire, so older servers that emit `anyOf` for `Optional` form fields still reach the client's elicitation callback.
`c2b3e8ee` Record two divergences as intentional, ecosystem-consistent choices	Bookkeeping only — no code or test behaviour change. Two interaction-suite divergence notes described their gaps as unenforced spec MUSTs without saying whether closing them was planned; both are deliberate and the notes now say so. (1) `stdio_server` does not redirect `sys.stdout`, so a handler `print()` corrupts the protocol stream — no MCP SDK redirects stdout, and a redirect would only catch `print()`, not `os.write(1, ...)` or C extensions writing to fd 1, so it would be a partial guard rather than a structural fix; the logging tutorial already tells stdio server authors to log to stderr. (2) No MCP SDK validates sender-side progress monotonicity; that MUST is a contract on the handler author, not on the transport, and the test pins the unvalidated pass-through.

Motivation

These are all places where the SDK was provably out of step with normative spec text (or an
RFC the spec defers to). Three of them are reported user-facing bugs (#2629, #1307, #2605);
the auth ones matter most — a resource server that accepts audience-unbound tokens, or a
built-in authorization server that registers javascript: redirect URIs, is the kind of
default a user has no way to notice. Working through them as one burn-down keeps the
requirement manifest honest: a gap that is fixed loses its record, a gap that is partially
closed gets a record stating exactly the residual, and a gap that is a deliberate choice
says so — so what remains recorded is real and current.

Tested

./scripts/test (the full suite under the 100% line+branch coverage gate plus
strict-no-cover) is green at HEAD: 4303 passed, 7 skipped, 1 xfailed (the one
xfail is pre-existing and unrelated — a Pydantic AnyUrl trailing-slash quirk in
tests/client/test_auth.py), coverage 100.00%. pyright reports 0 errors;
ruff check / ruff format are clean; the new-suppression audit
(git diff ... | grep -E '^\+.*(pragma|type: ignore|noqa)') is empty. Every behaviour
change flips an existing requirement-tagged interaction test rather than only adding a new
one, so the old behaviour cannot silently come back. New tests cover the fail-closed
audience gate (including the verifier_validates_audience opt-out and the refresh chain),
the loopback/fragment matrix at /register, the PKCE-refusal arms (absent field, list
without S256, no-metadata leniency), the double-initialize rejection on stdio, stateful
HTTP, and stateless HTTP, the strict_capabilities pre-wire rejection alongside the
default send-and-surface behaviour, the request-scoped progress close on both dispatchers,
and the typed ElicitRequestedSchema construction/rejection matrix and its to_wire()
round-trip.

Breaking changes

All documented in docs/migration.md (one section each):

Bearer tokens are rejected unless their audience names this server — including tokens
that carry no resource indicator at all. Populate AccessToken.resource in your
TokenVerifier (recommended; the examples now show it), or set
AuthSettings(verifier_validates_audience=True) if your verifier already enforces aud
and cannot surface it. resource_server_url=None still disables the check.
Bundled authorization server: RFC-correct redirect-URI handling — /register rejects
non-HTTPS, non-loopback, and fragment-carrying redirect URIs with invalid_client_metadata
(this also rejects RFC 8252 private-use schemes such as com.example.app:/callback; MCP
allows only HTTPS or loopback); /token answers a redirect-URI mismatch with
invalid_grant instead of invalid_request.
OAuth client refuses to authorize when AS metadata does not advertise S256 PKCE —
OAuthFlowError instead of proceeding. No SDK-side opt-out: an authorization server that
supports S256 but omits the field needs its published metadata fixed (RFC 8414 §2).
OAuth client no longer reads scopes_supported from authorization-server metadata to
choose a scope — if that was your only scope source, the client now omits scope
entirely; pass an explicit scope on OAuthClientMetadata to keep the old request.
Unhandled handler exceptions return -32603; cancelled requests get no reply — code
that matched on the previous code: 0 responses must update; raise MCPError for a
specific code, and note the exception text is now logged, not sent.
Unhandled elicitation/create returns -32602; unhandled roots/list returns
-32601 — server code that branched on error.code == -32600 to detect a client
without these should switch codes, or (better) check the client's declared capabilities
before sending.
A second initialize on an already-initialized session is rejected with -32600
("Session already initialized"). A peer that needs a fresh handshake opens a new
connection (on streamable HTTP, a POST without Mcp-Session-Id).
ServerSession.elicit_form() (and the deprecated elicit() alias) and
ClientPeer.elicit_form() now take a typed mcp_types.ElicitRequestedSchema, not an
arbitrary dict[str, Any] — ElicitRequestedSchema is the model now, no longer a
TypeAlias for dict[str, Any]. Build it directly
(ElicitRequestedSchema(properties={"x": StringSchema(type="string")}, required=["x"]))
or validate an existing JSON Schema dict with ElicitRequestedSchema.model_validate(...).
Schemas with nested-object properties, array-of-objects properties, or anyOf unions are
rejected at construction instead of being forwarded. The high-level Context.elicit() /
elicit_with_validation() path and the inbound/wire representation are unchanged.
ServerSession.send_progress_notification() is deprecated in favour of the
request-scoped Context.report_progress() / ServerSession.report_progress(). The
method still works, but every call now emits mcp.MCPDeprecationWarning — so test suites
that run with filterwarnings = ["error"] (or -W error) will start failing on it.
report_progress uses the inbound request's own token and cannot emit progress after that
request has completed, which is what the spec requires.

Not a breaking change, but called out for completeness: Client(..., strict_capabilities=)
(and the same keyword on ClientSession) is additive and opt-in. The default is False
and the default behaviour is byte-for-byte unchanged — every request is still sent and the
server's answer is still surfaced.

Closes

Closes #2629
Closes #1307
Closes #2605

_{AI Disclaimer}

cubic-dev-ai

2 issues found across 29 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/mcp/shared/auth_utils.py">

<violation number="1" location="src/mcp/shared/auth_utils.py:32">
P2: Audience canonicalization can throw `ValueError` on malformed resource ports, breaking auth flow with a server error.</violation>

<violation number="2" location="src/mcp/shared/auth_utils.py:83">
P2: Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like `/api` and `/api/` are treated as equivalent.</violation>
</file>

_{Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic}

cubic-dev-ai · 2026-06-26T15:47:27Z

+    origin is NOT for this server. Contrast check_resource_allowed, which is the
+    client-side hierarchical question and intentionally more permissive.
+    """
+    return resource_url_from_server_url(token_resource).rstrip("/") == resource_url_from_server_url(


P2: Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like /api and /api/ are treated as equivalent.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp/shared/auth_utils.py, line 83: <comment>Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like `/api` and `/api/` are treated as equivalent.</comment> <file context> @@ -65,6 +72,19 @@ def check_resource_allowed(requested_resource: str, configured_resource: str) -> + origin is NOT for this server. Contrast check_resource_allowed, which is the + client-side hierarchical question and intentionally more permissive. + """ + return resource_url_from_server_url(token_resource).rstrip("/") == resource_url_from_server_url( + server_resource + ).rstrip("/") </file context>

You are right at the RFC 3986 layer — strict equivalence distinguishes /api/ from /api. But RFC 8707 defines no comparison algorithm, and its own examples use trailing-slash resource URIs. Two things make the normalization load-bearing: pydantic's AnyHttpUrl renders a root resource as https://h/ while the spec's example token request sends the slashless form — strict comparison would 401 every conformant client of a root-path deployment. And the compared value is the verifier-asserted audience from the validated token, not raw client input; exploiting the leniency would need two distinct resource servers at one origin differing only by a trailing slash, which a Starlette mount cannot host. Keeping the rstrip; c53aefd adds a comment plus tests pinning the equivalence and the sibling/child-path rejections.

_{AI Disclaimer}

The parent comment was wrong here: this audience comparison is intentionally load-bearing for RFC 8707 resource validation. The root-path and verifier-asserted-audience cases make the trailing-slash normalization necessary, so keeping the rstrip('/') is correct.

cubic-dev-ai

1 issue found across 36 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/mcp/shared/auth_utils.py">

<violation number="1" location="src/mcp/shared/auth_utils.py:83">
P2: Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like `/api` and `/api/` are treated as equivalent.</violation>
</file>

_{Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic}

…haviour - resources:annotations: drop the stale lastModified divergence; the model now carries the field and it round-trips. - lifecycle:capability:client-not-declared: narrow to the one remaining gap (the deprecated send_roots_list_changed path); the handler half is correct by construction. - lifecycle:pre-initialization-ordering: mark removed_in=2026-07-28; the initialize handshake is gone in the new spec. - client-transport:http:session-404-reinitialize: reword as an intentional cross-SDK choice (404 surfaced to caller), not a missed MUST. - test_resources: drop the now-redundant lastModified xfail.

BearerAuthBackend / RequireAuthMiddleware now produce spec-conformant challenges and reject tokens issued for a different resource server. - A request with no credentials gets a bare `Bearer` challenge (with scope/resource_metadata only), not error="invalid_token" -- RFC 6750 Section 3.1 says the error attribute SHOULD NOT appear when no authentication information was presented. - A malformed/unknown token, an expired token, or a token whose audience does not match the configured resource_server_url is answered 401 invalid_token with a specific error_description, carried via a new InvalidTokenUser marker so the middleware can distinguish it from no-credentials. - All challenges (401 and the 403 insufficient_scope path) now advertise the required scopes in a `scope=` parameter, which the SDK client already reads to drive step-up. - New check_token_audience() helper canonicalises default ports before comparing, and is wired through both the lowlevel and MCPServer Starlette stacks via the auth settings' resource_server_url. Docs and migration guide updated; the corresponding interaction-suite divergence entries are now closed.

… to -32603 A request cancelled via notifications/cancelled now gets no response: when the handler scope's cancel is caught, JSONRPCDispatcher returns instead of writing an error. The sender retired its own waiter when it cancelled, so no reply is needed to unblock it. An unhandled exception in a request handler now produces JSON-RPC error -32603 (INTERNAL_ERROR) with the opaque message "Internal server error" instead of code 0 carrying str(exc). The exception is still logged server-side. To send a specific code/message, raise MCPError; pydantic ValidationError still maps to INVALID_PARAMS. handler_exception_to_error_data is now total: it returns MappedError(error: ErrorData, unexpected: bool) for every Exception. Callers gate logger.exception / raise_handler_exceptions on the unexpected flag rather than re-deriving the rung set, so a handler that deliberately raises MCPError(code=INTERNAL_ERROR) is not treated as a crash. JSONRPCDispatcher, DirectDispatcher, and the modern HTTP entry's _to_jsonrpc_response all call the one helper; DirectDispatcher no longer hand-rolls its own ladder. The interaction suite's protocol:cancel:in-flight and protocol:error:internal-error requirements drop their divergence entries and modern-error-surface arm exclusions; the two prompt-validation divergence notes are reworded for the new -32603 surface. docs/migration.md gains a section covering both behaviour changes.

When `AuthSettings.resource_server_url` is configured, `BearerAuthBackend` previously ran the RFC 8707 audience comparison only for tokens whose verifier populated `AccessToken.resource`: a token carrying no resource indicator at all was accepted. The MCP authorization spec requires a resource server to only accept tokens issued specifically for it, so the gate now fails closed: a verified token with no `resource` is answered `401 invalid_token` ("The access token carries no audience claim"). `resource_server_url=None` still means there is no audience to enforce. For verifiers that validate the audience themselves and cannot surface the claim (for example a JWT decoder configured with the expected audience), the new `AuthSettings.verifier_validates_audience=True` opts the gate out. The `AuthSettings.enforced_audience` property derives the single value both server wirings pass to `BearerAuthBackend`, whose signature is unchanged. `RefreshToken` gains an optional `resource` field so an authorization server provider can carry the original grant's audience binding through `exchange_refresh_token`; without it every refreshed access token would be audience-unbound and rejected by the hardened gate. The docs tutorials and example servers now populate `AccessToken.resource` (and the client-credentials demo token endpoint honors the RFC 8707 `resource` parameter) so they pass the check they teach. The migration guide entry for audience validation is rewritten for the fail-closed behavior.

A client constructed without an elicitation_callback or list_roots_callback still answers the server's request, via a default callback that returns a JSON-RPC error. Both defaults used -32600 (invalid request). The spec assigns a specific code to each case: - elicitation/create: -32602 (invalid params). A client with no callback declares no elicitation modes, so every incoming request names an undeclared mode, which clients MUST answer with -32602. - roots/list: -32601 (method not found), the code clients SHOULD use when they do not support roots. The default sampling callback keeps -32600: the spec assigns no code to a client that does not support sampling. Error messages are unchanged. Update the affected tests and the interaction-requirement entries that recorded the old codes, fix the code named in the client callbacks doc page, and add a migration note.

Two fixes to the optional bundled OAuth authorization server (the `auth_server_provider=` path). The registration endpoint accepted any well-formed URL as a `redirect_uris` entry: cleartext `http://` on a non-loopback host, `javascript:`, `data:`, and URIs carrying a fragment all registered successfully. The MCP authorization specification's Communication Security section requires every redirect URI to be either localhost or HTTPS, and OAuth 2.1 section 2.3 forbids a fragment component. Such an entry is now rejected with `400 invalid_client_metadata`. Loopback is exactly the three forms OAuth 2.1 section 8.4.2 names (`localhost`, `127.0.0.1`, `[::1]`), on any port; query strings remain permitted. This also rejects RFC 8252 private-use schemes such as `com.example.app:/callback`: MCP restricts redirect URIs to HTTPS or loopback, with no carve-out for native apps. The rule lives on the request model: `RegistrationRequest`, until now a dead alias of `OAuthClientMetadata`, becomes a real subclass with a `redirect_uris` field validator, so a forbidden URI fails parsing and takes the handler's existing `invalid_client_metadata` arm rather than needing a post-parse check. `OAuthClientMetadata` itself is unchanged: the client also serializes it when registering against third-party authorization servers whose redirect-URI policies the SDK does not own. The loopback host set moves to a single `LOOPBACK_HOSTS` constant in `mcp.server.auth.provider`, shared with `validate_issuer_url`, which previously inlined the same tuple. Separately, the token endpoint now answers an authorization-code exchange whose `redirect_uri` does not match the one used at `/authorize` with `error=invalid_grant` instead of `invalid_request`. RFC 6749 section 5.2 assigns this case to `invalid_grant` ("does not match the redirection URI used in the authorization request"), and the handler's other authorization-code failures already use it. The exchange was already rejected with HTTP 400; only the `error` field changes. Update the affected tests and the interaction-requirement entries, and add a migration note. Closes #2629

Two changes to how the OAuth client uses discovered authorization-server metadata, both required by the MCP authorization specification. Verify PKCE support before the authorization-code grant. The spec's Authorization Code Protection section requires clients to verify PKCE support from the authorization server's metadata and to refuse to proceed when code_challenge_methods_supported is absent. The new validate_pkce_support() also refuses a method list that omits S256, since that is the only method this client sends. The check sits at the top of _perform_authorization_code_grant, so it covers both the initial 401 flow and the 403 insufficient_scope step-up, while grants that never issue an authorization code (client credentials, private key JWT) are unaffected. When no metadata document was discovered at all the flow proceeds as before: absence of a document is not evidence of non-support. Stop reading scopes_supported from authorization-server metadata when selecting a scope. The spec's scope-selection chain is the WWW-Authenticate scope parameter, then the protected-resource metadata's scopes_supported, otherwise omit the scope parameter. The SDK inserted an extra fallback to the authorization server's scopes_supported, which over-requests (an authorization server may serve many resource servers, so its list is a superset of any one resource's) and causes access_denied failures against servers that reject unknown scopes. With the fallback removed, clients that relied on it should pass an explicit scope on their OAuthClientMetadata. Closes #1307

A server that had already completed the initialize handshake on a connection answered a repeated initialize request on that same connection as a fresh handshake, silently overwriting the session's recorded client_params and negotiated protocol version. A check_capability call made after that point then answered against the second client's declared capabilities. The handshake now commits at most once per connection: a repeated initialize is answered with JSON-RPC error -32600 (INVALID_REQUEST, "Session already initialized") and the established session keeps serving. The check lives at the runner's request-dispatch boundary, right next to its mirror (the request-before-initialize gate), so every transport gets it without any transport-level body sniffing. The discriminator is client_params rather than initialize_accepted: the legacy stateless path builds a per-request connection that is born past the gate but has no peer info yet, and its one initialize must still be accepted. No compliant client is affected. The spec makes initialization the first interaction of a session, and ClientSession.initialize() is already idempotent (a repeat call returns the first result without sending anything). This only applies to the legacy (2025-11-25 and earlier) handshake; the 2026-07-28 protocol removes initialize entirely. Closes #2605

The never-called prompt body in the wrong-type-argument test proves it never ran by appending to a closure-captured list, but a plain append on an unreachable line would fail the 100% coverage gate. The append therefore rides on a `raise NotImplementedError` line, which coverage's `exclude_also` strips. That shape read as an accident; add a comment spelling out why it is written that way.

Client(..., strict_capabilities=True) rejects, before any request reaches the transport, a call to a method whose required server capability the connected server did not advertise -- for example list_resources() against a server that only advertised tools, or subscribe_resource() when the server's resources capability does not set subscribe. The rejection is an MCPError with code -32601 (METHOD_NOT_FOUND) and data set to the method, the same shape a compliant server returns for an unadvertised capability, so opting in changes where the rejection happens, not what callers catch. The default is False and unchanged: every request is sent and the server's answer is surfaced. This mirrors the TypeScript SDK's enforceStrictCapabilities option (also default-off). The same keyword-only parameter exists on ClientSession for low-level users; Client forwards it. The method-to-capability table lives in mcp_types.methods.SERVER_CAPABILITY_REQUIREMENTS next to the other per-method maps, with missing_server_capability() as its only evaluator, so the check in ClientSession.send_request is a single data-driven gate rather than a per-method condition, and the relationship is the same at every protocol version. Because the gate reads server_capabilities, a bare version pin (mode="2026-07-28" with no prior_discover=) would reject every gated method; that combination is refused at Client construction with a ValueError that names the fix. The interaction-requirements entry for the lifecycle capability rule is no longer marked untested: the new tests pin both the opt-in pre-wire rejection and the default send-and-surface behaviour.

`ServerSession.send_progress_notification` takes an explicit progress token decoupled from the request it belongs to, so it can keep emitting progress for a request that has already completed -- which the spec forbids ("Progress notifications MUST stop after completion"). The request-scoped `report_progress` (and `Context.report_progress`) is the supported path: it reports against the inbound request's own token, no-ops when the caller did not ask for progress, and stops when the request completes. The deprecated method keeps working and emits `MCPDeprecationWarning`. The warning message deliberately departs from the "<X> is deprecated as of <version>" pattern used by the spec-driven deprecations: this one is an SDK API decision, not a spec retirement (2026-07-28 does not retire server-to-client progress). For "stops when the request completes" to hold on every dispatcher, `_DirectDispatchContext` now closes with its request the way `_JSONRPCDispatchContext` already did: `close()` runs in the dispatch handler's `finally`, after which `progress`/`notify` deliver nothing, `can_send_request` is False, and `send_raw_request` raises `NoBackChannelError` -- the closed state the `DispatchContext` protocol documents. Two pre-existing tests that asserted `can_send_request` on a context captured after its handler returned now sample it in-handler, and the closed-state contract tests are parametrized over both dispatchers. The interaction test that covered both the server and client side of late progress is split in two. The server-side property is proved positively on the wire through `report_progress`; it no longer relies on a session-bound standalone stream, so it also runs on the stateless streamable-http arm. The client-side late-drop test keeps using the deprecated explicit-token method -- the only API that can still produce a late notification -- under `pytest.warns`, and its arms are unchanged. The migration guide stops recommending the deprecated method anywhere and documents the replacement.

ElicitRequestedSchema was a TypeAlias for dict[str, Any]; it is now a Pydantic model of the spec's restricted requested-schema subset, backed by a new PrimitiveSchemaDefinition union (StringSchema, NumberSchema, BooleanSchema, and the enum schemas). ServerSession.elicit_form (and the deprecated elicit alias) and ClientPeer.elicit_form accept only this model, so a nested-object property, an array-of-objects property, or an anyOf union is unconstructible at the only place a server author supplies a schema, rather than silently forwarded to the client. The spec restricts form-mode requested schemas to flat objects with primitive-typed properties only ("complex nested structures, arrays of objects ... are intentionally not supported"). The high-level Context.elicit / elicit_with_validation path is unchanged in behaviour: it converts the rendered JSON Schema into the typed model, keeping its existing per-field TypeError contract and producing value-identical wire output. Inbound is deliberately untouched. The wire field ElicitRequestFormParams.requested_schema stays a plain dict[str, Any], so older servers that emit anyOf for Optional form fields still reach the client's elicitation callback. The typed-model-to-wire-dict conversion lives in one place, ElicitRequestedSchema.to_wire(), which both send sites call. The schema family is extra="allow": keys schema.ts does not name (a top-level title, pattern, exclusiveMinimum, json_schema_extra keys) still round-trip, because the primitives-only restriction is carried by the union members' required type literals, not by extra-key rejection.

Two interaction-suite divergence notes described their gaps as unenforced spec MUSTs without saying whether closing them was planned. Both are deliberate, and the notes now say so. transport:stdio:stream-purity -- stdio_server does not redirect sys.stdout, so a handler print() corrupts the protocol stream. No MCP SDK redirects stdout, and a redirect would only catch print(), not os.write(1, ...) or C extensions writing to file descriptor 1, so it would be a partial guard rather than a structural fix. docs/tutorial/logging.md already tells server authors to log to stderr and never print() in a stdio server, so no docs change was needed. protocol:progress:monotonic -- no MCP SDK validates sender-side progress monotonicity; the spec MUST is a contract on the handler author, not on the transport, and the test pins the unvalidated pass-through. No code or test behaviour changes.

Review-feedback round on the conformance burn-down: - Cancelled requests no longer leave the legacy streamable-HTTP POST hanging. The dispatcher emits a RequestSettled marker when a handler is cancelled without producing a response; the transport consumes it by closing the per-request stream, so the POST's SSE stream terminates without a response frame and JSON-response mode completes with 204 No Content (the client treats 202/204 alike). Per-request streams are released instead of leaking until session teardown, and a handler that survives the cancellation still delivers its normal response. The marker is type-visible on the dispatcher write stream and is stripped by every serializing transport, so it can never appear on a wire. - A bearer token whose audience cannot be canonicalized (out-of-range or non-numeric port) is now rejected with the standard 401 invalid_token instead of raising through the auth middleware as a 500. - The bundled authorization server's /register now accepts only https redirect URIs or http on a loopback host; other schemes on loopback hosts (ftp, ws, javascript, custom) are rejected. - OAuth client scope selection falls back to the caller-configured OAuthClientMetadata.scope when neither the WWW-Authenticate challenge nor protected-resource metadata names scopes, matching the TypeScript SDK, so the documented migration path works as written. - The cross-dispatcher contract that handler-raised MCPError subclasses surface to callers as plain MCPError is now pinned by an explicit test and documented; rehydrate with from_error when the subclass matters. - Docs: migration notes for the bearer-challenge wire-shape changes and the cancellation wire spellings; story READMEs updated to the landed error contract; strict-capabilities doc corrected to state that resources/unsubscribe is gated by the base resources capability only.

cubic-dev-ai

1 issue found across 46 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/mcp/shared/direct_dispatcher.py">

<violation number="1" location="src/mcp/shared/direct_dispatcher.py:111">
P3: This overstates the DirectDispatcher contract: notification handlers are not flattened to plain MCPError. Narrow the doc to request dispatch paths, or update notification handling separately.</violation>
</file>

_{Tip: Review your code locally with the cubic CLI to iterate faster.

Fix all with cubic | Re-trigger cubic}

cubic-dev-ai · 2026-06-28T11:38:37Z

    handler. Notifications are fire-and-forget in both directions: after close
    they are silently dropped.
+
+    Handler-raised `MCPError` subclasses flatten to plain `MCPError` with equal


P3: This overstates the DirectDispatcher contract: notification handlers are not flattened to plain MCPError. Narrow the doc to request dispatch paths, or update notification handling separately.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp/shared/direct_dispatcher.py, line 111: <comment>This overstates the DirectDispatcher contract: notification handlers are not flattened to plain MCPError. Narrow the doc to request dispatch paths, or update notification handling separately.</comment> <file context> @@ -107,6 +107,11 @@ class DirectDispatcher: handler. Notifications are fire-and-forget in both directions: after close they are silently dropped. + + Handler-raised `MCPError` subclasses flatten to plain `MCPError` with equal + `ErrorData` on every dispatch path, matching the wire, where subclass + identity cannot survive; callers needing the subclass rehydrate it from </file context>

claude · 2026-06-28T12:05:07Z

+            if response.status_code in (202, 204):
+                # 202: notification/response accepted. 204: the server settled the
+                # request with no reply (peer-cancelled); nothing to deliver — the
+                # waiter was retired at cancel time.
+                logger.debug(f"Received {response.status_code}")
                return



🔴 The client now silently swallows any 204 on a request POST (src/mcp/client/streamable_http.py:268-274) on the assumption the waiter was already retired at cancel time, but the server's JSON-response mode also sends 204 when the per-request stream closes for non-cancel reasons (session teardown mid-request, idle-timeout eviction, operator DELETE), and a non-SDK server may answer a request POST with 204 outright — in those cases the client's waiter is still pending and nothing is ever delivered, so with the default read_timeout_seconds=None the caller hangs forever where it previously got a fail-fast error. Consider only swallowing the 204 when the request id is no longer pending, and otherwise retiring the waiter with a synthesized error (a synthesized error for an already-retired cancel waiter is dropped harmlessly).

Extended reasoning...

The bug. _handle_post_request in src/mcp/client/streamable_http.py (lines 268–274) now treats 202 and 204 identically: it logs and returns, with the comment that for a 204 "the waiter was retired at cancel time." That justification only holds when this client initiated the cancellation (its send_raw_request finally retires the pending waiter before the courtesy notifications/cancelled goes out). A 204 has other producers for which the waiter is very much still pending, and for those the silent return means nothing — no response, no error — is ever delivered to the dispatcher for that request id.

The non-cancel 204 producers. On the SDK server side, the JSON-response-mode wait loop in src/mcp/server/streamable_http.py (~lines 564–587) sends 204 whenever the per-request stream closes without a JSONRPCResponse/JSONRPCError frame — and its own comment names two causes: "peer-cancelled … or the session tore down mid-request." terminate() (and connect()'s cleanup) closes the send side of every in-flight per-request stream without routing any reply, so a server-initiated session termination — StreamableHTTPSessionManager's session_idle_timeout evicting a session whose handler outlives the idle deadline, an operator/external DELETE on the session id, or session-manager shutdown — while a client request is in flight produces exactly this 204. Separately, a non-SDK server may (non-compliantly) answer a request POST with 204 directly. In none of these cases did the client cancel anything.

Why nothing else saves the caller. The dispatcher only fails pending waiters when the ClientSession itself closes; the client's read stream stays open after the swallowed 204, and session death is only discovered by later requests. Client and ClientSession both default read_timeout_seconds=None, so the original caller's send_request blocks indefinitely.

Pre-PR vs post-PR (the regression). Before this PR, the same server condition produced a 500 ("No response message received before stream closed"), which the client's ≥400 path mapped to an INTERNAL_ERROR JSONRPCError delivered to the waiter — fail fast. A non-SDK 204 to a request POST hit the unexpected-content-type/parse path and likewise failed the waiter. Both paths now regress from "caller gets an error" to "caller hangs forever." The new test coverage only exercises the cancel flow (where the waiter is already retired), so the regression is not pinned by the suite.

Step-by-step proof. (1) Server runs streamable HTTP with json_response=True and an idle timeout; client issues tools/call (id 1) whose handler runs longer than the idle deadline. (2) The session manager's idle deadline fires (it is only pushed forward by inbound requests), cancels the serve loop, and calls terminate(), which closes every per-request stream; the parked JSON wait loop exits with response_message=None and the POST completes 204. (3) The client's _handle_post_request sees 204, logs "Received 204", and returns — the waiter for id 1 was never retired because the client never cancelled. (4) Nothing else ever resolves that waiter; with the default read_timeout_seconds=None the caller's await client.call_tool(...) blocks forever. Pre-PR, step (2) yielded a 500 and step (3) delivered an INTERNAL_ERROR to the waiter immediately.

Mitigating factors. json_response=True is non-default; the default SSE mode already had an analogous silent-hang shape pre-PR (stream ends with no event id → task returns); and the mainline cancel flow handled by this PR is correct and requirement-pinned (client-transport:http:204-settled-exchange). So this is a narrow trigger — but it is a hang-vs-error regression introduced by this PR, and an indefinite hang is strictly worse than the prior error surface.

How to fix. Only swallow the 204 when the request id is no longer pending in the dispatcher (i.e. this client cancelled it), and otherwise retire the waiter by writing a synthesized JSONRPCError for message.id to read_stream_writer (e.g. INTERNAL_ERROR, "Request settled by server with no response"). In the legitimate cancel case the synthesized error is harmless — the waiter is already retired and late responses for unknown ids are dropped — so unconditionally synthesizing on 204 for request POSTs would also restore fail-fast without breaking the cancel flow.

claude · 2026-06-28T12:05:07Z

            async def message_router():
                try:
                    async for session_message in write_stream_reader:  # pragma: no branch
+                        if isinstance(session_message, RequestSettled):
+                            # The dispatcher ended this request with no reply (peer-cancelled).
+                            # Close the per-POST stream's send side: the SSE writer / JSON wait
+                            # loop finish their iteration and clean up in their own finally.
+                            # Nothing is stored or sent on the wire. Keyed by str(id), matching
+                            # the registration at POST time.
+                            if (streams := self._request_streams.get(str(session_message.request_id))) is not None:
+                                streams[0].close()
+                            continue


🔴 With resumability (an event_store) enabled, a cancelled or timed-out request re-creates the hang/leak the new RequestSettled marker was added to fix: the priming event gives the client a last_event_id, so when the settled POST stream ends with no response it reconnects with Last-Event-ID, and _replay_events (having nothing to replay and nothing recorded for the settled id) re-registers a fresh _request_streams entry and parks forever — leaving one permanently open GET, a dead per-request stream on the server, a parked client transport task, and an extra priming row in the event store per cancellation. Pre-PR the stored code-0 'Request cancelled' error terminated both sides even with resumability, so this is a regression on that arm; a fix is to record a terminator (or drop the stream) in the event store for a settled id, have _replay_events end the GET instead of re-registering an id with no live request, and/or have the client skip reconnection when the only event seen was the priming event for an already-retired waiter.

Extended reasoning...

The bug. The new RequestSettled marker (emitted by the dispatcher when a peer cancellation interrupts a handler) is consumed in message_router (src/mcp/server/streamable_http.py:978-989): it closes the per-POST stream's send side and deliberately stores nothing in the event store ('Nothing is stored or sent on the wire'). That cleanly ends the exchange when no event store is configured — which is exactly the case the client's new no-reconnect early-out covers (src/mcp/client/streamable_http.py:383-388, whose own comment says 'No priming event (no event store) means no id'). With an event store configured, the same cancellation re-creates the hang the marker was added to fix, plus a per-cancellation server-side leak.

The code path. (1) A 2025-11-25+ client POSTs tools/call (id=N) to a server constructed with an event_store. _handle_post_request mints a priming event with an SSE id (_mint_priming_event, src/mcp/server/streamable_http.py:285-301) and _run_sse_writer sends it first; the client records it as last_event_id (_handle_sse_response, src/mcp/client/streamable_http.py:362-363; _handle_sse_event treats the empty-data priming event as non-complete). (2) The caller times out or scope-cancels; the dispatcher sends the courtesy notifications/cancelled, the handler is interrupted, and the dispatcher emits RequestSettled. message_router closes _request_streams[N][0], the SSE writer ends, its finally removes the _request_streams entry, and the POST exchange closes — no response frame, nothing written to the event store. (3) On the client, _handle_sse_response sees the stream end without a response, but last_event_id is not None, so the early-out does not apply: it calls _handle_reconnection, which re-GETs /mcp with Last-Event-ID. (4) On the server, _handle_get_request routes the header to _replay_events (src/mcp/server/streamable_http.py:854-917): replay_events_after finds nothing after the priming id, returns the stream id, which is no longer in _request_streams, so replay_sender re-registers it, mints and stores another priming event row, and parks forever in async for event_message in msg_reader — the dispatcher already settled that id, will never write a response for it, and message_router only closes per-request streams on a RequestSettled, which will not recur for that id. (5) The client's transport task likewise parks in the reconnected GET; if that connection ever drops on a normal stream end, _handle_reconnection recurses with attempt=0, so MAX_RECONNECTION_ATTEMPTS does not bound it and it polls indefinitely.

Why nothing prevents it. The client-side early-out is explicitly scoped to the no-event-store case; the RequestSettled marker is intentionally never persisted, so a resuming client has no way to learn the request settled; and _replay_events interprets 'stream id not in _request_streams' as 'resume a live stream' rather than 'this request is gone'. The PR's new cancellation tests (hosting:http:cancel-ends-post-sse-stream, cancel-json-mode-204, and the client-side 204 test) all run without an event store, so the resumability arm is untested.

Step-by-step proof. Server: StreamableHTTPSessionManager(..., event_store=InMemoryEventStore()), one tool that blocks on an anyio.Event. Client: streamable_http_client + Client, call the blocking tool inside a cancel scope and cancel it after the handler starts (the same shape as the PR's new cancel tests, just with an event store). Sequence: priming event id=E1 arrives → client sets last_event_id=E1 → cancel → courtesy notifications/cancelled → dispatcher emits RequestSettled(N) → message_router closes the per-POST stream, exchange ends → client sees no response but last_event_id=E1, so it GETs with Last-Event-ID: E1 → replay_events_after(E1) replays nothing, returns stream id N → not in _request_streams → re-register, store priming event E2, park in msg_reader forever. Inspecting the transport afterwards shows a re-registered _request_streams[N] entry and an open GET that never closes; the client task never returns. Repeat per cancellation/timeout — connections, tasks and event-store rows accumulate until whole-session teardown. Before this PR, step 'dispatcher emits RequestSettled' was instead 'dispatcher writes JSONRPCError(code=0, "Request cancelled")', which message_router stored in the event store and routed to the stream, terminating both sides cleanly even with resumability.

Impact. Per cancelled or timed-out request on a resumability-enabled server: one permanently open GET SSE connection, a re-registered dead _request_streams entry server-side (released only at session teardown), one permanently parked client transport task, and an extra priming-event row written to the event store on each reconnect. Ordinary client request timeouts trigger the courtesy cancel, so this is hit by routine timeout handling, not just explicit user cancellation, whenever the documented resumability feature is on.

How to fix. Any of: have message_router (or the dispatcher path) also drop/terminate the event-store stream for a settled request id so a resume has something definitive to replay; have _replay_events end the GET (instead of re-registering and parking) when the replayed stream id has no live request stream and no pending request; and/or have the client not reconnect when the only event ever received was the priming event for a request whose waiter has already been retired. Adding a resumability-enabled variant of the new cancellation interaction tests would lock the behaviour in.

maxisbey force-pushed the requirements-divergences branch from 5b9582e to abf4ce4 Compare June 26, 2026 15:18

maxisbey marked this pull request as ready for review June 26, 2026 15:32

cubic-dev-ai Bot reviewed Jun 26, 2026

View reviewed changes

claude Bot reviewed Jun 26, 2026

View reviewed changes

Comment thread src/mcp/shared/jsonrpc_dispatcher.py Outdated

Comment thread src/mcp/shared/auth_utils.py

maxisbey changed the title ~~Emit RFC 6750 scope= in WWW-Authenticate and validate token audience~~ Close the spec-conformance gaps recorded by the interaction suite: auth, error codes, initialize Jun 26, 2026

cubic-dev-ai Bot reviewed Jun 26, 2026

View reviewed changes

Comment thread src/mcp/server/auth/handlers/register.py Outdated

maxisbey force-pushed the requirements-divergences branch from 8c6b78b to 0dec196 Compare June 27, 2026 00:30

maxisbey changed the title ~~Close the spec-conformance gaps recorded by the interaction suite: auth, error codes, initialize~~ Work through the spec-conformance gaps recorded by the interaction suite Jun 27, 2026

maxisbey added 13 commits June 27, 2026 18:53

maxisbey force-pushed the requirements-divergences branch from 0dec196 to c2b3e8e Compare June 27, 2026 18:56

claude Bot reviewed Jun 27, 2026

View reviewed changes

Comment thread src/mcp/shared/direct_dispatcher.py

cubic-dev-ai Bot reviewed Jun 28, 2026

View reviewed changes

claude Bot reviewed Jun 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Work through the spec-conformance gaps recorded by the interaction suite#2999

Work through the spec-conformance gaps recorded by the interaction suite#2999
maxisbey wants to merge 14 commits into
mainfrom
requirements-divergences

maxisbey commented Jun 26, 2026 •

edited

Loading

Uh oh!

cubic-dev-ai Bot left a comment

Uh oh!

Uh oh!

cubic-dev-ai Bot Jun 26, 2026 •

edited

Loading

Uh oh!

maxisbey Jun 28, 2026

Uh oh!

cubic-dev-ai Bot Jun 28, 2026

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment

Uh oh!

cubic-dev-ai Bot Jun 28, 2026 •

edited

Loading

Uh oh!

claude Bot Jun 28, 2026

Uh oh!

claude Bot Jun 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

maxisbey commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Commits

Motivation

Tested

Breaking changes

Closes

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

cubic-dev-ai Bot Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

maxisbey Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

claude Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

maxisbey commented Jun 26, 2026 •

edited

Loading

cubic-dev-ai Bot Jun 26, 2026 •

edited

Loading

cubic-dev-ai Bot left a comment •

edited

Loading

cubic-dev-ai Bot Jun 28, 2026 •

edited

Loading