Skip to content

Work through the spec-conformance gaps recorded by the interaction suite#2999

Open
maxisbey wants to merge 14 commits into
mainfrom
requirements-divergences
Open

Work through the spec-conformance gaps recorded by the interaction suite#2999
maxisbey wants to merge 14 commits into
mainfrom
requirements-divergences

Conversation

@maxisbey

@maxisbey maxisbey commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

The tests/interaction/ suite is requirement-tagged against the MCP specification. Where
the SDK's behaviour fell short of the spec, the suite did not skip or xfail: it pinned the
actual behaviour and recorded the gap on the requirement entry. This PR is a burn-down of
those recorded gaps. It fixes fourteen of them in code (and deletes a fifteenth whose
behaviour had already been fixed but whose record had gone stale), narrows three more to
a precisely-stated residual, records two as deliberate, ecosystem-consistent choices, and
closes three long-standing upstream issues along the way. It spans the server's bearer-token
gate, the bundled OAuth authorization server, the OAuth client's use of discovered metadata,
the JSON-RPC error surface for cancellation / handler crashes / unhandled client callbacks,
the initialize handshake, an opt-in client-side capability pre-check, the explicit-token
progress API, and the elicitation requested-schema type. For every behaviour change, the
existing requirement-tagged test flips from pinning the gap to pinning the spec behaviour, so
the suite stays the source of truth for what conforms and what does not.

Commits

Commit / subject What it closes, by behaviour
68e4eb28 Tighten requirements-catalog divergence notes to match current SDK behaviour Bookkeeping only — no runtime change. Corrects four requirement entries whose recorded-gap notes had gone stale (one behaviour was already fixed; three were re-scoped or re-classified as intentional, cross-SDK-consistent choices).
bad42e28 Emit RFC 6750 scope= in WWW-Authenticate and validate token audience The resource server's 401/403 challenges now advertise the required scopes via scope= (which the SDK client already reads to drive step-up); a request with no credentials gets a bare Bearer challenge instead of error="invalid_token" (RFC 6750 §3.1 says the error attribute SHOULD NOT appear when nothing was presented); and a token whose RFC 8707 audience names a different server is now rejected 401 invalid_token.
47639a2b Stop replying to cancelled requests; map unhandled handler exceptions to -32603 A request cancelled via notifications/cancelled now gets no response, per the spec's SHOULD (it was answered with code: 0). An unhandled exception in a request handler now produces JSON-RPC -32603 (INTERNAL_ERROR) with an opaque message instead of code: 0 carrying str(exc) — handler internals no longer leak to the peer; the exception is still logged server-side. raise MCPError(...) for a specific code is unchanged.
a1734460 Reject bearer tokens that carry no audience claim Closes the hole the previous commit left: a verified token with no resource indicator at all was still accepted. When resource_server_url is configured the gate now fails closed (401 invalid_token). AuthSettings(verifier_validates_audience=True) opts out for verifiers that enforce aud themselves and cannot surface the claim; RefreshToken.resource is added so the audience binding survives a refresh. The SDK's own tutorials and example servers were teaching the bug and are fixed in the same commit.
884badc9 Use spec error codes for unhandled elicitation/create and roots/list When a client has no elicitation_callback / list_roots_callback, the SDK still answers the server on its behalf. Those answers now use the codes the spec assigns — -32602 for elicitation/create (a client with no callback declared no modes, and a request for an undeclared mode MUST be answered -32602) and -32601 for roots/list — instead of -32600 for both. Messages unchanged; sampling stays at -32600 (the spec assigns no code there).
24061fe9 Reject non-HTTPS, non-loopback redirect URIs at client registration The bundled authorization server's /register accepted any well-formed URL as a redirect_uris entry — cleartext http:// on a non-loopback host, javascript:, data:, fragment-carrying URIs. It now rejects anything that is not HTTPS or a loopback host (localhost, 127.0.0.1, [::1]) with 400 invalid_client_metadata, and rejects fragments (OAuth 2.1 §2.3). Separately, /token now answers an authorization-code exchange whose redirect_uri does not match the one used at /authorize with error=invalid_grant (RFC 6749 §5.2) instead of invalid_request. Closes #2629.
3eb352c8 Align OAuth client with spec on PKCE verification and scope selection (a) The client now verifies PKCE support before starting an authorization-code grant, as the spec's Authorization Code Protection section requires: if AS metadata was discovered and its code_challenge_methods_supported is absent or omits S256, the flow raises OAuthFlowError instead of redirecting. (b) The client stops reading the authorization server's scopes_supported as a scope fallback — that list is a superset of any one resource's, so the SDK over-requested and triggered real access_denied failures. The spec's chain is WWW-Authenticate scope= → PRM scopes_supported → omit. Closes #1307.
1e2bd9be Reject a second initialize on an already-initialized session A repeated initialize on a live connection was answered as a fresh handshake, silently overwriting the session's recorded client_params and negotiated protocol version — so a later check_capability answered against the second client's capabilities. It is now rejected -32600 ("Session already initialized") and the established session keeps serving. No compliant client is affected (ClientSession.initialize() is already idempotent). Legacy-only: 2026-07-28 removes initialize. Closes #2605.
8f60ebe5 Make the prompt test's opaque coverage workaround explicit Test-only: a two-line comment explaining a deliberately non-obvious shape in one prompt test, so the next reader does not "fix" it into a coverage violation.
28f500c7 Add an opt-in strict_capabilities flag to Client and ClientSession Additive, default-off. Client(..., strict_capabilities=True) rejects, before the request reaches the transport, a call to a method whose required server capability the connected server did not advertise — e.g. list_resources() against a server that only advertised tools, or subscribe_resource() when the server's resources capability does not set subscribe. The rejection is an MCPError with -32601 (METHOD_NOT_FOUND), the same code a compliant server returns for an unadvertised capability, so opting in changes where the rejection happens, not what callers catch. Mirrors the TypeScript SDK's enforceStrictCapabilities (also default-off); the same keyword-only parameter exists on ClientSession. Because the gate reads the negotiated server capabilities, combining it with a bare version pin (where the client never learns them) is refused at construction with a ValueError that names the fix. The method-to-capability table is exported as mcp_types.methods.SERVER_CAPABILITY_REQUIREMENTS, and the lifecycle capability-rule requirement entry is no longer marked untested.
eddfa29d Deprecate ServerSession.send_progress_notification send_progress_notification takes an explicit progress token decoupled from any request's lifetime, so it can keep emitting progress for a request that has already completed — which the spec forbids ("Progress notifications MUST stop after completion"). It is deprecated in favour of the request-scoped Context.report_progress() / ServerSession.report_progress(), which reports against the inbound request's own token, no-ops when the caller did not ask for progress, and is closed with the request. The deprecated method keeps working and emits MCPDeprecationWarning. To make "stops when the request completes" hold on every dispatcher, the in-process direct dispatcher now closes its context with the request the way the JSON-RPC one already did.
3605ec09 Type the elicitation requested schema on the send side ElicitRequestedSchema was a TypeAlias for dict[str, Any]; it is now a Pydantic model of the spec's restricted requested-schema subset, backed by a new PrimitiveSchemaDefinition union. ServerSession.elicit_form() (and the deprecated elicit() alias) and ClientPeer.elicit_form() accept only this model, so a nested-object property, an array-of-objects property, or an anyOf union is unconstructible at the only place a server author supplies a schema, rather than silently forwarded to the client — the spec restricts form-mode requested schemas to flat objects with primitive-typed properties ("complex nested structures, arrays of objects … are intentionally not supported"). The high-level Context.elicit() / elicit_with_validation() path is unchanged. Inbound is deliberately untouched: ElicitRequestFormParams.requested_schema stays a plain dict[str, Any] on the wire, so older servers that emit anyOf for Optional form fields still reach the client's elicitation callback.
c2b3e8ee Record two divergences as intentional, ecosystem-consistent choices Bookkeeping only — no code or test behaviour change. Two interaction-suite divergence notes described their gaps as unenforced spec MUSTs without saying whether closing them was planned; both are deliberate and the notes now say so. (1) stdio_server does not redirect sys.stdout, so a handler print() corrupts the protocol stream — no MCP SDK redirects stdout, and a redirect would only catch print(), not os.write(1, ...) or C extensions writing to fd 1, so it would be a partial guard rather than a structural fix; the logging tutorial already tells stdio server authors to log to stderr. (2) No MCP SDK validates sender-side progress monotonicity; that MUST is a contract on the handler author, not on the transport, and the test pins the unvalidated pass-through.

Motivation

These are all places where the SDK was provably out of step with normative spec text (or an
RFC the spec defers to). Three of them are reported user-facing bugs (#2629, #1307, #2605);
the auth ones matter most — a resource server that accepts audience-unbound tokens, or a
built-in authorization server that registers javascript: redirect URIs, is the kind of
default a user has no way to notice. Working through them as one burn-down keeps the
requirement manifest honest: a gap that is fixed loses its record, a gap that is partially
closed gets a record stating exactly the residual, and a gap that is a deliberate choice
says so — so what remains recorded is real and current.

Tested

./scripts/test (the full suite under the 100% line+branch coverage gate plus
strict-no-cover) is green at HEAD: 4303 passed, 7 skipped, 1 xfailed (the one
xfail is pre-existing and unrelated — a Pydantic AnyUrl trailing-slash quirk in
tests/client/test_auth.py), coverage 100.00%. pyright reports 0 errors;
ruff check / ruff format are clean; the new-suppression audit
(git diff ... | grep -E '^\+.*(pragma|type: ignore|noqa)') is empty. Every behaviour
change flips an existing requirement-tagged interaction test rather than only adding a new
one, so the old behaviour cannot silently come back. New tests cover the fail-closed
audience gate (including the verifier_validates_audience opt-out and the refresh chain),
the loopback/fragment matrix at /register, the PKCE-refusal arms (absent field, list
without S256, no-metadata leniency), the double-initialize rejection on stdio, stateful
HTTP, and stateless HTTP, the strict_capabilities pre-wire rejection alongside the
default send-and-surface behaviour, the request-scoped progress close on both dispatchers,
and the typed ElicitRequestedSchema construction/rejection matrix and its to_wire()
round-trip.

Breaking changes

All documented in docs/migration.md (one section each):

  • Bearer tokens are rejected unless their audience names this server — including tokens
    that carry no resource indicator at all. Populate AccessToken.resource in your
    TokenVerifier (recommended; the examples now show it), or set
    AuthSettings(verifier_validates_audience=True) if your verifier already enforces aud
    and cannot surface it. resource_server_url=None still disables the check.
  • Bundled authorization server: RFC-correct redirect-URI handling/register rejects
    non-HTTPS, non-loopback, and fragment-carrying redirect URIs with invalid_client_metadata
    (this also rejects RFC 8252 private-use schemes such as com.example.app:/callback; MCP
    allows only HTTPS or loopback); /token answers a redirect-URI mismatch with
    invalid_grant instead of invalid_request.
  • OAuth client refuses to authorize when AS metadata does not advertise S256 PKCE
    OAuthFlowError instead of proceeding. No SDK-side opt-out: an authorization server that
    supports S256 but omits the field needs its published metadata fixed (RFC 8414 §2).
  • OAuth client no longer reads scopes_supported from authorization-server metadata to
    choose a scope
    — if that was your only scope source, the client now omits scope
    entirely; pass an explicit scope on OAuthClientMetadata to keep the old request.
  • Unhandled handler exceptions return -32603; cancelled requests get no reply — code
    that matched on the previous code: 0 responses must update; raise MCPError for a
    specific code, and note the exception text is now logged, not sent.
  • Unhandled elicitation/create returns -32602; unhandled roots/list returns
    -32601
    — server code that branched on error.code == -32600 to detect a client
    without these should switch codes, or (better) check the client's declared capabilities
    before sending.
  • A second initialize on an already-initialized session is rejected with -32600
    ("Session already initialized"). A peer that needs a fresh handshake opens a new
    connection (on streamable HTTP, a POST without Mcp-Session-Id).
  • ServerSession.elicit_form() (and the deprecated elicit() alias) and
    ClientPeer.elicit_form() now take a typed mcp_types.ElicitRequestedSchema, not an
    arbitrary dict[str, Any]
    ElicitRequestedSchema is the model now, no longer a
    TypeAlias for dict[str, Any]. Build it directly
    (ElicitRequestedSchema(properties={"x": StringSchema(type="string")}, required=["x"]))
    or validate an existing JSON Schema dict with ElicitRequestedSchema.model_validate(...).
    Schemas with nested-object properties, array-of-objects properties, or anyOf unions are
    rejected at construction instead of being forwarded. The high-level Context.elicit() /
    elicit_with_validation() path and the inbound/wire representation are unchanged.
  • ServerSession.send_progress_notification() is deprecated in favour of the
    request-scoped Context.report_progress() / ServerSession.report_progress(). The
    method still works, but every call now emits mcp.MCPDeprecationWarning — so test suites
    that run with filterwarnings = ["error"] (or -W error) will start failing on it.
    report_progress uses the inbound request's own token and cannot emit progress after that
    request has completed, which is what the spec requires.

Not a breaking change, but called out for completeness: Client(..., strict_capabilities=)
(and the same keyword on ClientSession) is additive and opt-in. The default is False
and the default behaviour is byte-for-byte unchanged — every request is still sent and the
server's answer is still surfaced.

Closes

Closes #2629
Closes #1307
Closes #2605

AI Disclaimer

@maxisbey maxisbey force-pushed the requirements-divergences branch from 5b9582e to abf4ce4 Compare June 26, 2026 15:18
@maxisbey maxisbey marked this pull request as ready for review June 26, 2026 15:32

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 29 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/mcp/shared/auth_utils.py">

<violation number="1" location="src/mcp/shared/auth_utils.py:32">
P2: Audience canonicalization can throw `ValueError` on malformed resource ports, breaking auth flow with a server error.</violation>

<violation number="2" location="src/mcp/shared/auth_utils.py:83">
P2: Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like `/api` and `/api/` are treated as equivalent.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic

Comment thread src/mcp/shared/auth_utils.py
Comment thread src/mcp/shared/auth_utils.py Outdated
origin is NOT for this server. Contrast check_resource_allowed, which is the
client-side hierarchical question and intentionally more permissive.
"""
return resource_url_from_server_url(token_resource).rstrip("/") == resource_url_from_server_url(

@cubic-dev-ai cubic-dev-ai Bot Jun 26, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like /api and /api/ are treated as equivalent.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp/shared/auth_utils.py, line 83:

<comment>Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like `/api` and `/api/` are treated as equivalent.</comment>

<file context>
@@ -65,6 +72,19 @@ def check_resource_allowed(requested_resource: str, configured_resource: str) ->
+    origin is NOT for this server. Contrast check_resource_allowed, which is the
+    client-side hierarchical question and intentionally more permissive.
+    """
+    return resource_url_from_server_url(token_resource).rstrip("/") == resource_url_from_server_url(
+        server_resource
+    ).rstrip("/")
</file context>
Fix with cubic

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right at the RFC 3986 layer — strict equivalence distinguishes /api/ from /api. But RFC 8707 defines no comparison algorithm, and its own examples use trailing-slash resource URIs. Two things make the normalization load-bearing: pydantic's AnyHttpUrl renders a root resource as https://h/ while the spec's example token request sends the slashless form — strict comparison would 401 every conformant client of a root-path deployment. And the compared value is the verifier-asserted audience from the validated token, not raw client input; exploiting the leniency would need two distinct resource servers at one origin differing only by a trailing slash, which a Starlette mount cannot host. Keeping the rstrip; c53aefd adds a comment plus tests pinning the equivalence and the sibling/child-path rejections.

AI Disclaimer

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parent comment was wrong here: this audience comparison is intentionally load-bearing for RFC 8707 resource validation. The root-path and verifier-asserted-audience cases make the trailing-slash normalization necessary, so keeping the rstrip('/') is correct.

Comment thread src/mcp/shared/jsonrpc_dispatcher.py Outdated
Comment thread src/mcp/shared/auth_utils.py
@maxisbey maxisbey changed the title Emit RFC 6750 scope= in WWW-Authenticate and validate token audience Close the spec-conformance gaps recorded by the interaction suite: auth, error codes, initialize Jun 26, 2026

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 36 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/mcp/shared/auth_utils.py">

<violation number="1" location="src/mcp/shared/auth_utils.py:83">
P2: Audience validation is over-permissive because trailing slashes are stripped before comparison, so different resource URIs like `/api` and `/api/` are treated as equivalent.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic

Comment thread src/mcp/server/auth/handlers/register.py Outdated
@maxisbey maxisbey force-pushed the requirements-divergences branch from 8c6b78b to 0dec196 Compare June 27, 2026 00:30
@maxisbey maxisbey changed the title Close the spec-conformance gaps recorded by the interaction suite: auth, error codes, initialize Work through the spec-conformance gaps recorded by the interaction suite Jun 27, 2026
maxisbey added 13 commits June 27, 2026 18:53
…haviour

- resources:annotations: drop the stale lastModified divergence; the model
  now carries the field and it round-trips.
- lifecycle:capability:client-not-declared: narrow to the one remaining gap
  (the deprecated send_roots_list_changed path); the handler half is correct
  by construction.
- lifecycle:pre-initialization-ordering: mark removed_in=2026-07-28; the
  initialize handshake is gone in the new spec.
- client-transport:http:session-404-reinitialize: reword as an intentional
  cross-SDK choice (404 surfaced to caller), not a missed MUST.
- test_resources: drop the now-redundant lastModified xfail.
BearerAuthBackend / RequireAuthMiddleware now produce spec-conformant
challenges and reject tokens issued for a different resource server.

- A request with no credentials gets a bare `Bearer` challenge (with
  scope/resource_metadata only), not error="invalid_token" -- RFC 6750
  Section 3.1 says the error attribute SHOULD NOT appear when no
  authentication information was presented.
- A malformed/unknown token, an expired token, or a token whose audience
  does not match the configured resource_server_url is answered 401
  invalid_token with a specific error_description, carried via a new
  InvalidTokenUser marker so the middleware can distinguish it from
  no-credentials.
- All challenges (401 and the 403 insufficient_scope path) now advertise
  the required scopes in a `scope=` parameter, which the SDK client
  already reads to drive step-up.
- New check_token_audience() helper canonicalises default ports before
  comparing, and is wired through both the lowlevel and MCPServer
  Starlette stacks via the auth settings' resource_server_url.

Docs and migration guide updated; the corresponding interaction-suite
divergence entries are now closed.
… to -32603

A request cancelled via notifications/cancelled now gets no response: when
the handler scope's cancel is caught, JSONRPCDispatcher returns instead of
writing an error. The sender retired its own waiter when it cancelled, so
no reply is needed to unblock it.

An unhandled exception in a request handler now produces JSON-RPC error
-32603 (INTERNAL_ERROR) with the opaque message "Internal server error"
instead of code 0 carrying str(exc). The exception is still logged
server-side. To send a specific code/message, raise MCPError; pydantic
ValidationError still maps to INVALID_PARAMS.

handler_exception_to_error_data is now total: it returns
MappedError(error: ErrorData, unexpected: bool) for every Exception.
Callers gate logger.exception / raise_handler_exceptions on the
unexpected flag rather than re-deriving the rung set, so a handler that
deliberately raises MCPError(code=INTERNAL_ERROR) is not treated as a
crash. JSONRPCDispatcher, DirectDispatcher, and the modern HTTP entry's
_to_jsonrpc_response all call the one helper; DirectDispatcher no longer
hand-rolls its own ladder.

The interaction suite's protocol:cancel:in-flight and
protocol:error:internal-error requirements drop their divergence entries
and modern-error-surface arm exclusions; the two prompt-validation
divergence notes are reworded for the new -32603 surface.
docs/migration.md gains a section covering both behaviour changes.
When `AuthSettings.resource_server_url` is configured, `BearerAuthBackend`
previously ran the RFC 8707 audience comparison only for tokens whose
verifier populated `AccessToken.resource`: a token carrying no resource
indicator at all was accepted. The MCP authorization spec requires a
resource server to only accept tokens issued specifically for it, so the
gate now fails closed: a verified token with no `resource` is answered
`401 invalid_token` ("The access token carries no audience claim").
`resource_server_url=None` still means there is no audience to enforce.

For verifiers that validate the audience themselves and cannot surface
the claim (for example a JWT decoder configured with the expected
audience), the new `AuthSettings.verifier_validates_audience=True` opts
the gate out. The `AuthSettings.enforced_audience` property derives the
single value both server wirings pass to `BearerAuthBackend`, whose
signature is unchanged.

`RefreshToken` gains an optional `resource` field so an authorization
server provider can carry the original grant's audience binding through
`exchange_refresh_token`; without it every refreshed access token would
be audience-unbound and rejected by the hardened gate.

The docs tutorials and example servers now populate
`AccessToken.resource` (and the client-credentials demo token endpoint
honors the RFC 8707 `resource` parameter) so they pass the check they
teach. The migration guide entry for audience validation is rewritten
for the fail-closed behavior.
A client constructed without an elicitation_callback or
list_roots_callback still answers the server's request, via a default
callback that returns a JSON-RPC error. Both defaults used -32600
(invalid request). The spec assigns a specific code to each case:

- elicitation/create: -32602 (invalid params). A client with no
  callback declares no elicitation modes, so every incoming request
  names an undeclared mode, which clients MUST answer with -32602.
- roots/list: -32601 (method not found), the code clients SHOULD use
  when they do not support roots.

The default sampling callback keeps -32600: the spec assigns no code
to a client that does not support sampling. Error messages are
unchanged.

Update the affected tests and the interaction-requirement entries that
recorded the old codes, fix the code named in the client callbacks doc
page, and add a migration note.
Two fixes to the optional bundled OAuth authorization server (the
`auth_server_provider=` path).

The registration endpoint accepted any well-formed URL as a
`redirect_uris` entry: cleartext `http://` on a non-loopback host,
`javascript:`, `data:`, and URIs carrying a fragment all registered
successfully. The MCP authorization specification's Communication
Security section requires every redirect URI to be either localhost or
HTTPS, and OAuth 2.1 section 2.3 forbids a fragment component. Such an
entry is now rejected with `400 invalid_client_metadata`. Loopback is
exactly the three forms OAuth 2.1 section 8.4.2 names (`localhost`,
`127.0.0.1`, `[::1]`), on any port; query strings remain permitted.
This also rejects RFC 8252 private-use schemes such as
`com.example.app:/callback`: MCP restricts redirect URIs to HTTPS or
loopback, with no carve-out for native apps.

The rule lives on the request model: `RegistrationRequest`, until now a
dead alias of `OAuthClientMetadata`, becomes a real subclass with a
`redirect_uris` field validator, so a forbidden URI fails parsing and
takes the handler's existing `invalid_client_metadata` arm rather than
needing a post-parse check. `OAuthClientMetadata` itself is unchanged:
the client also serializes it when registering against third-party
authorization servers whose redirect-URI policies the SDK does not own.
The loopback host set moves to a single `LOOPBACK_HOSTS` constant in
`mcp.server.auth.provider`, shared with `validate_issuer_url`, which
previously inlined the same tuple.

Separately, the token endpoint now answers an authorization-code
exchange whose `redirect_uri` does not match the one used at
`/authorize` with `error=invalid_grant` instead of `invalid_request`.
RFC 6749 section 5.2 assigns this case to `invalid_grant` ("does not
match the redirection URI used in the authorization request"), and the
handler's other authorization-code failures already use it. The
exchange was already rejected with HTTP 400; only the `error` field
changes.

Update the affected tests and the interaction-requirement entries, and
add a migration note.

Closes #2629
Two changes to how the OAuth client uses discovered authorization-server
metadata, both required by the MCP authorization specification.

Verify PKCE support before the authorization-code grant. The spec's
Authorization Code Protection section requires clients to verify PKCE
support from the authorization server's metadata and to refuse to
proceed when code_challenge_methods_supported is absent. The new
validate_pkce_support() also refuses a method list that omits S256,
since that is the only method this client sends. The check sits at the
top of _perform_authorization_code_grant, so it covers both the initial
401 flow and the 403 insufficient_scope step-up, while grants that never
issue an authorization code (client credentials, private key JWT) are
unaffected. When no metadata document was discovered at all the flow
proceeds as before: absence of a document is not evidence of
non-support.

Stop reading scopes_supported from authorization-server metadata when
selecting a scope. The spec's scope-selection chain is the
WWW-Authenticate scope parameter, then the protected-resource metadata's
scopes_supported, otherwise omit the scope parameter. The SDK inserted
an extra fallback to the authorization server's scopes_supported, which
over-requests (an authorization server may serve many resource servers,
so its list is a superset of any one resource's) and causes
access_denied failures against servers that reject unknown scopes. With
the fallback removed, clients that relied on it should pass an explicit
scope on their OAuthClientMetadata.

Closes #1307
A server that had already completed the initialize handshake on a
connection answered a repeated initialize request on that same
connection as a fresh handshake, silently overwriting the session's
recorded client_params and negotiated protocol version. A
check_capability call made after that point then answered against the
second client's declared capabilities.

The handshake now commits at most once per connection: a repeated
initialize is answered with JSON-RPC error -32600 (INVALID_REQUEST,
"Session already initialized") and the established session keeps
serving. The check lives at the runner's request-dispatch boundary,
right next to its mirror (the request-before-initialize gate), so every
transport gets it without any transport-level body sniffing.

The discriminator is client_params rather than initialize_accepted: the
legacy stateless path builds a per-request connection that is born past
the gate but has no peer info yet, and its one initialize must still be
accepted.

No compliant client is affected. The spec makes initialization the
first interaction of a session, and ClientSession.initialize() is
already idempotent (a repeat call returns the first result without
sending anything). This only applies to the legacy (2025-11-25 and
earlier) handshake; the 2026-07-28 protocol removes initialize
entirely.

Closes #2605
The never-called prompt body in the wrong-type-argument test proves it
never ran by appending to a closure-captured list, but a plain append on
an unreachable line would fail the 100% coverage gate. The append
therefore rides on a `raise NotImplementedError` line, which coverage's
`exclude_also` strips. That shape read as an accident; add a comment
spelling out why it is written that way.
Client(..., strict_capabilities=True) rejects, before any request reaches
the transport, a call to a method whose required server capability the
connected server did not advertise -- for example list_resources() against
a server that only advertised tools, or subscribe_resource() when the
server's resources capability does not set subscribe. The rejection is an
MCPError with code -32601 (METHOD_NOT_FOUND) and data set to the method,
the same shape a compliant server returns for an unadvertised capability,
so opting in changes where the rejection happens, not what callers catch.

The default is False and unchanged: every request is sent and the server's
answer is surfaced. This mirrors the TypeScript SDK's
enforceStrictCapabilities option (also default-off). The same keyword-only
parameter exists on ClientSession for low-level users; Client forwards it.

The method-to-capability table lives in
mcp_types.methods.SERVER_CAPABILITY_REQUIREMENTS next to the other
per-method maps, with missing_server_capability() as its only evaluator,
so the check in ClientSession.send_request is a single data-driven gate
rather than a per-method condition, and the relationship is the same at
every protocol version. Because the gate reads server_capabilities, a bare
version pin (mode="2026-07-28" with no prior_discover=) would reject every
gated method; that combination is refused at Client construction with a
ValueError that names the fix.

The interaction-requirements entry for the lifecycle capability rule is no
longer marked untested: the new tests pin both the opt-in pre-wire
rejection and the default send-and-surface behaviour.
`ServerSession.send_progress_notification` takes an explicit progress token
decoupled from the request it belongs to, so it can keep emitting progress
for a request that has already completed -- which the spec forbids
("Progress notifications MUST stop after completion"). The request-scoped
`report_progress` (and `Context.report_progress`) is the supported path: it
reports against the inbound request's own token, no-ops when the caller did
not ask for progress, and stops when the request completes. The deprecated
method keeps working and emits `MCPDeprecationWarning`.

The warning message deliberately departs from the "<X> is deprecated as of
<version>" pattern used by the spec-driven deprecations: this one is an SDK
API decision, not a spec retirement (2026-07-28 does not retire
server-to-client progress).

For "stops when the request completes" to hold on every dispatcher,
`_DirectDispatchContext` now closes with its request the way
`_JSONRPCDispatchContext` already did: `close()` runs in the dispatch
handler's `finally`, after which `progress`/`notify` deliver nothing,
`can_send_request` is False, and `send_raw_request` raises
`NoBackChannelError` -- the closed state the `DispatchContext` protocol
documents. Two pre-existing tests that asserted `can_send_request` on a
context captured after its handler returned now sample it in-handler, and
the closed-state contract tests are parametrized over both dispatchers.

The interaction test that covered both the server and client side of late
progress is split in two. The server-side property is proved positively on
the wire through `report_progress`; it no longer relies on a session-bound
standalone stream, so it also runs on the stateless streamable-http arm.
The client-side late-drop test keeps using the deprecated explicit-token
method -- the only API that can still produce a late notification -- under
`pytest.warns`, and its arms are unchanged. The migration guide stops
recommending the deprecated method anywhere and documents the replacement.
ElicitRequestedSchema was a TypeAlias for dict[str, Any]; it is now a
Pydantic model of the spec's restricted requested-schema subset, backed
by a new PrimitiveSchemaDefinition union (StringSchema, NumberSchema,
BooleanSchema, and the enum schemas). ServerSession.elicit_form (and the
deprecated elicit alias) and ClientPeer.elicit_form accept only this
model, so a nested-object property, an array-of-objects property, or an
anyOf union is unconstructible at the only place a server author
supplies a schema, rather than silently forwarded to the client.

The spec restricts form-mode requested schemas to flat objects with
primitive-typed properties only ("complex nested structures, arrays of
objects ... are intentionally not supported"). The high-level
Context.elicit / elicit_with_validation path is unchanged in behaviour:
it converts the rendered JSON Schema into the typed model, keeping its
existing per-field TypeError contract and producing value-identical wire
output.

Inbound is deliberately untouched. The wire field
ElicitRequestFormParams.requested_schema stays a plain dict[str, Any],
so older servers that emit anyOf for Optional form fields still reach
the client's elicitation callback. The typed-model-to-wire-dict
conversion lives in one place, ElicitRequestedSchema.to_wire(), which
both send sites call.

The schema family is extra="allow": keys schema.ts does not name (a
top-level title, pattern, exclusiveMinimum, json_schema_extra keys)
still round-trip, because the primitives-only restriction is carried by
the union members' required type literals, not by extra-key rejection.
Two interaction-suite divergence notes described their gaps as
unenforced spec MUSTs without saying whether closing them was planned.
Both are deliberate, and the notes now say so.

transport:stdio:stream-purity -- stdio_server does not redirect
sys.stdout, so a handler print() corrupts the protocol stream. No MCP
SDK redirects stdout, and a redirect would only catch print(), not
os.write(1, ...) or C extensions writing to file descriptor 1, so it
would be a partial guard rather than a structural fix.
docs/tutorial/logging.md already tells server authors to log to stderr
and never print() in a stdio server, so no docs change was needed.

protocol:progress:monotonic -- no MCP SDK validates sender-side
progress monotonicity; the spec MUST is a contract on the handler
author, not on the transport, and the test pins the unvalidated
pass-through.

No code or test behaviour changes.
@maxisbey maxisbey force-pushed the requirements-divergences branch from 0dec196 to c2b3e8e Compare June 27, 2026 18:56
Comment thread src/mcp/shared/direct_dispatcher.py
Review-feedback round on the conformance burn-down:

- Cancelled requests no longer leave the legacy streamable-HTTP POST
  hanging. The dispatcher emits a RequestSettled marker when a handler is
  cancelled without producing a response; the transport consumes it by
  closing the per-request stream, so the POST's SSE stream terminates
  without a response frame and JSON-response mode completes with 204 No
  Content (the client treats 202/204 alike). Per-request streams are
  released instead of leaking until session teardown, and a handler that
  survives the cancellation still delivers its normal response. The
  marker is type-visible on the dispatcher write stream and is stripped
  by every serializing transport, so it can never appear on a wire.
- A bearer token whose audience cannot be canonicalized (out-of-range or
  non-numeric port) is now rejected with the standard 401 invalid_token
  instead of raising through the auth middleware as a 500.
- The bundled authorization server's /register now accepts only https
  redirect URIs or http on a loopback host; other schemes on loopback
  hosts (ftp, ws, javascript, custom) are rejected.
- OAuth client scope selection falls back to the caller-configured
  OAuthClientMetadata.scope when neither the WWW-Authenticate challenge
  nor protected-resource metadata names scopes, matching the TypeScript
  SDK, so the documented migration path works as written.
- The cross-dispatcher contract that handler-raised MCPError subclasses
  surface to callers as plain MCPError is now pinned by an explicit test
  and documented; rehydrate with from_error when the subclass matters.
- Docs: migration notes for the bearer-challenge wire-shape changes and
  the cancellation wire spellings; story READMEs updated to the landed
  error contract; strict-capabilities doc corrected to state that
  resources/unsubscribe is gated by the base resources capability only.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 46 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/mcp/shared/direct_dispatcher.py">

<violation number="1" location="src/mcp/shared/direct_dispatcher.py:111">
P3: This overstates the DirectDispatcher contract: notification handlers are not flattened to plain MCPError. Narrow the doc to request dispatch paths, or update notification handling separately.</violation>
</file>

Tip: Review your code locally with the cubic CLI to iterate faster.

Fix all with cubic | Re-trigger cubic

handler. Notifications are fire-and-forget in both directions: after close
they are silently dropped.

Handler-raised `MCPError` subclasses flatten to plain `MCPError` with equal

@cubic-dev-ai cubic-dev-ai Bot Jun 28, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: This overstates the DirectDispatcher contract: notification handlers are not flattened to plain MCPError. Narrow the doc to request dispatch paths, or update notification handling separately.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp/shared/direct_dispatcher.py, line 111:

<comment>This overstates the DirectDispatcher contract: notification handlers are not flattened to plain MCPError. Narrow the doc to request dispatch paths, or update notification handling separately.</comment>

<file context>
@@ -107,6 +107,11 @@ class DirectDispatcher:
     handler. Notifications are fire-and-forget in both directions: after close
     they are silently dropped.
+
+    Handler-raised `MCPError` subclasses flatten to plain `MCPError` with equal
+    `ErrorData` on every dispatch path, matching the wire, where subclass
+    identity cannot survive; callers needing the subclass rehydrate it from
</file context>
Fix with cubic

Comment on lines +268 to 274
if response.status_code in (202, 204):
# 202: notification/response accepted. 204: the server settled the
# request with no reply (peer-cancelled); nothing to deliver — the
# waiter was retired at cancel time.
logger.debug(f"Received {response.status_code}")
return

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The client now silently swallows any 204 on a request POST (src/mcp/client/streamable_http.py:268-274) on the assumption the waiter was already retired at cancel time, but the server's JSON-response mode also sends 204 when the per-request stream closes for non-cancel reasons (session teardown mid-request, idle-timeout eviction, operator DELETE), and a non-SDK server may answer a request POST with 204 outright — in those cases the client's waiter is still pending and nothing is ever delivered, so with the default read_timeout_seconds=None the caller hangs forever where it previously got a fail-fast error. Consider only swallowing the 204 when the request id is no longer pending, and otherwise retiring the waiter with a synthesized error (a synthesized error for an already-retired cancel waiter is dropped harmlessly).

Extended reasoning...

The bug. _handle_post_request in src/mcp/client/streamable_http.py (lines 268–274) now treats 202 and 204 identically: it logs and returns, with the comment that for a 204 "the waiter was retired at cancel time." That justification only holds when this client initiated the cancellation (its send_raw_request finally retires the pending waiter before the courtesy notifications/cancelled goes out). A 204 has other producers for which the waiter is very much still pending, and for those the silent return means nothing — no response, no error — is ever delivered to the dispatcher for that request id.

The non-cancel 204 producers. On the SDK server side, the JSON-response-mode wait loop in src/mcp/server/streamable_http.py (~lines 564–587) sends 204 whenever the per-request stream closes without a JSONRPCResponse/JSONRPCError frame — and its own comment names two causes: "peer-cancelled … or the session tore down mid-request." terminate() (and connect()'s cleanup) closes the send side of every in-flight per-request stream without routing any reply, so a server-initiated session termination — StreamableHTTPSessionManager's session_idle_timeout evicting a session whose handler outlives the idle deadline, an operator/external DELETE on the session id, or session-manager shutdown — while a client request is in flight produces exactly this 204. Separately, a non-SDK server may (non-compliantly) answer a request POST with 204 directly. In none of these cases did the client cancel anything.

Why nothing else saves the caller. The dispatcher only fails pending waiters when the ClientSession itself closes; the client's read stream stays open after the swallowed 204, and session death is only discovered by later requests. Client and ClientSession both default read_timeout_seconds=None, so the original caller's send_request blocks indefinitely.

Pre-PR vs post-PR (the regression). Before this PR, the same server condition produced a 500 ("No response message received before stream closed"), which the client's ≥400 path mapped to an INTERNAL_ERROR JSONRPCError delivered to the waiter — fail fast. A non-SDK 204 to a request POST hit the unexpected-content-type/parse path and likewise failed the waiter. Both paths now regress from "caller gets an error" to "caller hangs forever." The new test coverage only exercises the cancel flow (where the waiter is already retired), so the regression is not pinned by the suite.

Step-by-step proof. (1) Server runs streamable HTTP with json_response=True and an idle timeout; client issues tools/call (id 1) whose handler runs longer than the idle deadline. (2) The session manager's idle deadline fires (it is only pushed forward by inbound requests), cancels the serve loop, and calls terminate(), which closes every per-request stream; the parked JSON wait loop exits with response_message=None and the POST completes 204. (3) The client's _handle_post_request sees 204, logs "Received 204", and returns — the waiter for id 1 was never retired because the client never cancelled. (4) Nothing else ever resolves that waiter; with the default read_timeout_seconds=None the caller's await client.call_tool(...) blocks forever. Pre-PR, step (2) yielded a 500 and step (3) delivered an INTERNAL_ERROR to the waiter immediately.

Mitigating factors. json_response=True is non-default; the default SSE mode already had an analogous silent-hang shape pre-PR (stream ends with no event id → task returns); and the mainline cancel flow handled by this PR is correct and requirement-pinned (client-transport:http:204-settled-exchange). So this is a narrow trigger — but it is a hang-vs-error regression introduced by this PR, and an indefinite hang is strictly worse than the prior error surface.

How to fix. Only swallow the 204 when the request id is no longer pending in the dispatcher (i.e. this client cancelled it), and otherwise retire the waiter by writing a synthesized JSONRPCError for message.id to read_stream_writer (e.g. INTERNAL_ERROR, "Request settled by server with no response"). In the legitimate cancel case the synthesized error is harmless — the waiter is already retired and late responses for unknown ids are dropped — so unconditionally synthesizing on 204 for request POSTs would also restore fail-fast without breaking the cancel flow.

Comment on lines 978 to +989
async def message_router():
try:
async for session_message in write_stream_reader: # pragma: no branch
if isinstance(session_message, RequestSettled):
# The dispatcher ended this request with no reply (peer-cancelled).
# Close the per-POST stream's send side: the SSE writer / JSON wait
# loop finish their iteration and clean up in their own finally.
# Nothing is stored or sent on the wire. Keyed by str(id), matching
# the registration at POST time.
if (streams := self._request_streams.get(str(session_message.request_id))) is not None:
streams[0].close()
continue

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 With resumability (an event_store) enabled, a cancelled or timed-out request re-creates the hang/leak the new RequestSettled marker was added to fix: the priming event gives the client a last_event_id, so when the settled POST stream ends with no response it reconnects with Last-Event-ID, and _replay_events (having nothing to replay and nothing recorded for the settled id) re-registers a fresh _request_streams entry and parks forever — leaving one permanently open GET, a dead per-request stream on the server, a parked client transport task, and an extra priming row in the event store per cancellation. Pre-PR the stored code-0 'Request cancelled' error terminated both sides even with resumability, so this is a regression on that arm; a fix is to record a terminator (or drop the stream) in the event store for a settled id, have _replay_events end the GET instead of re-registering an id with no live request, and/or have the client skip reconnection when the only event seen was the priming event for an already-retired waiter.

Extended reasoning...

The bug. The new RequestSettled marker (emitted by the dispatcher when a peer cancellation interrupts a handler) is consumed in message_router (src/mcp/server/streamable_http.py:978-989): it closes the per-POST stream's send side and deliberately stores nothing in the event store ('Nothing is stored or sent on the wire'). That cleanly ends the exchange when no event store is configured — which is exactly the case the client's new no-reconnect early-out covers (src/mcp/client/streamable_http.py:383-388, whose own comment says 'No priming event (no event store) means no id'). With an event store configured, the same cancellation re-creates the hang the marker was added to fix, plus a per-cancellation server-side leak.

The code path. (1) A 2025-11-25+ client POSTs tools/call (id=N) to a server constructed with an event_store. _handle_post_request mints a priming event with an SSE id (_mint_priming_event, src/mcp/server/streamable_http.py:285-301) and _run_sse_writer sends it first; the client records it as last_event_id (_handle_sse_response, src/mcp/client/streamable_http.py:362-363; _handle_sse_event treats the empty-data priming event as non-complete). (2) The caller times out or scope-cancels; the dispatcher sends the courtesy notifications/cancelled, the handler is interrupted, and the dispatcher emits RequestSettled. message_router closes _request_streams[N][0], the SSE writer ends, its finally removes the _request_streams entry, and the POST exchange closes — no response frame, nothing written to the event store. (3) On the client, _handle_sse_response sees the stream end without a response, but last_event_id is not None, so the early-out does not apply: it calls _handle_reconnection, which re-GETs /mcp with Last-Event-ID. (4) On the server, _handle_get_request routes the header to _replay_events (src/mcp/server/streamable_http.py:854-917): replay_events_after finds nothing after the priming id, returns the stream id, which is no longer in _request_streams, so replay_sender re-registers it, mints and stores another priming event row, and parks forever in async for event_message in msg_reader — the dispatcher already settled that id, will never write a response for it, and message_router only closes per-request streams on a RequestSettled, which will not recur for that id. (5) The client's transport task likewise parks in the reconnected GET; if that connection ever drops on a normal stream end, _handle_reconnection recurses with attempt=0, so MAX_RECONNECTION_ATTEMPTS does not bound it and it polls indefinitely.

Why nothing prevents it. The client-side early-out is explicitly scoped to the no-event-store case; the RequestSettled marker is intentionally never persisted, so a resuming client has no way to learn the request settled; and _replay_events interprets 'stream id not in _request_streams' as 'resume a live stream' rather than 'this request is gone'. The PR's new cancellation tests (hosting:http:cancel-ends-post-sse-stream, cancel-json-mode-204, and the client-side 204 test) all run without an event store, so the resumability arm is untested.

Step-by-step proof. Server: StreamableHTTPSessionManager(..., event_store=InMemoryEventStore()), one tool that blocks on an anyio.Event. Client: streamable_http_client + Client, call the blocking tool inside a cancel scope and cancel it after the handler starts (the same shape as the PR's new cancel tests, just with an event store). Sequence: priming event id=E1 arrives → client sets last_event_id=E1 → cancel → courtesy notifications/cancelled → dispatcher emits RequestSettled(N)message_router closes the per-POST stream, exchange ends → client sees no response but last_event_id=E1, so it GETs with Last-Event-ID: E1replay_events_after(E1) replays nothing, returns stream id N → not in _request_streams → re-register, store priming event E2, park in msg_reader forever. Inspecting the transport afterwards shows a re-registered _request_streams[N] entry and an open GET that never closes; the client task never returns. Repeat per cancellation/timeout — connections, tasks and event-store rows accumulate until whole-session teardown. Before this PR, step 'dispatcher emits RequestSettled' was instead 'dispatcher writes JSONRPCError(code=0, "Request cancelled")', which message_router stored in the event store and routed to the stream, terminating both sides cleanly even with resumability.

Impact. Per cancelled or timed-out request on a resumability-enabled server: one permanently open GET SSE connection, a re-registered dead _request_streams entry server-side (released only at session teardown), one permanently parked client transport task, and an extra priming-event row written to the event store on each reconnect. Ordinary client request timeouts trigger the courtesy cancel, so this is hit by routine timeout handling, not just explicit user cancellation, whenever the documented resumability feature is on.

How to fix. Any of: have message_router (or the dispatcher path) also drop/terminate the event-store stream for a settled request id so a resume has something definitive to replay; have _replay_events end the GET (instead of re-registering and parking) when the replayed stream id has no live request stream and no pending request; and/or have the client not reconnect when the only event ever received was the priming event for a request whose waiter has already been retired. Adding a resumability-enabled variant of the new cancellation interaction tests would lock the behaviour in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant