[do not merge] Add mcp-codemod, an automated v1 to v2 migration tool#3011

Draft
maxisbey wants to merge 1 commit into main from mcp-codemod

Conversation

@maxisbey

Contributor

Important

DO NOT MERGE. Opening as a draft for design review. The codemod is self-contained
(a new workspace package; nothing in mcp depends on it), but the scope, the mapping
tables, and the publishing prerequisite below deserve eyes before any of it is real.

Adds mcp-codemod, a libCST-based tool that automates the mechanical half of the v1 to v2 migration:

uvx mcp-codemod v1-to-v2 ./src
grep -rn '# mcp-codemod:' ./src   # everything left for a human

It rewrites every change whose meaning is unambiguous from the file alone, and inserts a
# mcp-codemod: comment above every site it recognized but would not guess at, so the
remaining work is one grep. Re-running on its own output is a no-op, and --dry-run
(optionally with --diff) previews a run without writing anything.
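
To make the marker and no-op properties concrete, here is a minimal plain-text sketch of idempotent marker insertion. It is not the tool's implementation: the real codemod works on a libCST tree and resolves names through imports, whereas this stand-in finds sites by substring purely to keep the example short. The function name `mark_sites` and the note wording are hypothetical.

```python
from __future__ import annotations

MARKER = "# mcp-codemod:"


def mark_sites(source: str, needle: str, note: str) -> str:
    """Insert a marker comment above every line containing `needle`.

    Plain-text stand-in for illustration only; the real tool matches
    resolved names on a syntax tree, not substrings.
    """
    out: list[str] = []
    for line in source.splitlines():
        is_marker = line.lstrip().startswith(MARKER)
        if needle in line and not is_marker:
            # Reuse the site's indentation so the comment sits directly above it.
            indent = line[: len(line) - len(line.lstrip())]
            marker_line = f"{indent}{MARKER} {note}"
            # Skip the insert when an earlier run already left this marker.
            if not out or out[-1] != marker_line:
                out.append(marker_line)
        out.append(line)
    trailing = "\n" if source.endswith("\n") else ""
    return "\n".join(out) + trailing


v1_source = "import x\n\n@server.call_tool()\nasync def handle(): ...\n"
note = "lowlevel decorator, migrate by hand (hypothetical wording)"
once = mark_sites(v1_source, "@server.call_tool()", note)
twice = mark_sites(once, "@server.call_tool()", note)
```

Here `once` and `twice` are identical, which is the property the re-run guarantee relies on: the check for an existing marker directly above the site is what keeps a second pass from stacking comments.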

Motivation and Context

docs/migration.md is ~1,600 lines, and most of what it asks for is tedium rather than
judgment: import moves, symbol renames, and the camelCase to snake_case field renames that
touch nearly every file. More prose cannot reduce tedium. A codemod removes exactly that
half, so a reader's (or an agent's) attention goes to the changes that actually need it.
The TypeScript SDK already ships @modelcontextprotocol/codemod as step 1 of its upgrade
guide; this is the Python counterpart.

What it rewrites (each gated on resolving a name through the file's imports, never on
matching text, so an aliased import or an unrelated symbol with the same name is never
touched):

  • Import paths that moved: mcp.server.fastmcp -> mcp.server.mcpserver,
    mcp.types -> mcp_types (including the from mcp import types form, which needs a
    whole-statement rewrite), mcp.shared.version -> mcp_types.version.
  • Renamed symbols: FastMCP -> MCPServer, McpError -> MCPError,
    FastMCPError -> MCPServerError, streamablehttp_client -> streamable_http_client,
    and the removed Content / ResourceReference aliases.
  • McpError(ErrorData(code=..., message=...)) to the flat MCPError(...) constructor, and
    e.error.code to e.code inside an except McpError as e: block (only the full
    three-part chain -- a bare e.error may be a whole ErrorData and is never collapsed).
  • camelCase attribute reads on mcp.types models, restricted to the 40 field names v1
    actually declared. This matters: in real v1 code most camelCase attribute accesses are
    logging.getLogger, .basicConfig, or the user's own attributes, so anything broader
    than an allowlist is unusable.
  • The streamable_http_client(...) as (read, write, _) three-tuple to the v2 two-tuple.
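
The gating rule above (resolve through the file's imports, never match text) can be sketched with the standard-library `ast` module. This swaps `ast` in for libCST to keep the example dependency-free, ignores scoping and later rebinding, and uses a one-entry rename table built from the FastMCP mapping listed above. The helper names and the `vendor.speedy` package are hypothetical.

```python
from __future__ import annotations

import ast

# One entry from the mapping described above; the real tables are larger.
RENAMES = {"mcp.server.fastmcp.FastMCP": "mcp.server.mcpserver.MCPServer"}


def import_bindings(tree: ast.Module) -> dict[str, str]:
    """Map each name bound by an absolute `from ... import` to its dotted origin."""
    bindings: dict[str, str] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            for alias in node.names:
                local = alias.asname or alias.name
                bindings[local] = f"{node.module}.{alias.name}"
    return bindings


def renamable_uses(source: str) -> list[tuple[int, str, str]]:
    """Return (line, local name, new dotted path) for uses that resolve to a renamed symbol."""
    tree = ast.parse(source)
    bindings = import_bindings(tree)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            target = RENAMES.get(bindings.get(node.id, ""))
            if target:
                hits.append((node.lineno, node.id, target))
    return sorted(hits)


sdk_file = 'from mcp.server.fastmcp import FastMCP\napp = FastMCP("demo")\n'
local_class = "class FastMCP:\n    pass\napp = FastMCP()\n"
other_pkg = "from vendor.speedy import Client as FastMCP\napp = FastMCP()\n"
```

Only `sdk_file` produces a hit. The other two files spell `FastMCP` as well, but the name does not resolve to the SDK symbol, so a resolver-based tool has nothing to rewrite there, while a text match would have touched all three.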

What it deliberately only marks (the design rule is: never guess at a change that
depends on information not in the file):

  • The ~20 mcp.types names with no v2 home (Cursor, the TASK_* constants, the
    v1 type-machinery aliases). mcp_types is not a name-superset of v1's mcp.types,
    so each of these is marked with its replacement at the import and at every use --
    the alternative is silently rewriting them into an import that cannot resolve. A
    test pins, against the installed package, that every public name v1's mcp.types
    defined either exists on mcp_types or is explicitly accounted for, so this list
    cannot rot as v2 evolves.
  • streamablehttp_client(...) used anywhere other than directly as a with item
    (enter_async_context(...) is the common form): its result shape changed and only
    the inline as (read, write, _) form can be rewritten rather than flagged.
  • A positional constructor argument after the server's name (v1's second positional was
    instructions; v2's is title, so renaming the call alone would silently send the
    instructions as the title), and a camelCase name that one of the file's own classes
    declares (renaming its uses would break that class, whose declaration does not change).
  • The lowlevel @server.call_tool() decorators. They are syntactically identical to the
    high-level @mcp.tool() ones -- only what the receiver is bound to distinguishes them --
    and migrating one also means reordering statements and rewriting the handler signature.
    bump-pydantic was eventually archived as "incomplete" largely because it attempted its
    equivalent of this and got it wrong often enough to lose trust; the marker names the
    exact on_*= keyword instead.
  • Transport keywords on the MCPServer constructor (host=, stateless_http=, ...). The
    right destination (run() / sse_app() / streamable_http_app()) depends on how the
    server is started and may be in another file, so the kwarg is left in place -- v2 then
    fails loudly -- rather than deleted, which would silently lose configuration.
  • Every removed API with no drop-in replacement.
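
The positional-argument hazard from the list above is easy to see with two stand-in functions. These are not the SDK constructors: the only detail taken from this PR is that the second positional parameter is instructions in v1 and title in v2. Everything else, including the `positional_after_name` helper, is hypothetical.

```python
import ast


def v1_server(name, instructions=None):
    # Stand-in for the v1 constructor: the second positional is instructions.
    return {"name": name, "title": None, "instructions": instructions}


def v2_server(name, title=None, instructions=None):
    # Stand-in for the v2 constructor: the second positional is title.
    return {"name": name, "title": title, "instructions": instructions}


def positional_after_name(call_source: str) -> bool:
    """True when a call expression passes more than one positional argument."""
    call = ast.parse(call_source, mode="eval").body
    return isinstance(call, ast.Call) and len(call.args) > 1


# The same call text binds differently once only the callee is renamed.
before = v1_server("demo", "Answer in French")
after = v2_server("demo", "Answer in French")
```

In `after` the instructions string has landed in `title` and `instructions` is `None`, with no error raised. A check like `positional_after_name` is the kind of purely syntactic test that can flag such a call for a human rather than rename it.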

How Has This Been Tested?

  • 130 test functions (138 cases), 100% branch coverage on the new package
    (./scripts/test is green for the whole tree), strict pyright, ruff.
  • The failure modes that matter most for a tool like this are rewriting code it should
    not touch and making code worse, so that is what most of the suite pins, with exact
    reproductions: a file that never imports the SDK is never modified even when it spells
    tempting names (a local variable named mcp, getattr(row, "createdAt") on an ORM
    row, self.get_context() in a Django view); nothing is ever rewritten into a silent
    NameError (import mcp plus mcp.types.X is marked, not half-rewritten); nothing
    that works on v2 is broken (e.error.message = ... is a write to a still-mutable
    field and is left alone; only reads collapse to the new properties) or wrongly marked
    (ctx.request_context is a live v2 idiom, so that name is deliberately NOT matched);
    and a re-run of the codemod over its own output is byte-for-byte identical, including
    for a marker on the first statement of a module, which libCST parses back into the
    module header rather than the statement.
  • The two tables that name-match without resolving an import are each pinned against the
    installed v2 surface by construction: a "removed attribute" name may not be spelled by
    any living public v2 API (the request_context lesson, now a test), and every public
    name v1's mcp.types defined must exist on mcp_types or be explicitly accounted for.
  • The mapping tables are pinned against the installed v2 package by ratchet tests, so
    they cannot silently drift as v2 evolves: every rename target is exec'd and must
    resolve, every removed API must be provably absent, and no flagged constructor keyword
    may survive on MCPServer.__init__. That last one is not theoretical -- it exists
    because debug, log_level, and dependencies all looked removed at one point and are
    actually still accepted, and a marker on a keyword that works is a lie the user cannot
    reconcile.
  • Measured against ground truth: the 76 example files that exist on both v1.x and main
    were migrated by hand, so their diff is the correct migration. Running the codemod on
    the v1 side and diffing against the human result (all sides normalized through the
    repo's own ruff config): of the 51 files with a real migration diff, 13 are reproduced
    exactly, 35 partially, and 0 are made worse. The lowlevel examples get ~0% help by
    design. Reproduce with scratch/codemod-spike/eval_production.py on this branch.
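
As a rough illustration of the ratchet idea described above, the sketch below checks a table against whatever is actually installed. It is an assumption-laden stand-in: it uses `importlib` instead of exec'ing the target, a standard-library module in place of the v2 package, and a hypothetical `StandInServer` class in place of `MCPServer`.

```python
from __future__ import annotations

import importlib
import inspect


def resolves(dotted: str) -> bool:
    """True when `package.module.attr` can be imported and the attribute exists."""
    module_name, _, attr = dotted.rpartition(".")
    try:
        module = importlib.import_module(module_name)
    except (ImportError, ValueError):
        return False
    return hasattr(module, attr)


def surviving_keywords(cls: type, flagged: set[str]) -> set[str]:
    """Flagged keywords that the constructor still accepts; must be empty."""
    accepted = set(inspect.signature(cls.__init__).parameters)
    return flagged & accepted


class StandInServer:
    # Hypothetical constructor: `debug` is still accepted, `host` is not.
    def __init__(self, name, title=None, debug=False):
        self.name, self.title, self.debug = name, title, debug


# A rename target must resolve; a "removed" name must be provably absent.
rename_ok = resolves("json.dumps")
removed_ok = not resolves("json.no_such_name")

# A correct table passes; a table that wrongly flags `debug` is caught.
good_table = surviving_keywords(StandInServer, {"host", "stateless_http"})
bad_table = surviving_keywords(StandInServer, {"host", "debug"})
```

`good_table` is empty and `bad_table` is `{"debug"}`, which mirrors the case above: a keyword that looked removed but is still accepted fails the check instead of shipping as a misleading marker.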

Breaking Changes

None. The package is additive; mcp does not depend on it.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Needs an owner before this merges: the PR adds mcp-codemod to the publish
workflow's build step, so the PyPI project must be registered and a trusted publisher
configured for it (the same dance mcp-types needed) before this lands. The
ordering matters: if a release is tagged first, the upload job dies at the unregistered
mcp-codemod wheel after mcp is uploaded and before mcp-types, and mcp's
exact pin on mcp-types makes that half-published release uninstallable until the job
is re-run. (Without the workflow change the documented uvx mcp-codemod fails for every
reader instead, so the docs and the publishing have to land together either way.)

Deliberately not in this PR (each is a clean follow-up, and none should gate the
design review):

  • Flagging an import of a removed module path (the whole mcp.{client,server,shared}.experimental
    Tasks namespace, mcp.client.websocket, mcp.shared.progress). Today a removed
    symbol is marked wherever it appears, but import mcp.shared.progress alone is not.
  • Restructuring docs/migration.md codemod-first into the two-journey split the
    TypeScript guide uses, with the mapping tables linked as the source of truth.
  • A batch-test harness that runs the codemod over real downstream repositories and diffs
    their type-check results, like the TypeScript codemod's batch-test.

AI Disclaimer

A new `mcp-codemod` workspace package (`uvx mcp-codemod v1-to-v2 ./src`)
that rewrites every v1 -> v2 change whose meaning is unambiguous from the
file alone, and inserts a `# mcp-codemod:` comment above every site it
recognized but would not guess at. Built on libCST.

Names are resolved through each file's imports, never matched as text, so
an aliased import or an unrelated symbol that shares a name with an SDK
one is never touched. The camelCase to snake_case rename is restricted to
the field names v1's `mcp.types` actually declared. Anything whose correct
rewrite depends on information that is not in the file -- the lowlevel
decorator to `on_*` relocation, the transport keywords on the `MCPServer`
constructor -- is left exactly as written and marked instead, so the
remaining work is one grep. Re-running on the output is a no-op.

The mapping tables are pinned against the installed v2 package by ratchet
tests so they cannot silently drift: every rename target must resolve,
every removed API must be provably absent, and no flagged constructor
keyword may survive on `MCPServer.__init__`. Measured against the example
files that exist on both `v1.x` and `main` (whose diff is the hand-written
migration), the codemod fully reproduces 13 of the 51 with a real
migration diff, improves 35 more, and makes none worse.

Also adds an "Automated migration" section to docs/migration.md, a mention
of the tool in README.v2.md, and the package to the publish workflow's
build step (the PyPI project and its trusted publisher must exist before a
release is tagged with this in it).
