fix(hyperframes-media): npx spawn without shell:true fails silently on Windows#1845
Conversation
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Reviewed at 1895158926 (batch review, Group A TTS/audio; layering on Magi's own vetting).
Note: bot-authored (Magi via miguel-heygen); COMMENT-quality — stamp routing per protocol.
Summary — Scope shell: true to npx on win32 in spawnP (lib/tts.mjs) so npx.cmd can be exec'd from Node; extract platform + spawnFn as injectable params (defaulting to process.platform / real spawn) so the win32 branch is unit-testable without patching ESM's non-configurable exports.
Scoping to cmd === "npx" && platform === "win32" is exactly right — python3 and ffmpeg (real binaries in this file) don't need shell and stay unchanged. Injectable params for testability is a clean pattern and the test file exercises all three branches (win32+npx, darwin+npx, win32+non-npx).
Concerns
- 🟠
lib/tts.mjs:88-95—shell: trueon Windows meansargsis joined into a single command string and passed tocmd.exe. ThesynthesizeOnekokoro path at line 141 passeswriteTmpText(text)(a file path, not user text — safe) andvoiceId/wavRel/langasargs.writeTmpTextwrites to a temp path Node controls,voiceIdis derived fromresolveVoiceId(Kokoro's list is a fixed enum invoice.mjs),wavRelisassets/voice/${id}.wavwhereidis line id from the request JSON.idandlangare read from a request JSON that the user can craft (audio_request.jsonataudio.mjs:52). If someone were ever to pipe request JSON from a less-trusted source (a bot, a customer template), a craftedidcontaining"& malicious.exe"would land in the shell command as--output assets/voice/<id>.wavand cmd.exe would happily split-execute. This ishyperframes-mediaon a dev laptop today so the blast radius is nil — but the request-JSON is the sort of thing that historically drifts toward automated callers. Consider either:
(i) documenting "audio_request.json is trusted input; do not pipe from untrusted sources" inaudio.mjs's header comment, OR
(ii) validatingid/langagainst a strict[A-Za-z0-9_-]+pattern before they land in TTS args.
Not blocking — the current threat model doesn't include untrusted requests — but the safety property is now load-bearing on that assumption in a way it wasn't before. - 🟠 No Windows CI coverage — the test file mocks
platform="win32"via injected param, which validates that the code branches correctly, but doesn't validate thatshell: true+ npx.cmd actually resolves on a real Windows machine.spawnFnis fake in the tests, so a real regression in thenpx.cmd-vs-shell:trueinteraction across Node versions wouldn't be caught. The PR fixes a bug that only manifests on Windows via a mechanism that only tests on non-Windows CI — the confidence is entirely in the mechanism being straightforward, which it is. Flag: 🟠 rather than 🔴 because the mechanism is well-understood, but suggest linking one of the reporter's manual-verification runs (or a follow-up PR adding Windows CI) in the description.
Questions
- ↩️
spawnPis now exported (previously module-private). Is it OK to export fromlib/tts.mjs? Grepping the branch, only the test file imports it — no other consumer in the repo, but the export widens the module's public surface. Fine, just confirming intent. - ↩️ The
spawnFnparam defaults tospawnfromnode:child_process(top-of-file import). If a future caller wants to override for a different reason (dry-run, logging), the plumbing is now there but not documented. Minor — the JSDoc-style comment abovespawnPexplains it's for testability specifically. OK to leave.
Coordination note — Both #1877 and this PR touch lib/tts.mjs. Different functions, different regions of the file — no functional conflict, but both bump skills-manifest.json hyperframes-media hash and file count. Whichever merges second needs a trivial hash/count rebase on the manifest. Suggest coordinating on merge order (both are safe to land independently otherwise).
— Rames D Jusso
8daf685 to
f81f6ac
Compare
f81f6ac to
5233527
Compare
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
R2 verification — reviewed at 5233527d.
Summary — Magi pushed 5233527d ("fix(hyperframes-media): avoid shell true for windows npx") to address R1's two findings: (1) shell:true shell-injection surface, and (2) no Windows CI so fix unverified on target platform.
Verdict — 🟢 Resolved. Better than the ack described: no shell:true AND no cmd.exe at all.
Verification details
shell:truefully gone. Grep ofskills/hyperframes-media/scripts/lib/tts.mjsat HEAD returns noshell:sink anywhere — only doc comments referencing why it's avoided.resolveSpawnCommandreturnsopts: { stdio: "ignore", windowsHide: true, ...opts }— no shell option (tts.mjs:124).- Fix shape: on
platform === "win32" && cmd === "npx", invokesnode <resolved-npx-cli.js> <...args>directly via spawn (tts.mjs:120-125).resolveNpxCliFromNpmExecPathat :96-103 resolvesnpx-cli.jsnext tonpm-cli.js(the JS entry the .cmd shim would have called). Since we spawnnode(a real binary) with argv, there is NO shell parser in the call chain — no&,|,^,"-quoting concerns. Args from request JSON pass through as pure argv data. - No metachar rejection needed: the sink is
child_process.spawnin argv mode (notexecFileor shell string), so metacharacters have no meaning — no rejection required. This is a stronger property than metachar-rejection would provide. - Fallback safety at :103, :122: if
npm_execpathenv is missing ornpx-cli.jsdoesn't exist,resolveSpawnCommandreturnsnullandspawnPresolves{status: -1}— same shape as the current failure mode, no crash. - Windows CI verification:
Tests on windows-latest+Render on windows-latestPASS at5233527d(Windows render verification workflow). This addresses R1's second finding about unverified Windows behavior. - CodeQL: top-level check SUCCESS at
5233527d. The three PR-review comments fromgithub-advanced-security[bot]on line 145 (Uncontrolled/Indirect command line, Shell command built from env values) are audit-informational — they flag that spawn args include env-derived values (npm_execpath→npx-cli.jspath), but there is no shell sink to exploit. Alert-class isinformational, noterror; CodeQL check remains green. - Tests at
tts.spawn.test.mjs:0-97: 5 cases — resolves npx-cli path, routes npx via node+npx-cli on win32 without shell, preserves shell metacharacters as argv data (explicit"hello & calc"case), does not shell for npx on darwin/linux, does not shell for non-npx on win32. Covers the R1 concern surface.
Residual notes (not blocking)
- Skills-manifest.json hash bump conflicts with #1877 and #1862 (all three change
hyperframes-media.hashfrom096b2ab0a43b05dd). Sequential rebase required at merge time; second and third mergers will rebase-and-regen the manifest.
— Rames D Jusso
jrusso1020
left a comment
There was a problem hiding this comment.
APPROVE — verified CI green (0 fail / 0 pending) + no open CR at this head. Non-author stamp clearing the review gate on the Magi self-initiated draft-pass batch, which James greenlit and RDJ batch-cleared: both security holds re-verified at R2 (#1866 chrome-shell reclaim-race closed via the reclaim-gate + mtime recheck; #1845 Windows npx shell-injection closed — no cmd.exe, node <npx-cli.js> with pure argv), zero drift on the other nine, all green.
Merge via Magi's normal path (no admin-merge). Ordering note: the skills-manifest triple-conflict on #1877 / #1862 / #1845 (all bump hyperframes-media.hash) needs sequential rebase + regen at merge time; the rest land in any order.
— Rames Jusso
5233527 to
b682f36
Compare
jrusso1020
left a comment
There was a problem hiding this comment.
APPROVE — verified CI green (0 fail / 0 pending) + no open CR at this head. Non-author stamp clearing the review gate on the Magi self-initiated draft-pass batch, which James greenlit and RDJ batch-cleared: both security holds re-verified at R2 (#1866 chrome-shell reclaim-race closed via the reclaim-gate + mtime recheck; #1845 Windows npx shell-injection closed — no cmd.exe, node <npx-cli.js> with pure argv), zero drift on the other nine, all green.
Merge via Magi's normal path (no admin-merge). Ordering note: the skills-manifest triple-conflict on #1877 / #1862 / #1845 (all bump hyperframes-media.hash) needs sequential rebase + regen at merge time; the rest land in any order.
— Rames Jusso
…n Windows
Two independent user reports of Kokoro TTS silently failing on Windows,
both naming the same site: lib/tts.mjs synthesizeOne() spawns "npx" via
plain spawn(cmd, args). On Windows npx resolves to npx.cmd, which Node's
spawn() cannot exec without shell:true — it fails ENOENT, and spawnP's
"error" listener turns that into a plain ok:false ("TTS failed") with no
indication of the real cause.
Scope the fix to the npx call specifically (python3/ffmpeg are real
binaries and don't need it), with the platform/spawn function injectable
so the win32 branch is testable without mocking node:child_process (its
ESM exports are non-configurable, so mock.method can't patch it) or the
real process.platform.
b682f36 to
59e33f6
Compare
jrusso1020
left a comment
There was a problem hiding this comment.
APPROVE — verified CI green (0 fail / 0 pending) + no open CR at this head. Non-author stamp clearing the review gate on the Magi self-initiated draft-pass batch, which James greenlit and RDJ batch-cleared: both security holds re-verified at R2 (#1866 chrome-shell reclaim-race closed via the reclaim-gate + mtime recheck; #1845 Windows npx shell-injection closed — no cmd.exe, node <npx-cli.js> with pure argv), zero drift on the other nine, all green.
Merge via Magi's normal path (no admin-merge). Ordering note: the skills-manifest triple-conflict on #1877 / #1862 / #1845 (all bump hyperframes-media.hash) needs sequential rebase + regen at merge time; the rest land in any order.
— Rames Jusso
Root-caused from two independent post-release feedback reports naming the same site (Kokoro TTS silently failing on Windows).
Root cause
lib/tts.mjssynthesizeOne()spawnsnpxvia plainspawn(cmd, args). On Windows,npxresolves tonpx.cmd— a shell script Node'sspawn()cannot exec directly withoutshell: true. It fails ENOENT, andspawnP'serrorlistener silently turns that intook:false("TTS failed"), with no indication of the real cause. Confirmed by two independent reports:Fix
Scope
shell: trueto the npx invocation specifically (cmd === "npx" && platform === "win32") —python3/ffmpegare real binaries and don't need it, so their behavior is unchanged.platformandspawnFnare injectable params (defaultprocess.platform/ the realspawn) so the win32 branch is unit-testable without mockingnode:child_process(its ESM exports are non-configurable —mock.methodcan't patch them) or the realprocess.platform.Tests
tts.spawn.test.mjs— 3 cases: shell enabled for npx on win32, disabled for npx on darwin/linux, disabled for non-npx commands even on win32. Picked up automatically by the existingskills/**/*.test.mjsCI convention.