fix(ci): retry prisma generate to survive Windows EPERM engine-DLL race#4155

Closed
d-cs wants to merge 1 commit into main from fix/ci-windows-prisma-generate-eperm

@d-cs d-cs commented Jul 5, 2026

Collaborator

Problem

The e2e / CLI v3 tests (windows - npm) job intermittently dies during dependency setup at the @trigger.dev/database prisma generate step, before any test runs:

@trigger.dev/database:generate: Error: EPERM: operation not permitted, rename
  '...\node_modules\.pnpm\prisma@6.14.0_...\node_modules\prisma\query_engine-windows.dll.node.tmp5744'
  -> '...\node_modules\.pnpm\prisma@6.14.0_...\node_modules\prisma\query_engine-windows.dll.node'
Failed: @trigger.dev/database#generate

Root cause

Confirmed a structural concurrency race, not a random transient. @trigger.dev/database and @internal/run-ops-database pin the same prisma version, so pnpm gives them a single shared package instance in the store. turbo run generate runs their generate scripts concurrently, and both prisma generate processes race to write the shared query_engine-windows.dll.node. On Windows the loser fails with EPERM on the .tmp -> .dll.node rename because the file was just written/held by the winner.

The CI logs show both :generate tasks executing at the same timestamp against the identical shared store path — one finishes, the other EPERMs. In the run that surfaced this, the Windows-pnpm job passed while Windows-npm failed on the same commit, which is the signature of a timing-dependent race.

Fix

Wrap prisma generate in a small cross-platform node script that retries on transient filesystem contention (EPERM/EBUSY/EACCES) with a short backoff. On the retry the engine binary is already present and unlocked, so it succeeds. Non-transient errors (e.g. schema errors) fail fast without retrying.

  • Fixing it in the two package generate scripts covers every invocation path (all pnpm run generate CI jobs plus local dev) without editing many workflow files.
  • On non-Windows the first attempt always succeeds, so it is a zero-cost no-op — no serialization, no slowdown to Linux CI.
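
The retry-versus-fail-fast decision described above can be sketched as a small classifier. This is a minimal illustration, not the wrapper itself: the pattern is copied from the snippet quoted in the review thread on this PR, while `isTransientFailure` and both sample messages are hypothetical names and abbreviated inputs chosen for the example.

```javascript
// Pattern copied from the wrapper snippet quoted in the review thread.
const TRANSIENT =
  /\b(EPERM|EBUSY|EACCES)\b|operation not permitted|resource busy or locked|being used by another process/i;

// Hypothetical helper: true when captured output looks like lock contention.
function isTransientFailure(output) {
  return TRANSIENT.test(output);
}

// Abbreviated form of the CI error shown in the Problem section.
const epermLog =
  "Error: EPERM: operation not permitted, rename '...query_engine-windows.dll.node.tmp5744'";
// Made-up stand-in for a non-transient failure.
const schemaLog = "Error: schema validation failed";

console.log(isTransientFailure(epermLog)); // true, so the wrapper would retry
console.log(isTransientFailure(schemaLog)); // false, so the wrapper would fail fast
```

Matching on output text rather than exit codes means any unrelated failure whose message happens to contain one of these strings would also be retried, which is worth keeping in mind when reading CI logs.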

Validation

  • Verified locally on macOS: the wrapper runs prisma generate successfully (exit 0) and does not retry on hard errors.
  • The added comment on the e2e generate step brings this PR into the CLI-v3 path filter so the Windows job actually runs here and exercises the fix on the real runner.

🤖 Generated with Claude Code

`@trigger.dev/database` and `@internal/run-ops-database` pin the same prisma
version, so pnpm gives them one shared package instance in the store. When
`turbo run generate` runs their generate scripts concurrently, both prisma
processes race to write the query-engine binary into that shared directory and
on Windows the loser fails with `EPERM: operation not permitted, rename` on the
`.tmp -> .dll.node` step, killing CI before any test runs.

Wrap `prisma generate` in a small cross-platform node script that retries on
transient filesystem contention (EPERM/EBUSY/EACCES) with backoff. On
non-Windows the first attempt always succeeds, so this is a zero-cost no-op.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@changeset-bot

changeset-bot Bot commented Jul 5, 2026

⚠️ No Changeset found

Latest commit: 4782427

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@coderabbitai

coderabbitai Bot commented Jul 5, 2026

Contributor

Review Change Stack

Walkthrough

This change adds a new Node.js script, scripts/retry-prisma-generate.mjs, that wraps prisma generate with retry logic to handle transient filesystem contention (e.g., a Windows EPERM race on the query-engine binary). It retries up to 5 attempts with increasing delay when retryable errors are detected in output, exiting on success or after exhausting retries. The generate scripts in internal-packages/database/package.json and internal-packages/run-ops-database/package.json now invoke this wrapper instead of calling prisma generate directly. The E2E workflow file gains explanatory comments describing the shared engine binary behavior and the retry mechanism.
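
As a rough illustration of the "increasing delay" mentioned above: the loop quoted later in this thread computes each wait as `BASE_DELAY_MS * attempt`, which is a linear backoff. The sketch below lists the waits that formula would produce. The value 500 for `BASE_DELAY_MS` is an assumption (the real constant is not shown in this thread), and `backoffSchedule` is a hypothetical helper that exists only for this example.

```javascript
// MAX_ATTEMPTS matches the "up to 5 attempts" in the walkthrough.
const MAX_ATTEMPTS = 5;
// Assumed value; the wrapper's actual base delay is not shown in this thread.
const BASE_DELAY_MS = 500;

// Hypothetical helper: the wait after each failed attempt, in order.
// There is no wait after the final attempt because the loop breaks instead.
function backoffSchedule(maxAttempts, baseDelayMs) {
  const delays = [];
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    delays.push(baseDelayMs * attempt);
  }
  return delays;
}

console.log(backoffSchedule(MAX_ATTEMPTS, BASE_DELAY_MS)); // [ 500, 1000, 1500, 2000 ]
```

Under these assumed numbers the wrapper would spend at most 5 seconds sleeping before giving up.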

Changes

Cohort / File(s) | Summary
Retry script | Added scripts/retry-prisma-generate.mjs implementing retry logic around prisma generate
Package scripts | Updated generate scripts in internal-packages/database/package.json and internal-packages/run-ops-database/package.json to call the retry wrapper
Workflow docs | Added explanatory comments in .github/workflows/e2e.yml near the Prisma client generation step

Sequence Diagram(s)

sequenceDiagram
  participant PackageScript as generate script
  participant Wrapper as retry-prisma-generate.mjs
  participant Prisma as prisma generate

  PackageScript->>Wrapper: node ../../scripts/retry-prisma-generate.mjs
  loop up to 5 attempts
    Wrapper->>Prisma: spawnSync
    Prisma-->>Wrapper: stdout/stderr/status
    alt success
      Wrapper-->>PackageScript: exit(0)
    else transient error
      Wrapper->>Wrapper: sleep with backoff
    else fatal error
      Wrapper-->>PackageScript: exit(non-zero)
    end
  end

Related issues: None specified.

Related PRs: None specified.

Suggested labels: ci, tooling, database

Suggested reviewers: None specified.


🐰 A gremlin locked the engine tight,
Windows sighed with EPERM spite,
So a script now waits and tries again,
Counting attempts, one to ten (well, five, then),
Prisma generates — success at last, delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Description check | ⚠️ Warning | The description explains the change, but it does not follow the required template and is missing several required sections. | Add the template sections: Closes #issue, Checklist, Testing, Changelog, and Screenshots, and fill in any required items.

✅ Passed checks (4 passed)

Check name | Status | Explanation
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Title check | ✅ Passed | The title clearly summarizes the main change: retrying Prisma generate to avoid a Windows EPERM race.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3c6bf90a-9670-450e-9759-a4647daabc10

📥 Commits

Reviewing files that changed from the base of the PR and between d977691 and 4782427.

📒 Files selected for processing (4)
  • .github/workflows/e2e.yml
  • internal-packages/database/package.json
  • internal-packages/run-ops-database/package.json
  • scripts/retry-prisma-generate.mjs
📜 Review details
⏰ Context from checks skipped due to timeout. (27)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (9, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (5, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (12, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (3, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (9, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (6, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (6, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (8, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (10, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (4, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (11, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (2, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (7, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (1, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (2, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (10, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: e2e / 🧪 CLI v3 tests (blacksmith-4vcpu-windows-2025 - npm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: e2e / 🧪 CLI v3 tests (blacksmith-4vcpu-ubuntu-2404 - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (blacksmith-4vcpu-windows-2025 - pnpm)
  • GitHub Check: e2e-webapp / 🧪 E2E Tests: Webapp
🧰 Additional context used
📓 Path-based instructions (1)
**/package.json

📄 CodeRabbit inference engine (AGENTS.md)

If adding zod to any package, pin it to the exact repository-wide version (3.25.76) and never use a different version or a range.

Files:

  • internal-packages/run-ops-database/package.json
  • internal-packages/database/package.json
🔇 Additional comments (4)
scripts/retry-prisma-generate.mjs (1)

1-14: LGTM!

Also applies to: 18-21, 39-46, 53-53

internal-packages/database/package.json (1)

18-18: LGTM!

internal-packages/run-ops-database/package.json (1)

28-28: LGTM!

.github/workflows/e2e.yml (1)

45-46: LGTM!

const TRANSIENT =
  /\b(EPERM|EBUSY|EACCES)\b|operation not permitted|resource busy or locked|being used by another process/i;

const passthroughArgs = process.argv.slice(2);

🔒 Security & Privacy | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Does Node.js require shell:true to spawn .cmd binaries like prisma.cmd on Windows without EINVAL?

💡 Result:

Yes, Node.js now requires { shell: true } to spawn .bat or .cmd files on Windows [1][2]. Following a security update (CVE-2024-27980) implemented in newer versions of Node.js, attempting to use child_process.spawn or child_process.spawnSync to execute .bat or .cmd files (like prisma.cmd) directly without the shell option will result in an EINVAL error [1][3][2]. Because these file types are not directly executable on Windows and require a terminal environment to interpret them, Node.js previously allowed them to be spawned indirectly, but this behavior was restricted to mitigate security vulnerabilities [4][1][2].

When you set { shell: true }, Node.js spawns a shell (using the environment's ComSpec, typically cmd.exe) to handle the execution of the file [4][5][6]. Alternative approaches include using child_process.exec, which internally uses a shell, or manually spawning cmd.exe and passing the target file as an argument [4][5][6]. Keep in mind that using shell: true can introduce security risks if you pass unsanitized user input to the command [4][7].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== scripts/retry-prisma-generate.mjs ==\n'
cat -n scripts/retry-prisma-generate.mjs

printf '\n== surrounding lines in e2e.yml ==\n'
if [ -f .github/workflows/e2e.yml ]; then
  rg -n "node|prisma|retry-prisma-generate|shell: true" .github/workflows/e2e.yml
  cat -n .github/workflows/e2e.yml | sed -n '1,220p'
else
  echo "e2e.yml not found"
fi

printf '\n== find prisma references ==\n'
rg -n "retry-prisma-generate|spawnSync\\(|shell: true|prisma generate|prisma\\.cmd|process\\.argv\\.slice\\(2\\)" -S .

Repository: triggerdotdev/trigger.dev

Length of output: 10369


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
from pathlib import Path

for path in [
    Path("scripts/retry-prisma-generate.mjs"),
    Path(".github/workflows/e2e.yml"),
    Path("package.json"),
]:
    if path.exists():
        print(f"\n== {path} ==")
        text = path.read_text()
        lines = text.splitlines()
        for i, line in enumerate(lines, 1):
            if path.name == "retry-prisma-generate.mjs" or "prisma" in line.lower() or "node" in line.lower() or "shell: true" in line:
                print(f"{i}: {line}")
    else:
        print(f"\n== {path} not found ==")
PY

Repository: triggerdotdev/trigger.dev

Length of output: 3431


🌐 Web query:

Node.js child_process.spawnSync shell:true args array are arguments escaped or concatenated; DEP0190 behavior

💡 Result:

When using child_process.spawnSync (or other child_process methods like spawn and execFile) with shell: true, the provided args array is concatenated with spaces rather than being escaped [1][2][3]. The behavior is as follows:

  1. Concatenation: Node.js joins the elements of the args array into a single string, separated by spaces [1][2].
  2. Lack of Escaping: The individual arguments are not quoted or escaped [2][3].

This lack of automatic escaping creates a security vulnerability known as shell injection, as a malicious argument could contain characters that alter the command's execution in the shell [2][3].

DEP0190: DEP0190 is a runtime DeprecationWarning introduced in recent Node.js versions (e.g., v23.11.0, v22.15.0) [4] specifically to warn developers about the security risks associated with passing an args array when the shell option is enabled [2][3]. Because the arguments are concatenated without escaping, this pattern provides a false impression of safety that does not exist [3].

To address the deprecation, the recommended approach is to stop passing an args array when using shell: true [3]. Instead, developers should construct the full command string themselves, ensuring that all arguments are properly escaped according to the requirements of the target shell [3]. If necessary, the warning can be temporarily suppressed using the --disable-warning=DEP0190 Node.js flag [3].

Citations:


🌐 Web query:

Node.js spawnSync .cmd Windows shell:true required EINVAL docs child_process

💡 Result:

If you encounter an EINVAL error when using child_process.spawn or child_process.spawnSync on Windows to execute .bat or .cmd files, it is due to a security update (CVE-2024-27980) introduced in April 2024 [1][2]. This update mandates that spawning batch or command files now requires the shell: true option to be explicitly set [1][3]. To resolve this issue, include { shell: true } in the options argument of your spawn or spawnSync function call [1][3].

Example:

  const { spawnSync } = require('child_process');
  spawnSync('npm.cmd', ['--version'], { shell: true, stdio: 'inherit' });

Background: Prior to this change, Node.js allowed these files to be spawned directly. However, the security patch addresses vulnerabilities related to how these files were executed on Windows [1][2]. Because tools like npm, pnpm, and yarn often resolve to .cmd shims on Windows, this breaking change frequently impacts scripts or build tools that invoke these package managers [2][3]. It is strongly advised to use the shell: true option rather than reverting the security fix using --security-revert=CVE-2024-27980, as the latter leaves your application vulnerable [3].

Citations:


🌐 Web query:

site:nodejs.org child_process shell true args concatenated not escaped DEP0190

💡 Result:

DEP0190 refers to the deprecation of passing arguments (the args array) to Node.js child_process functions (such as spawn or execFile) when the shell option is set to true [1][2]. When shell is enabled (shell: true), Node.js runs the command inside a shell [3]. Historically, if an args array was provided alongside this option, the arguments were concatenated without proper escaping or quoting before being passed to the shell [1][3]. This behavior creates a significant security vulnerability: because the arguments are not escaped, any unsanitized user input containing shell metacharacters can be manipulated to trigger arbitrary command execution (shell injection) [3][4]. Due to these security risks, Node.js has deprecated this usage [1][3]. Developers are advised to avoid passing the args array when shell: true [3][4]. If shell execution is necessary, the command and its arguments should be constructed carefully as a single string (if using exec) or, preferably, the shell option should be avoided entirely in favor of spawning commands directly without a shell, which is more secure [3][4]. If shell: true is absolutely required, you must ensure that no unsanitized user input is included in the command string [3].

Citations:


Scope shell: true to Windows only. Node concatenates the args array when shell: true is set, so passthroughArgs is shell-parsed instead of passed through as argv. If the shell is only needed for .cmd resolution on Windows, gate it on process.platform === "win32".
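
One way the gating suggested above might look is sketched below. `buildSpawnOptions` is a hypothetical helper, not code from the PR; it takes the platform as a parameter only so the behaviour can be checked on any OS.

```javascript
// Hypothetical helper: spawn options with the shell enabled on Windows only.
function buildSpawnOptions(platform = process.platform) {
  return {
    // Windows needs a shell so that "prisma" resolves to the prisma.cmd shim.
    // Elsewhere the args array is then passed through as plain argv.
    shell: platform === "win32",
    encoding: "utf8",
  };
}

console.log(buildSpawnOptions("win32").shell); // true
console.log(buildSpawnOptions("linux").shell); // false

// Intended call site in the wrapper (not executed here):
//   spawnSync("prisma", ["generate", ...passthroughArgs], buildSpawnOptions());
```

Note that on Windows the arguments would still be joined into a shell command line, so this narrows the exposure rather than removing it.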

Comment on lines +22 to +51
let lastStatus = 1;

for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
  const result = spawnSync("prisma", ["generate", ...passthroughArgs], {
    shell: true,
    encoding: "utf8",
  });

  process.stdout.write(result.stdout ?? "");
  process.stderr.write(result.stderr ?? "");

  if (result.status === 0) {
    process.exit(0);
  }

  lastStatus = result.status ?? 1;

  const output = `${result.stdout ?? ""}${result.stderr ?? ""}`;
  const isRetryable = TRANSIENT.test(output);

  if (!isRetryable || attempt === MAX_ATTEMPTS) {
    break;
  }

  const delay = BASE_DELAY_MS * attempt;
  console.error(
    `prisma generate hit a transient filesystem error (attempt ${attempt}/${MAX_ATTEMPTS}); retrying in ${delay}ms...`
  );
  sleepSync(delay);
}

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Spawn failures are silently swallowed — surface result.error.

If spawnSync fails to launch the process at all (e.g. prisma not resolvable on PATH, ENOENT), result.status is null, result.stdout/result.stderr are null, and the TRANSIENT regex never matches empty output. The loop then breaks after the very first attempt and exits with lastStatus (falls back to 1) but never prints result.error.message, leaving no diagnostic in CI logs for what is otherwise a straightforward, fully-diagnosable failure.

🐛 Proposed fix to surface spawn errors
   process.stdout.write(result.stdout ?? "");
   process.stderr.write(result.stderr ?? "");
 
+  if (result.error) {
+    console.error(`prisma generate failed to start: ${result.error.message}`);
+  }
+
   if (result.status === 0) {
     process.exit(0);
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let lastStatus = 1;

for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
  const result = spawnSync("prisma", ["generate", ...passthroughArgs], {
    shell: true,
    encoding: "utf8",
  });

  process.stdout.write(result.stdout ?? "");
  process.stderr.write(result.stderr ?? "");

  if (result.error) {
    console.error(`prisma generate failed to start: ${result.error.message}`);
  }

  if (result.status === 0) {
    process.exit(0);
  }

  lastStatus = result.status ?? 1;

  const output = `${result.stdout ?? ""}${result.stderr ?? ""}`;
  const isRetryable = TRANSIENT.test(output);

  if (!isRetryable || attempt === MAX_ATTEMPTS) {
    break;
  }

  const delay = BASE_DELAY_MS * attempt;
  console.error(
    `prisma generate hit a transient filesystem error (attempt ${attempt}/${MAX_ATTEMPTS}); retrying in ${delay}ms...`
  );
  sleepSync(delay);
}
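
The loop calls `sleepSync`, whose definition is not shown anywhere in this thread. For readers following along, one common way to block synchronously in Node.js without spinning the CPU is `Atomics.wait` on a private shared buffer. The sketch below is an assumed implementation for illustration, not necessarily what the PR's script does.

```javascript
// Assumed implementation; the real sleepSync in the PR is not shown here.
function sleepSync(ms) {
  // Atomics.wait blocks the calling thread until it is notified or the
  // timeout elapses. Nothing else holds this buffer, so it always times out.
  const cell = new Int32Array(new SharedArrayBuffer(4));
  Atomics.wait(cell, 0, 0, ms);
}

const started = Date.now();
sleepSync(50);
console.log(`slept for about ${Date.now() - started}ms`);
```

A synchronous sleep fits here because the wrapper already uses `spawnSync` and has nothing else to do between attempts.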

d-cs force-pushed the fix/ci-windows-prisma-generate-eperm branch from 42457f1 to 4782427 on July 5, 2026 00:40

d-cs commented Jul 5, 2026

Collaborator Author

Folded into #4154 (cherry-picked ba796ecfd) — keeping the Windows CI hardening in the run-ops PR. Closing this standalone PR.

@d-cs d-cs closed this Jul 5, 2026