Skip to content

feat(kubernetes): support PVC subPath driver config#2034

Draft
mjamiv wants to merge 2 commits into
NVIDIA:mainfrom
mjamiv:codex/kubernetes-pvc-subpath
Draft

feat(kubernetes): support PVC subPath driver config#2034
mjamiv wants to merge 2 commits into
NVIDIA:mainfrom
mjamiv:codex/kubernetes-pvc-subpath

Conversation

@mjamiv

@mjamiv mjamiv commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Summary

Add Kubernetes driver-config support for mounting existing PVCs into the agent container with optional sub_path values. This gives deployments a narrow driver-owned storage topology for durable per-user PVC data without broad PodSpec overrides.

Related Issue

Fixes #2033

Changes

  • Adds driver_config.kubernetes.volumes[] for PVC-backed pod volumes.
  • Adds driver_config.kubernetes.containers.agent.volume_mounts[] for agent PVC mounts with optional sub_path.
  • Defaults user-supplied PVC volumes and mounts to read-only unless read_only: false is explicit.
  • Validates duplicate names, unknown volume references, protected mount paths, invalid sub_path values, and attempts to weaken read-only PVC volumes.
  • Skips the default /sandbox workspace PVC injection when an explicit Kubernetes driver-config mount targets a path below /sandbox/.
  • Documents the schema and a single-PVC multi-subPath example in the Kubernetes driver README and compute-driver reference docs.

Testing

  • mise run pre-commit passes
    • Not completed: this environment initially lacked mise; after installing it, mise run pre-commit refused to run until the repo config was trusted. I did not run mise trust because it persistently changes local trust settings.
    • Ran direct underlying checks instead:
      • cargo fmt --all -- --check
      • cargo test -p openshell-driver-kubernetes
      • cargo clippy -p openshell-driver-kubernetes --all-targets -- -D warnings
      • npx --yes markdownlint-cli2@0.22.0 crates/openshell-driver-kubernetes/README.md docs/reference/sandbox-compute-drivers.mdx
      • UV_CACHE_DIR=.cache/uv uv run python scripts/update_license_headers.py --check
  • Unit tests added/updated
    • Added positive validation coverage for a read-write PVC with multiple read-write subPath mounts.
  • E2E tests added/updated (if applicable)
    • Not run locally; a k3d or Kubernetes cluster proof with a pre-existing PVC should verify actual subPath mounts and pod recreation behavior.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)
    • Published docs and the Kubernetes driver README are updated; no top-level architecture doc was needed for this driver-local config addition.

Signed-off-by: mjamiv <michael.commack@gmail.com>
@copy-pr-bot

copy-pr-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@austin-shih

Copy link
Copy Markdown

Thanks for this — the caller-owned-PVC-by-claim_name + subPath model is exactly what disposable-pod / durable-per-user-storage deployments need. A few test-scope observations from the perspective of someone planning to run this pattern:

  1. Positive validation coverage. The render test exercises a writable PVC, but it goes through sandbox_to_k8s_spec, which doesn't run validate(). The from_template tests are all rejection cases, so there's no happy-path assertion that a writable config is actually accepted. A test that from_template returns Ok for a read-write PVC with several RW subPath mounts (the real per-user shape: workspace, memory, sessions) would lock that path in.

  2. Missing-PVC behavior is untested. When a referenced claim_name doesn't exist, behavior currently falls to the kubelet's FailedMount. Worth covering (ideally in e2e) so the failure surfaces clearly rather than as a stuck pod.

  3. E2E is where the core claims live. The unit tests prove the pod template renders; they can't prove that a pre-existing PVC actually mounts at the subPaths, that the agent can read/write, that the default /sandbox injection is correctly skipped, and — most importantly — that data persists across pod recreation against the same PVC. The PR notes e2e wasn't run; a k3d-based e2e (pre-create PVC → create sandbox with this driver_config → verify mounts + RW → recreate referencing the same PVC → data persists) would de-risk the durability story.

Happy to help validate — we're deploying this exact pattern and can share e2e findings.

Signed-off-by: mjamiv <michael.commack@gmail.com>
@mjamiv

mjamiv commented Jun 27, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for the test-scope review. I added positive from_template validation coverage for a read-write PVC with multiple read-write sub_path mounts (workspace, memory, sessions) and also normalized the example names in the docs/tests to keep the contribution generic.

I agree the missing-PVC and durability claims need cluster-level coverage: kubelet FailedMount, actual subPath mount behavior, read/write checks, default /sandbox injection being skipped, and data surviving pod recreation all belong in k3d or Kubernetes e2e rather than unit tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(kubernetes): support PVC subPath mounts via driver_config

2 participants