
Overhaul missing_docs skill: fix audit blind spots, add surface change detection#201

Open: hongyi-chen wants to merge 17 commits into main from oz/missing-docs-skill-overhaul


@hongyi-chen hongyi-chen commented Jun 11, 2026

Collaborator

Why

The recurring docs-coverage agent was missing shipped features. Root causes (all reproduced):

  1. Silent no-op runs: repo auto-detection looked for warp-internal/warp-server under src/content/ (wrong since the Astro migration). Without explicit flags, every code audit was skipped with only a stderr warning, and the report said Total gaps found: 0 with exit 0 — a scheduled agent reads that as success.
  2. Wrong GA detection: the script snake_cased flag names and matched them against app/Cargo.toml, missing the real cargo-feature→flag bridge in app/src/features.rs (e.g. am_workflows → AgentModeWorkflows) and ignoring RELEASE_FLAGS/PREVIEW_FLAGS entirely.
  3. No change detection: flags deleted after stabilizing vanished from the audit universe silently; nothing tracked changes between runs.
  4. Blind spots: API audit parsed only router/router.go (0 findings = false comfort), CLI subcommands were never parsed, and slash commands, settings, keybindings, the Oz web app, server tools, bundled skills, sidebar structure, and the docs changelog were all unaudited.
  5. Map drift: the surface map was stale (handoff/orchestration/queueing/agents pages existed unmapped; 45+ dead entries; ~20 ignore entries filed as "Non-GA" had GA'd) and SKILL.md never told the drafting phase to update it.
  6. Noise: 73 staleness findings dominated by oz agent matching legitimate CLI examples.

What changed

scripts/audit_docs.py (rewritten: 9 coverage audits + change detection + completeness accounting)

  • Fails loud: exit 2 + audits_skipped when a repo is missing, an extraction sanity guard trips (implausibly low parse counts = source layout changed), or integrity:accounting fires (see below).
  • GA detection via the features.rs bridge + RELEASE_FLAGS; brace-safe enum parsing.
  • CLI: recursive subcommand tree with hide = true skipped; per-module --flag tracking.
  • API: nested gin groups with positional RouterGroup resolution; param-name-insensitive OpenAPI matching; findings point to sync-openapi-spec.
  • Slash commands, Settings (~200-setting define_setting! registry vs all-settings reference, flag-status aware, object-typed settings handled), stale doc references (reverse checks: renamed settings like agents.oz.* → agents.warp_agent.*, dead keybinding actions), docs structure (pages missing from src/sidebar.ts), staleness, and map hygiene.
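
As an illustration of the param-name-insensitive OpenAPI matching mentioned above, here is a minimal sketch, not the code in audit_docs.py. It assumes routes are available as (method, path) pairs; the function names and the example routes are hypothetical. Gin writes path parameters as `:name` or `*name`, while OpenAPI writes them as `{name}`, so both are collapsed to a positional placeholder before comparison.

```python
import re

# A single path segment that is a parameter: gin ":name" / "*name" or OpenAPI "{name}".
_PARAM = re.compile(r"^(?::\w+|\*\w+|\{\w+\})$")


def normalize_route(method: str, path: str) -> tuple[str, str]:
    """Collapse every path parameter to "{}" so only its position matters."""
    segments = [
        "{}" if _PARAM.match(segment) else segment
        for segment in path.strip("/").split("/")
    ]
    return method.upper(), "/" + "/".join(segments)


def missing_from_spec(code_routes, spec_routes):
    """Return the code routes that have no spec entry, ignoring parameter names."""
    documented = {normalize_route(m, p) for m, p in spec_routes}
    return [(m, p) for m, p in code_routes if normalize_route(m, p) not in documented]


# Hypothetical routes, for illustration only.
code = [("GET", "/agent/runs/:id"), ("POST", "/agent/runs/:id/cancel")]
spec = [("get", "/agent/runs/{run_id}")]
gaps = missing_from_spec(code, spec)
```

With these inputs only the cancel route is reported, because `:id` and `{run_id}` normalize to the same key.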

Completeness accounting (the no-slip guarantee)

  • Every full run partitions every extracted surface item into exactly one accountability bucket and embeds the proof in the report. Current accounting: 277 flags (123 mapped + 63 ignored + 3 findings + 88 snapshot-tracked non-GA), 74 CLI commands, 71 API routes, 47 slash commands, 201 settings — 0 unaccounted across all surfaces.
  • Anything escaping every bucket → integrity:accounting + exit 2 (only possible via audit-logic regression).
  • Map integrity findings: entries in both mapping + ignore list, and duplicate keys per section.
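
The partition described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the script's implementation: the function name `account`, the result keys, and treating an overlap between buckets as an accounting failure are all assumptions; only the "exactly one bucket, otherwise exit 2" contract comes from the description.

```python
def account(extracted: set[str], buckets: dict[str, set[str]]) -> dict:
    """Check that every extracted item lands in exactly one accountability bucket."""
    owner: dict[str, str] = {}   # item -> first bucket that claimed it
    overlaps = []                # (item, first bucket, second bucket)
    for name, items in buckets.items():
        for item in sorted(items & extracted):
            if item in owner:
                overlaps.append((item, owner[item], name))
            else:
                owner[item] = name
    unaccounted = sorted(extracted - set(owner))
    counts = {name: 0 for name in buckets}
    for bucket in owner.values():
        counts[bucket] += 1
    return {
        "counts": counts,
        "unaccounted": unaccounted,   # totality violations
        "overlaps": overlaps,         # disjointness violations
        "exit_code": 2 if (unaccounted or overlaps) else 0,
    }


# Hypothetical flag names, for illustration only.
report = account(
    {"FlagA", "FlagB", "FlagC"},
    {"mapped": {"FlagA"}, "ignored": {"FlagB"}, "findings": set()},
)
```

Here `FlagC` escapes every bucket, so the sketch reports it as unaccounted and returns exit code 2.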

Change detection (--diff + snapshot v2)

  • Snapshot tracks flags + status, CLI commands/flags, API routes, slash commands, settings + status, Oz web app routes, 66 server-side agent tools, bundled + channel-gated skills, and the last-seen changelog version ("New features"/"Improvements"/"Oz updates" bullets become verification findings — the net for server-side/experiment launches with no client code change).
  • Proven live: on fresh checkouts a day later, the diff caught SuperGrok promoted dogfood→ga and a new /rename-conversation slash command; both are now standing coverage findings and the snapshot was re-baselined.
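
A rough sketch of the kind of comparison --diff performs on one surface, assuming a snapshot section is a plain name → status mapping. The function name and the result keys are hypothetical, and the real snapshot tracks several surfaces at once.

```python
def diff_statuses(old: dict[str, str], new: dict[str, str]) -> dict[str, list]:
    """Compare two name -> status mappings from consecutive runs."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(
        (name, old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Illustrative data only: one promotion, one removal, one new flag.
previous = {"SuperGrok": "dogfood", "WelcomeTab": "ga"}
current = {"SuperGrok": "ga", "GroupedTabs": "preview"}
delta = diff_statuses(previous, current)
```

A removed key shows up explicitly in `removed`, which is the case the old audit lost silently when a stabilized flag was deleted from code.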

references/feature_surface_map.md

  • Seeded mappings for documented features; pruned 45+ dead entries; reclassified 15 misfiled GA flags from ignore → real mappings (session sharing trio, AgentHarness, SshRemoteServer, ArtifactCommand, OzIdentityFederation, image-context pair, OzPlatformSkills, WorkflowAliases, ShellSelector, KittyImages, UndoClosedPanes, RevertDiffHunk); surfaced FullScreenZenMode as a visible undocumented-GA finding; new slash/settings/unlisted-pages sections + API internal sentinels.

SKILL.md

  • Documents the 9 audits, exit-code contract, accounting contract, the end-to-end "how every change path is caught" chain (new/promoted/removed surfaces, no-code-change launches via the changelog net, parser rot via extraction guards, map rot via hygiene), the three enforced map-update paths, drift-watch mode with a copyable scheduled-agent prompt, and adjacent-skill ownership (validate_ui_refs, sync-error-docs, check_for_broken_links/weekly-404-monitor, style_lint).

Public repo retarget

  • The audit now targets the public warpdotdev/warp repo as the primary client-code source: auto-detection prefers a sibling checkout named warp (falling back to warp-internal for transitional environments), and --warp is the primary CLI flag with --warp-internal kept as a deprecated alias. All script messages and SKILL.md guidance (including the scheduled-agent prompt) reference the public repo.

Validation

  • py_compile clean; full audit ~1.2s; snapshot regeneration byte-identical; fresh --diff reports 0.
  • Missing repos → exit 2; wrong-but-existing paths → extraction guards trip and gated audits (map hygiene, reverse checks, accounting) skip rather than emit garbage.
  • Simulated changes across every surface type produce the expected diff findings; simulated audit-logic regression (dropped finding) → accounting reports it unaccounted; injected map conflict + duplicate → hygiene findings fire.
  • External re-verification script independently confirms the partition (totality + disjointness) and that the snapshot exactly matches extraction for all 7 surface lists.
  • Standing findings are all genuine, actionable backlog for the first scheduled run: 29 CLI subcommands, 24 API spec gaps, 18 settings, 15 stale doc references, 4 slash commands, 3 features (FullScreenZenMode, GroupedTabs, SuperGrok), 1 unlisted page, 29 stale-terminology pages.
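
The fail-loud behavior validated above can be pictured with a small sketch. The minimum counts, function names, and message format are hypothetical; only the 0 versus 2 distinction and the audits_skipped idea come from the description.

```python
def check_extraction(counts: dict[str, int], minimums: dict[str, int]) -> dict[str, str]:
    """Return {surface: reason} for every surface whose parse count is implausibly low."""
    return {
        surface: f"parsed {counts.get(surface, 0)} items, expected at least {minimum}"
        for surface, minimum in minimums.items()
        if counts.get(surface, 0) < minimum
    }


def exit_code(audits_skipped: dict[str, str], unaccounted: list[str]) -> int:
    """Exit 2 whenever the run cannot vouch for its own completeness."""
    return 2 if (audits_skipped or unaccounted) else 0


# Hypothetical thresholds: a settings parser that suddenly finds 3 items has
# most likely broken after a source-layout change.
skipped = check_extraction({"flags": 277, "settings": 3}, {"flags": 100, "settings": 100})
status = exit_code(skipped, unaccounted=[])
```

A caller would pass the result to `sys.exit`, so a scheduled agent sees a non-zero status instead of a report with zero gaps.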

No docs-site content changes, so no site build needed.

Conversation: https://staging.warp.dev/conversation/a706b73b-186b-456c-ad18-28390e36cb89
Run: https://oz.staging.warp.dev/runs/019eb51e-4d35-7d62-a494-331f39efbd67
Plans:

This PR was generated with Oz.

- audit_docs.py: fail loud (exit 2 + audits_skipped) when repos are missing;
  fix repo auto-detection to siblings of the docs repo root
- GA detection via the app/src/features.rs cargo-feature bridge plus
  RELEASE_FLAGS/PREVIEW_FLAGS/DOGFOOD_FLAGS instead of snake_case guessing
- New audits: public API routes (router/handlers/public_api gin groups vs
  OpenAPI spec), CLI subcommands (clap enum tree, hidden skipped), slash
  commands (static registry), surface-map hygiene (dead entries)
- Staleness: strip code spans, word-boundary matching, skip historical
  changelog pages (73 -> 29 findings; 'oz agent' CLI noise eliminated)
- Change detection: references/surface_snapshot.json + --diff mode reporting
  added/removed/promoted surfaces and changelog items since last run;
  --update-snapshot regenerates the baseline
- Seed feature_surface_map.md: map handoff/orchestration/queueing/BYOK/
  billing/etc. flags, prune 45+ dead entries, add slash-command section and
  API internal sentinels
- SKILL.md: document the exit-code contract, diff workflow, drift-watch mode
  for the recurring agent (with copyable scheduled-agent prompt), and make
  surface-map + snapshot updates explicit drafting steps

Co-Authored-By: Oz <oz-agent@warp.dev>
@cla-bot cla-bot Bot added the cla-signed label Jun 11, 2026
@vercel

vercel Bot commented Jun 11, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

Project: docs | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Jun 30, 2026 11:00pm


…e, stale refs

- Settings audit: parse the define_setting! toml_path registry (~200 settings,
  flag-status aware, object-typed settings handled) and check coverage in the
  all-settings reference; reverse check catches documented settings that were
  renamed/removed in code (e.g. agents.oz.* -> agents.warp_agent.*)
- Stale doc references: validate documented keybinding actions (scope:action)
  still exist anywhere in warp-internal source
- Docs structure audit: flag pages missing from src/sidebar.ts (with an
  allowlist section in the surface map)
- CLI: recursive subcommand parsing (oz run message send, oz environment
  image list, ...) plus per-module --flag tracking in the snapshot
- API: positional RouterGroup argument resolution at Register* call sites
  (fixes oauth route prefixes) and param-name-insensitive OpenAPI matching
- Snapshot v2: settings, web app routes (AgentsApp.tsx), server-side agent
  tools (ToolName consts + Create*NativeTool registrations), bundled +
  channel-gated skills, CLI flags; graceful one-time note when diffing
  against a v1 snapshot
- Changelog cross-check now also tracks 'Oz updates' bullets
- Extraction sanity guards: implausibly low parse counts (broken parser after
  a code-layout change) skip dependent audits and exit 2 instead of silently
  under-reporting; map hygiene and reverse checks gated on healthy extraction
- Feature flag enum parsing is brace-safe (survives future struct variants)
- SKILL.md documents the 9 coverage audits, snapshot-only surfaces, and
  adjacent-skill ownership (validate_ui_refs, sync-error-docs, style_lint,
  weekly-404-monitor)
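
To make "brace-safe" concrete, here is one way such a parser could work: track nesting depth so that commas inside a struct or tuple variant are not mistaken for variant separators. This is a sketch under assumptions, not the script's parser; the enum name and variants in the sample are invented, and it does not handle block comments or braces inside string literals.

```python
import re


def enum_variants(source: str, enum_name: str) -> list[str]:
    """Return the variant names of a Rust enum, tolerating tuple and struct variants."""
    header = re.search(r"\benum\s+" + re.escape(enum_name) + r"\s*\{", source)
    if header is None:
        raise ValueError(f"enum {enum_name} not found")
    # Drop line comments and #[...] attributes so they cannot confuse the scan.
    body = re.sub(r"//[^\n]*|#\[[^\]]*\]", "", source[header.end():])
    depth, chunk, chunks = 1, [], []
    for ch in body:
        if ch in "{(":
            depth += 1
        elif ch in "})":
            depth -= 1
            if depth == 0:          # closing brace of the enum itself
                break
        elif depth == 1:            # only top-level text names a variant
            if ch == ",":
                chunks.append("".join(chunk))
                chunk = []
            else:
                chunk.append(ch)
    chunks.append("".join(chunk))
    matches = (re.match(r"\s*([A-Za-z_]\w*)", c) for c in chunks)
    return [m.group(1) for m in matches if m]


# Invented sample; the struct variant would break a naive split on commas.
SAMPLE = """
pub enum FeatureFlag {
    // unit variant
    SuperGrok,
    #[cfg(test)]
    Custom { id: u32, name: String },
    Pair(u8, u8),
    Last = 3,
}
"""
flags = enum_variants(SAMPLE, "FeatureFlag")
```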

Co-Authored-By: Oz <oz-agent@warp.dev>
…lags

Triple-check that every feature is encapsulated in the mapping:

- Built-in completeness accounting on every full run: partitions every
  extracted surface item (277 flags, 74 CLI commands, 71 API routes, 47
  slash commands, 201 settings) into exactly one accountability bucket
  (mapped / ignored / doc-covered / visible finding / snapshot-tracked) and
  exits 2 with integrity:accounting if anything escapes — an unaccounted
  item can only mean the audit logic regressed
- Map integrity checks in hygiene: entries in both the mapping and the
  ignore list (ignore silently wins) and duplicate keys within a section
  are now medium findings
- Ignore-list review against computed statuses found ~20 flags filed under
  'Non-GA' that have since GA'd: reclassified 15 user-facing ones to real
  mappings (session sharing trio, AgentHarness, SshRemoteServer,
  ArtifactCommand, OzIdentityFederation, image-context pair,
  OzPlatformSkills, WorkflowAliases, ShellSelector, KittyImages,
  UndoClosedPanes, RevertDiffHunk), surfaced FullScreenZenMode as a visible
  undocumented-feature finding, and retitled the section so placement no
  longer asserts rollout status
- Re-baselined the snapshot after live drift the diff caught on today's
  checkouts (SuperGrok dogfood->ga, new /rename-conversation slash command
  — both now standing coverage findings)
- SKILL.md: documented the accounting contract and the end-to-end
  'how every change path is caught' chain (new/promoted/removed surfaces,
  no-code-change launches via changelog net, parser rot via extraction
  guards, map rot via hygiene)

Co-Authored-By: Oz <oz-agent@warp.dev>
Warp's client code is open source at warpdotdev/warp — the audit now treats
the public repo as the primary source:

- Repo auto-detection prefers a sibling checkout named 'warp' and falls back
  to 'warp-internal' for transitional environments
- New --warp flag is the primary CLI option; --warp-internal remains as a
  deprecated alias (same destination) so existing invocations keep working
- All docstrings, stderr messages, skip reasons, section headers, and
  SKILL.md guidance (requirements, audit descriptions, drift-watch command,
  scheduled-agent prompt) now reference the public warp client repo

Validated: auto-detect fallback, preferred 'warp' sibling resolution,
explicit --warp, and the deprecated alias all run the full audit with clean
completeness accounting.

Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
…uperGrok docs (#203)

* First drift-watch burn-down: settings, keybindings, slash commands, SuperGrok

Dogfoods the missing_docs drift-watch workflow on the standing findings
backlog (41 findings resolved):

- all-settings.mdx: documented 17 missing settings (prompt submission mode,
  orchestration message display, auto handoff on sleep, agent attribution,
  handoff kill-switches, OSC 52 clipboard access, async find, directory tab
  colors, vertical-tabs panel options, hidden files in project explorer,
  line number mode, force X11, Ctrl+Enter submit for CLI agents, input
  focus on block selection) with types/defaults/options extracted from the
  settings registry; moved git_operations_autogen_enabled into
  [agents.warp_agent.active_ai] reflecting the agents.oz -> agents.warp_agent
  rename and removed the stale remnant section; added an Experimental section
- keyboard-shortcuts.mdx: fixed all 14 dead action names (10 renames like
  workspace:open_new_tab -> workspace:new_tab, editor_view:cmd_i ->
  editor_view:inspect_command, terminal:trigger_subshell_bootstrap ->
  terminal:warpify_subshell; 4 removed actions blanked)
- slash-commands.mdx: added /environment, /harness, /host (cloud agent
  session selectors) and /rename-conversation
- bring-your-own-api-key.mdx: documented connecting a SuperGrok subscription
  instead of an xAI API key (newly GA SuperGrok flag) + map entry
- Surface map: allowlisted guides/agent-workflows/warp-vs-claude-code as
  intentionally unlisted (per its frontmatter note)
- Re-applied the GA-flag reclassification from the previous session (15
  mappings + FullScreenZenMode surfaced) which had been silently reverted
  by a 'git checkout' during integrity testing before it was committed (the
  revert was caught by comparing accounting bucket counts run-over-run)
- Snapshot re-baselined; audit now reports 0 stale doc references, 0
  undocumented settings/slash commands/unlisted pages; remaining backlog:
  29 CLI subcommand docs, 24 OpenAPI spec gaps (update-open-api-spec),
  29 terminology pages (style_lint), FullScreenZenMode + GroupedTabs

Co-Authored-By: Oz <oz-agent@warp.dev>

* docs: second drift-watch pass — settings, feature flags, map hygiene

Re-ran the missing_docs drift-watch audit against current code surfaces and
burned down the newly-found in-scope drift (features 5→0, settings 6→0,
map hygiene 2→0).

Settings (all-settings.mdx):
- Document appearance.icon.show_dock_icon (macOS Dock / Cmd-Tab visibility)
- Document agents.warp_agent.other.long_running_command_submission_mode
- Document code.editor.format_on_save
- Document cloud_platform.third_party_api_keys.gemini_enterprise_credentials_enabled
- Document warpify.ssh.reuse_existing_control_master
- Map warpify.ssh.ssh_tmux_deprecation_notice_pending -> internal (one-time
  migration banner state, not user-configurable)

Feature flags (feature_surface_map.md):
- Map CodexPlugin -> cli-agents/codex.md
- Map FullScreenZenMode, AsyncFind -> all-settings.mdx (surfaces are documented settings)
- Map CustomModelRouters -> inference/model-choice.mdx (new "Custom routers" section)
- Ignore GroupedTabs (macOS-only Preview; docs pending GA promotion)

Map hygiene:
- Prune stale ignore-list flags FreeUserNoAi and WelcomeTab (no longer in code)

Co-Authored-By: Oz <oz-agent@warp.dev>

* docs(missing_docs): codify finding-resolution patterns in SKILL.md

Add a "Resolution patterns" subsection capturing the per-type decision rules
applied during the second drift-watch pass, so recurring runs resolve findings
consistently:
- user-facing setting -> document in all-settings
- internal/state-only setting -> map `section.key -> internal`
- feature flag with a dedicated page -> map to it
- feature flag whose only surface is a documented setting -> map to that page
- preview/pre-launch feature with no docs -> ignore-list with a comment
- stale map entry/doc reference -> prune after confirming removal in code

Co-Authored-By: Oz <oz-agent@warp.dev>

* Update .agents/skills/missing_docs/references/feature_surface_map.md

Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>

---------

Co-authored-by: Oz <oz-agent@warp.dev>
Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>
Resolve conflicts from main's cloud-agents -> /platform/ restructure (#262)
and reconcile independently-added settings docs:

- slash-commands.mdx: keep new /environment, /harness, /host,
  /rename-conversation rows; retarget moved cloud-agents links to /platform/.
- all-settings.mdx: keep the corrected [agents.warp_agent.active_ai] home for
  git_operations_autogen_enabled (verified toml_path in source) and drop main's
  re-added, incorrect [agents.oz.active_ai] section; retarget handoff/
  orchestration links to /platform/.
- feature_surface_map.md: take main's /platform/ targets for cloud-agents
  entries, keep RemoteCodebaseIndexing, and migrate remaining cloud-agents
  mappings (AgentHarness, OzHandoff, Handoff*, RunAgentsTool, NamedAgents) to
  /platform/.

Audit: map_hygiene 0, unaccounted none, no audits skipped. Build: 340 pages OK.

Co-Authored-By: Oz <oz-agent@warp.dev>
@hongyi-chen hongyi-chen marked this pull request as ready for review June 30, 2026 21:32
@oz-for-oss

oz-for-oss Bot commented Jun 30, 2026

Contributor

@hongyi-chen

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@oz-for-oss oz-for-oss Bot left a comment

Contributor

Overview

This PR overhauls the missing_docs skill, adds a generated surface snapshot, updates the curated surface map, and makes related documentation updates for slash commands, settings, BYOK, and model routing.

Concerns

  • Phase 3 still instructs agents to update astro.config.mjs for sidebar entries even though this repo uses src/sidebar.ts, so drafted pages can remain unlisted.
  • The settings reference documents code.editor.show_hidden_files with the wrong default value compared with the client setting definition.
  • The newly added snapshot is missing settings that this same PR documents, which undermines the --diff baseline and the claimed completeness guarantees.
  • The attached diff marks .agents/skills/missing_docs/scripts/audit_docs.py as patch-unavailable, so this review could not inspect or inline-comment the central audit implementation changes.

Verdict

Found: 0 critical, 3 important, 0 suggestions

Request changes


Comment thread .agents/skills/missing_docs/SKILL.md Outdated
- Bold + dash format for list items: `* **Term** - Description`
6. Create the markdown file at the suggested path
7. Remind user to add new pages to the relevant `astro.config.mjs (sidebar config)`
7. Add new pages to the sidebar config in `astro.config.mjs`

Contributor

⚠️ [IMPORTANT] The docs repo sidebar is defined in src/sidebar.ts (and the new structure audit checks that file), so telling agents to edit astro.config.mjs will leave new pages unlisted.

Suggested change
- 7. Add new pages to the sidebar config in `astro.config.mjs`
+ 7. Add new pages to the sidebar in `src/sidebar.ts`

Collaborator Author

Fixed in 5863b8c — Phase 3 step 7 now points to src/sidebar.ts (with a note that astro.config.mjs is only needed for a brand-new top-level topic).

"code.editor.show_global_search": "always_on",
"code.editor.show_hidden_files": "always_on",
"code.editor.show_project_explorer": "always_on",
"code.editor.use_warp_as_default_editor": "always_on",

Contributor

⚠️ [IMPORTANT] The snapshot omits code.editor.format_on_save, even though this PR documents it in all-settings.mdx; regenerate the snapshot after extracting all documented settings, and also verify appearance.icon.show_dock_icon, agents.warp_agent.other.long_running_command_submission_mode, warpify.ssh.reuse_existing_control_master, and cloud_platform.third_party_api_keys.gemini_enterprise_credentials_enabled.

Collaborator Author

Fixed in 5863b8c — regenerated surface_snapshot.json; it now includes code.editor.format_on_save plus appearance.icon.show_dock_icon, agents.warp_agent.other.long_running_command_submission_mode, warpify.ssh.reuse_existing_control_master, and cloud_platform.third_party_api_keys.gemini_enterprise_credentials_enabled (and current flag/CLI/API state).

* `show_code_review_diff_stats` — Whether to show lines added/removed counts on the code review button. Type: boolean. Default: `true`.
* `show_project_explorer` — Whether the project explorer is shown in the tools panel. Type: boolean. Default: `true`.
* `show_global_search` — Whether global file search is shown in the tools panel. Type: boolean. Default: `true`.
* `show_hidden_files` — Whether hidden files (dotfiles) are shown in the project explorer. Type: boolean. Default: `false`.

Contributor

⚠️ [IMPORTANT] show_hidden_files defaults to true in the client setting definition, so the reference now documents the wrong default.

Suggested change
- * `show_hidden_files` — Whether hidden files (dotfiles) are shown in the project explorer. Type: boolean. Default: `false`.
+ * `show_hidden_files` — Whether hidden files (dotfiles) are shown in the project explorer. Type: boolean. Default: `true`.

Collaborator Author

Fixed in 5863b8c — corrected the default to true to match the client setting definition (code.editor.show_hidden_files defaults to true).

hongyi-chen and others added 2 commits June 30, 2026 21:43
Add scripts/suggest_reviewers.py, which maps the source surface behind each
drift-watch finding to its owning engineer using the CODEOWNERS-format ownership
files already maintained in the code repos:
- warp client: .github/STAKEHOLDERS
- warp-server: .github/STAKEHOLDERS (advisory) + .github/CODEOWNERS (enforced)

It resolves with standard last-match-wins precedence, dedupes owners into
users/teams, and prints a ready-to-run `gh pr edit --add-reviewer` command.
Unresolved paths are non-fatal so a scheduled run is never blocked.
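
The last-match-wins resolution described here can be sketched as below. This is a simplified matcher, not suggest_reviewers.py: it covers only the default rule, anchored directory prefixes, and basic globs, and real CODEOWNERS matching has more cases. All function names and owners are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_rules(text: str) -> list[tuple[str, list[str]]]:
    """Parse CODEOWNERS-format text into (pattern, owners) rules, in file order."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if line:
            pattern, *owners = line.split()
            rules.append((pattern, owners))
    return rules


def matches(pattern: str, path: str) -> bool:
    """Simplified matching: default rule, anchored dir prefix, then globs."""
    if pattern == "*":
        return True
    if pattern.startswith("/"):
        anchored = pattern.lstrip("/")
        if anchored.endswith("/"):
            return path.startswith(anchored)
        return fnmatchcase(path, anchored)
    return fnmatchcase(path, pattern) or fnmatchcase(path, "*/" + pattern)


def owners_for(path: str, rules) -> list[str]:
    """Last matching rule wins; an unmatched path resolves to no owners."""
    result: list[str] = []
    for pattern, owners in rules:
        if matches(pattern, path):
            result = owners
    return result


def split_owners(owners):
    """Dedupe and split into users ("@name") and teams ("@org/team")."""
    users = sorted({o for o in owners if "/" not in o})
    teams = sorted({o for o in owners if "/" in o})
    return users, teams


# Hypothetical ownership file.
RULES = parse_rules("""
*                   @default-owner
/crates/            @example-org/platform
/crates/warp_cli/   @alice @bob
*.md                @docs-writer
""")
cli_owners = owners_for("crates/warp_cli/src/agent.rs", RULES)
```

Returning an empty list for an unmatched path, rather than raising, mirrors the point that unresolved paths should never block a scheduled run.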

Wire it into SKILL.md: a new "Reviewer routing" section (per-category source-file
hints), a drift-watch "Route reviewers" step before opening the PR, an updated
scheduled-agent prompt, and a References entry. Ownership stays sourced from
STAKEHOLDERS/CODEOWNERS (kept fresh by sync-stakeholders) — never hardcoded.

Co-Authored-By: Oz <oz-agent@warp.dev>
- SKILL.md: Phase 3 step 7 now points to src/sidebar.ts (the real sidebar
  source the structure audit checks); astro.config.mjs only for a new top-level topic.
- all-settings.mdx: correct code.editor.show_hidden_files default to `true`
  (matches the client setting definition).
- surface_snapshot.json: regenerate so it includes the newly documented settings
  (code.editor.format_on_save, appearance.icon.show_dock_icon,
  agents.warp_agent.other.long_running_command_submission_mode,
  warpify.ssh.reuse_existing_control_master,
  cloud_platform.third_party_api_keys.gemini_enterprise_credentials_enabled) and
  current flag/CLI/API state.

Audit: exit 0, audits_skipped none, unaccounted none.

Co-Authored-By: Oz <oz-agent@warp.dev>
hongyi-chen and others added 2 commits June 30, 2026 14:55
…secret) (#278)

Sample output of one missing_docs drift-watch pass over the CLI backlog, to
evaluate the skill's efficacy. Resolves 14 of 31 undocumented CLI commands:

- Draft: document the `oz api-key list/create/expire` subcommands in the CLI
  reference (reference/cli/api-keys.mdx), with flags/args extracted from
  crates/warp_cli/src/api_key.rs.
- Map (no duplication): `oz schedule *` and `oz secret *` subcommands are already
  documented in their feature pages, so add surface-map entries pointing there
  instead of re-drafting.

Reviewer routing (suggest_reviewers.py): crates/warp_cli ownership -> @bnavetta,
@ianhodge.

Audit after: CLI findings 31 -> 17, exit 0, unaccounted none.

Co-authored-by: Oz <oz-agent@warp.dev>
- test_suggest_reviewers.py: 15 unit tests for reviewer resolution (CODEOWNERS
  matching incl. anchored dir-prefix / exact-file / glob / default rule,
  last-match-wins precedence, user vs team split, dedup, unresolved paths,
  warp-internal alias, stdin) — all via temp ownership files.
- test_audit_docs.py: 6 behavioral integration tests that run audit_docs.py
  against the sibling code repos — clean exit + completeness accounting
  (unaccounted empty), --category scoping, --severity filtering, fail-loud
  (exit 2) on a missing repo, committed-snapshot currency, and that
  --update-snapshot honors --snapshot without mutating the committed snapshot.
  Skips gracefully when the code repos aren't checked out.

Both suites use only the Python stdlib (unittest) — no third-party deps. 21/21
pass. Documented under a new "## Tests" section in SKILL.md.

Co-Authored-By: Oz <oz-agent@warp.dev>
Make explicit that the skill must only document publicly released surfaces:
- New "Public vs. private surfaces" section in SKILL.md: the OSS warp client repo
  is public; warp-server is a PRIVATE repo whose only public surface is the
  released Oz Agent API already in the OpenAPI spec. Two gates (source/exposure +
  GA rollout); never document private or unreleased surfaces.
- Woven into the API audit description, Phase 3 API-gap research, and Resolution
  patterns: warp-server endpoints not in the released spec are not auto-
  documentable — route released ones via sync-openapi-spec, else `-> internal`/defer.

Apply the rule to detected drift: Agent Memory is research preview, so its
`oz memory*` / `oz memory-store*` CLI and `/memory_stores/*` REST API are mapped
`-> internal` with comments. Added a public/private POLICY note to the surface
map's API section.

Audit after: CLI 17->15, API 29->18, audits_skipped none, unaccounted none.

Co-Authored-By: Oz <oz-agent@warp.dev>
…n CI

- test_audit_docs.py: add test_research_preview_surfaces_are_deferred, a
  regression guard asserting Agent Memory's CLI (`oz memory*`) and REST API
  (`/memory_stores/*`) are never flagged for documentation (public/private
  boundary). 22 tests total, all green.
- ci.yml: run the skill's stdlib test suites on every PR. The reviewer-resolver
  unit tests run fully; the audit integration tests skip gracefully since the
  warp/warp-server code repos aren't checked out in docs CI.
- SKILL.md: note the new boundary test in the Tests section.

Co-Authored-By: Oz <oz-agent@warp.dev>
Fixes the limitation that the CLI and API audits flagged commands/routes
regardless of whether their feature had shipped (so non-GA surfaces like Agent
Memory needed a permanent `-> internal`).

audit_docs.py:
- New `gated:<Flag>` surface-map target for CLI commands and API routes. The
  audit resolves the gating flag's rollout status (the same machinery used for
  feature flags + settings):
  * non-GA (preview/dogfood/other) -> deferred (new `gated_non_ga` accounting
    bucket), not a finding;
  * GA -> falls through to normal coverage so it auto-surfaces as a finding;
  * unknown flag -> conservative (still a finding) + map-hygiene error so the
    annotation can't silently rot.
- audit_cli / audit_api now take flag_statuses; main computes it for API runs.
- Completeness accounting keeps totality (gated_non_ga counted; unaccounted none).
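
The three-way `gated:<Flag>` rule listed above could be expressed as in this sketch. The function name, the return shape, and the wording of the hygiene message are assumptions for illustration; the status strings follow the rollout names used in this PR.

```python
from typing import Optional


def resolve_gated(target: str, flag_statuses: dict[str, str]) -> tuple[str, Optional[str]]:
    """Map a surface-map target to (bucket, optional hygiene error)."""
    if not target.startswith("gated:"):
        return "coverage", None                 # ordinary target: audit as usual
    flag = target.split(":", 1)[1]
    status = flag_statuses.get(flag)
    if status is None:
        # Unknown flag: stay conservative and flag the annotation itself.
        return "coverage", f"map hygiene: unknown gating flag {flag!r}"
    if status == "ga":
        return "coverage", None                 # shipped: falls through to normal coverage
    return "gated_non_ga", None                 # preview/dogfood/other: deferred


statuses = {"AIMemories": "preview", "NamedAgents": "ga"}
deferred = resolve_gated("gated:AIMemories", statuses)
```

Because a GA flag falls through to normal coverage, the annotation never needs to be removed by hand when the feature ships.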

feature_surface_map.md: migrate Agent Memory's `oz memory*` / `oz memory-store*`
CLI and `/memory_stores/*` API from `-> internal` to `-> gated:AIMemories`, so
they auto-surface for docs when AIMemories goes GA. Documented the `gated:`
sentinel in the header.

SKILL.md: document `gated:<Flag>` in Public vs. private surfaces + Resolution
patterns.

tests: add TestGatedLogic (helper, non-GA deferral, GA auto-surface, unknown-flag
conservatism, map-hygiene validation). 27 tests pass; audit exit 0, unaccounted
none.

Co-Authored-By: Oz <oz-agent@warp.dev>
The gated:<Flag> work added a `gated_non_ga` bucket to the CLI and API
completeness accounting, but SKILL.md's bucket list still omitted it.
Document `gated_non_ga` for CLI commands and API routes, and note the
`gated:<Flag>` sentinel alongside `internal` in the References section so
the documented accounting matches what the audit actually emits.

Co-Authored-By: Oz <oz-agent@warp.dev>
…, oz provider) (#279)

Demonstrates the missing_docs drift-watch loop end to end on real drift.
The audit flagged 15 undocumented `oz` subcommands; this resolves all of
them according to each command's GA rollout status:

- GA (NamedAgents): document the `oz agent` named-agent management group
  (list/get/create/update/delete + `oz agent skills`) and fix the existing
  `oz agent list` entry, which incorrectly described skill listing.
- GA (ConversationApi): document `oz run conversation get` and the
  `oz run message` inbox commands (list/read/watch/send/mark-delivered).
- Non-GA (ProviderCommand = dogfood): defer the whole `oz provider` group
  via `gated:ProviderCommand` so it auto-surfaces for docs when it goes GA.

All command flags drafted from crates/warp_cli source (agent.rs, task.rs,
provider.rs). CLI audit now reports 0 gaps; cli_commands gated_non_ga = 14.

Co-authored-by: Oz <oz-agent@warp.dev>
The new "Managing named agents" section says agents are run with
`oz agent run-cloud --agent <UID>`, but that flag was absent from the
run-cloud key-flags list. Document `--agent <UID>` (from RunCloudArgs in
crates/warp_cli/src/agent.rs) and cross-link the two sections so readers
can get from creating a named agent to running one.

Co-Authored-By: Oz <oz-agent@warp.dev>