Overhaul missing_docs skill: fix audit blind spots, add surface change detection#201
hongyi-chen wants to merge 17 commits
- audit_docs.py: fail loud (exit 2 + audits_skipped) when repos are missing; fix repo auto-detection to siblings of the docs repo root
- GA detection via the app/src/features.rs cargo-feature bridge plus RELEASE_FLAGS/PREVIEW_FLAGS/DOGFOOD_FLAGS instead of snake_case guessing
- New audits: public API routes (router/handlers/public_api gin groups vs OpenAPI spec), CLI subcommands (clap enum tree, hidden skipped), slash commands (static registry), surface-map hygiene (dead entries)
- Staleness: strip code spans, word-boundary matching, skip historical changelog pages (73 -> 29 findings; 'oz agent' CLI noise eliminated)
- Change detection: references/surface_snapshot.json + --diff mode reporting added/removed/promoted surfaces and changelog items since last run; --update-snapshot regenerates the baseline
- Seed feature_surface_map.md: map handoff/orchestration/queueing/BYOK/billing/etc. flags, prune 45+ dead entries, add slash-command section and API internal sentinels
- SKILL.md: document the exit-code contract, diff workflow, drift-watch mode for the recurring agent (with copyable scheduled-agent prompt), and make surface-map + snapshot updates explicit drafting steps

Co-Authored-By: Oz <oz-agent@warp.dev>
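The staleness change in this commit (strip code spans, then match on word boundaries) can be illustrated with a minimal sketch. The helper names `strip_code_spans` and `find_stale_terms` are hypothetical and are not taken from `audit_docs.py`; the regular expressions are an assumption about one reasonable way to do it.

```python
import re

# Hypothetical helpers for illustration; the real audit_docs.py may differ.
FENCE_RE = re.compile(r"```.*?```", re.DOTALL)   # fenced code blocks
INLINE_CODE_RE = re.compile(r"`[^`\n]*`")        # inline code spans


def strip_code_spans(text: str) -> str:
    """Blank out fenced blocks and inline code so CLI examples are not scanned."""
    text = FENCE_RE.sub(" ", text)
    return INLINE_CODE_RE.sub(" ", text)


def find_stale_terms(text: str, terms: list[str]) -> list[str]:
    """Return the stale terms that appear in prose as whole words."""
    prose = strip_code_spans(text)
    hits = []
    for term in terms:
        # Lookarounds act as word boundaries even for multi-word terms.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, prose, flags=re.IGNORECASE):
            hits.append(term)
    return hits
```

With this shape, a legitimate CLI example written as inline code is ignored, while the same term in running prose is still reported.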
…e, stale refs

- Settings audit: parse the define_setting! toml_path registry (~200 settings, flag-status aware, object-typed settings handled) and check coverage in the all-settings reference; reverse check catches documented settings that were renamed/removed in code (e.g. agents.oz.* -> agents.warp_agent.*)
- Stale doc references: validate documented keybinding actions (scope:action) still exist anywhere in warp-internal source
- Docs structure audit: flag pages missing from src/sidebar.ts (with an allowlist section in the surface map)
- CLI: recursive subcommand parsing (oz run message send, oz environment image list, ...) plus per-module --flag tracking in the snapshot
- API: positional RouterGroup argument resolution at Register* call sites (fixes oauth route prefixes) and param-name-insensitive OpenAPI matching
- Snapshot v2: settings, web app routes (AgentsApp.tsx), server-side agent tools (ToolName consts + Create*NativeTool registrations), bundled + channel-gated skills, CLI flags; graceful one-time note when diffing against a v1 snapshot
- Changelog cross-check now also tracks 'Oz updates' bullets
- Extraction sanity guards: implausibly low parse counts (broken parser after a code-layout change) skip dependent audits and exit 2 instead of silently under-reporting; map hygiene and reverse checks gated on healthy extraction
- Feature flag enum parsing is brace-safe (survives future struct variants)
- SKILL.md documents the 9 coverage audits, snapshot-only surfaces, and adjacent-skill ownership (validate_ui_refs, sync-error-docs, style_lint, weekly-404-monitor)

Co-Authored-By: Oz <oz-agent@warp.dev>
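The extraction sanity guard described above can be sketched roughly as follows. The floor values in `MIN_PLAUSIBLE` and both function names are assumptions made for illustration; the PR does not show the real thresholds, and returning 0 when nothing is skipped is a simplification of the exit-code contract.

```python
# Hypothetical floors: the real thresholds are not shown in this PR.
MIN_PLAUSIBLE = {"settings": 50, "cli_commands": 10}


def check_extraction(counts: dict[str, int]) -> list[str]:
    """Return the audits to skip because their parse count is implausibly low."""
    return sorted(
        name for name, floor in MIN_PLAUSIBLE.items()
        if counts.get(name, 0) < floor
    )


def exit_code(audits_skipped: list[str]) -> int:
    """Fail loud (exit 2) when any audit was skipped; 0 otherwise in this sketch."""
    return 2 if audits_skipped else 0
```

The point of the design is that a parser broken by a source-layout change produces a visible non-zero exit rather than a report with fewer findings.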
…lags

Triple-check that every feature is encapsulated in the mapping:
- Built-in completeness accounting on every full run: partitions every extracted surface item (277 flags, 74 CLI commands, 71 API routes, 47 slash commands, 201 settings) into exactly one accountability bucket (mapped / ignored / doc-covered / visible finding / snapshot-tracked) and exits 2 with integrity:accounting if anything escapes; an unaccounted item can only mean the audit logic regressed
- Map integrity checks in hygiene: entries in both the mapping and the ignore list (ignore silently wins) and duplicate keys within a section are now medium findings
- Ignore-list review against computed statuses found ~20 flags filed under 'Non-GA' that have since GA'd: reclassified 15 user-facing ones to real mappings (session sharing trio, AgentHarness, SshRemoteServer, ArtifactCommand, OzIdentityFederation, image-context pair, OzPlatformSkills, WorkflowAliases, ShellSelector, KittyImages, UndoClosedPanes, RevertDiffHunk), surfaced FullScreenZenMode as a visible undocumented-feature finding, and retitled the section so placement no longer asserts rollout status
- Re-baselined the snapshot after live drift the diff caught on today's checkouts (SuperGrok dogfood->ga, new /rename-conversation slash command; both now standing coverage findings)
- SKILL.md: documented the accounting contract and the end-to-end 'how every change path is caught' chain (new/promoted/removed surfaces, no-code-change launches via changelog net, parser rot via extraction guards, map rot via hygiene)

Co-Authored-By: Oz <oz-agent@warp.dev>
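The completeness-accounting idea (every extracted item lands in exactly one bucket, and anything left over is an integrity failure) can be sketched as below. The function name `account` and the first-bucket-wins priority are assumptions for illustration; the source only states that each item ends up in exactly one bucket.

```python
def account(items, buckets):
    """Assign each item to one bucket; return (assignment, unaccounted).

    `buckets` maps bucket name -> set of members, in priority order.
    An item present in several buckets is credited to the first one only,
    so the result is always a partition of the accounted items.
    """
    assignment = {}
    unaccounted = []
    for item in items:
        owners = [name for name, members in buckets.items() if item in members]
        if owners:
            assignment[item] = owners[0]
        else:
            unaccounted.append(item)
    return assignment, unaccounted
```

In a sketch like this, a non-empty `unaccounted` list is what would trigger the `integrity:accounting` exit described in the commit message.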
Warp's client code is open source at warpdotdev/warp; the audit now treats the public repo as the primary source:
- Repo auto-detection prefers a sibling checkout named 'warp' and falls back to 'warp-internal' for transitional environments
- New --warp flag is the primary CLI option; --warp-internal remains as a deprecated alias (same destination) so existing invocations keep working
- All docstrings, stderr messages, skip reasons, section headers, and SKILL.md guidance (requirements, audit descriptions, drift-watch command, scheduled-agent prompt) now reference the public warp client repo

Validated: auto-detect fallback, preferred 'warp' sibling resolution, explicit --warp, and the deprecated alias all run the full audit with clean completeness accounting.

Co-Authored-By: Oz <oz-agent@warp.dev>
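A deprecated alias that shares a destination, plus sibling-checkout detection with a fallback, might look like the following sketch. `build_parser` and `detect_warp_repo` are hypothetical names; only the flag spellings and the two directory names come from the commit message.

```python
import argparse
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="audit_docs.py")
    # Both spellings write to the same destination, so old invocations keep working.
    parser.add_argument(
        "--warp", "--warp-internal",
        dest="warp", type=Path, default=None,
        help="path to the warp client checkout (--warp-internal is a deprecated alias)",
    )
    return parser


def detect_warp_repo(docs_root: Path) -> Optional[Path]:
    """Prefer a sibling named 'warp', then fall back to 'warp-internal'."""
    for name in ("warp", "warp-internal"):
        candidate = docs_root.parent / name
        if candidate.is_dir():
            return candidate
    return None
```

Registering both option strings on one `add_argument` call keeps a single code path downstream, which is why the alias needs no special handling.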
Co-Authored-By: Oz <oz-agent@warp.dev>
…uperGrok docs (#203)

* First drift-watch burn-down: settings, keybindings, slash commands, SuperGrok

Dogfoods the missing_docs drift-watch workflow on the standing findings backlog (41 findings resolved):
- all-settings.mdx: documented 17 missing settings (prompt submission mode, orchestration message display, auto handoff on sleep, agent attribution, handoff kill-switches, OSC 52 clipboard access, async find, directory tab colors, vertical-tabs panel options, hidden files in project explorer, line number mode, force X11, Ctrl+Enter submit for CLI agents, input focus on block selection) with types/defaults/options extracted from the settings registry; moved git_operations_autogen_enabled into [agents.warp_agent.active_ai] reflecting the agents.oz -> agents.warp_agent rename and removed the stale remnant section; added an Experimental section
- keyboard-shortcuts.mdx: fixed all 14 dead action names (10 renames like workspace:open_new_tab -> workspace:new_tab, editor_view:cmd_i -> editor_view:inspect_command, terminal:trigger_subshell_bootstrap -> terminal:warpify_subshell; 4 removed actions blanked)
- slash-commands.mdx: added /environment, /harness, /host (cloud agent session selectors) and /rename-conversation
- bring-your-own-api-key.mdx: documented connecting a SuperGrok subscription instead of an xAI API key (newly GA SuperGrok flag) + map entry
- Surface map: allowlisted guides/agent-workflows/warp-vs-claude-code as intentionally unlisted (per its frontmatter note)
- Re-applied the GA-flag reclassification from the previous session (15 mappings + FullScreenZenMode surfaced) which had been silently reverted by a 'git checkout' during integrity testing before it was committed; caught by comparing accounting bucket counts run-over-run
- Snapshot re-baselined; audit now reports 0 stale doc references, 0 undocumented settings/slash commands/unlisted pages; remaining backlog: 29 CLI subcommand docs, 24 OpenAPI spec gaps (update-open-api-spec), 29 terminology pages (style_lint), FullScreenZenMode + GroupedTabs

Co-Authored-By: Oz <oz-agent@warp.dev>

* docs: second drift-watch pass: settings, feature flags, map hygiene

Re-ran the missing_docs drift-watch audit against current code surfaces and burned down the newly-found in-scope drift (features 5→0, settings 6→0, map hygiene 2→0).

Settings (all-settings.mdx):
- Document appearance.icon.show_dock_icon (macOS Dock / Cmd-Tab visibility)
- Document agents.warp_agent.other.long_running_command_submission_mode
- Document code.editor.format_on_save
- Document cloud_platform.third_party_api_keys.gemini_enterprise_credentials_enabled
- Document warpify.ssh.reuse_existing_control_master
- Map warpify.ssh.ssh_tmux_deprecation_notice_pending -> internal (one-time migration banner state, not user-configurable)

Feature flags (feature_surface_map.md):
- Map CodexPlugin -> cli-agents/codex.md
- Map FullScreenZenMode, AsyncFind -> all-settings.mdx (surfaces are documented settings)
- Map CustomModelRouters -> inference/model-choice.mdx (new "Custom routers" section)
- Ignore GroupedTabs (macOS-only Preview; docs pending GA promotion)

Map hygiene:
- Prune stale ignore-list flags FreeUserNoAi and WelcomeTab (no longer in code)

Co-Authored-By: Oz <oz-agent@warp.dev>

* docs(missing_docs): codify finding-resolution patterns in SKILL.md

Add a "Resolution patterns" subsection capturing the per-type decision rules applied during the second drift-watch pass, so recurring runs resolve findings consistently:
- user-facing setting -> document in all-settings
- internal/state-only setting -> map `section.key -> internal`
- feature flag with a dedicated page -> map to it
- feature flag whose only surface is a documented setting -> map to that page
- preview/pre-launch feature with no docs -> ignore-list with a comment
- stale map entry/doc reference -> prune after confirming removal in code

Co-Authored-By: Oz <oz-agent@warp.dev>

* Update .agents/skills/missing_docs/references/feature_surface_map.md

Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>

---------

Co-authored-by: Oz <oz-agent@warp.dev>
Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>
Resolve conflicts from main's cloud-agents -> /platform/ restructure (#262) and reconcile independently-added settings docs:
- slash-commands.mdx: keep new /environment, /harness, /host, /rename-conversation rows; retarget moved cloud-agents links to /platform/.
- all-settings.mdx: keep the corrected [agents.warp_agent.active_ai] home for git_operations_autogen_enabled (verified toml_path in source) and drop main's re-added, incorrect [agents.oz.active_ai] section; retarget handoff/orchestration links to /platform/.
- feature_surface_map.md: take main's /platform/ targets for cloud-agents entries, keep RemoteCodebaseIndexing, and migrate remaining cloud-agents mappings (AgentHarness, OzHandoff, Handoff*, RunAgentsTool, NamedAgents) to /platform/.

Audit: map_hygiene 0, unaccounted none, no audits skipped. Build: 340 pages OK.

Co-Authored-By: Oz <oz-agent@warp.dev>
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request.
Overview
This PR overhauls the missing_docs skill, adds a generated surface snapshot, updates the curated surface map, and makes related documentation updates for slash commands, settings, BYOK, and model routing.
Concerns
- Phase 3 still instructs agents to update `astro.config.mjs` for sidebar entries even though this repo uses `src/sidebar.ts`, so drafted pages can remain unlisted.
- The settings reference documents `code.editor.show_hidden_files` with the wrong default value compared with the client setting definition.
- The newly added snapshot is missing settings that this same PR documents, which undermines the `--diff` baseline and the claimed completeness guarantees.
- The attached diff marks `.agents/skills/missing_docs/scripts/audit_docs.py` as patch-unavailable, so this review could not inspect or inline-comment the central audit implementation changes.
Verdict
Found: 0 critical, 3 important, 0 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
- Bold + dash format for list items: `* **Term** - Description`
6. Create the markdown file at the suggested path
7. Remind user to add new pages to the relevant `astro.config.mjs (sidebar config)`
7. Add new pages to the sidebar config in `astro.config.mjs`
This repo uses `src/sidebar.ts` for sidebar entries (and the new structure audit checks that file), so telling agents to edit `astro.config.mjs` will leave new pages unlisted.
Suggested change, old: 7. Add new pages to the sidebar config in `astro.config.mjs`
Suggested change, new: 7. Add new pages to the sidebar in `src/sidebar.ts`
Fixed in 5863b8c — Phase 3 step 7 now points to src/sidebar.ts (with a note that astro.config.mjs is only needed for a brand-new top-level topic).
"code.editor.show_global_search": "always_on",
"code.editor.show_hidden_files": "always_on",
"code.editor.show_project_explorer": "always_on",
"code.editor.use_warp_as_default_editor": "always_on",
The newly added snapshot is missing `code.editor.format_on_save`, even though this PR documents it in `all-settings.mdx`; regenerate the snapshot after extracting all documented settings, and also verify `appearance.icon.show_dock_icon`, `agents.warp_agent.other.long_running_command_submission_mode`, `warpify.ssh.reuse_existing_control_master`, and `cloud_platform.third_party_api_keys.gemini_enterprise_credentials_enabled`.
Fixed in 5863b8c — regenerated surface_snapshot.json; it now includes code.editor.format_on_save plus appearance.icon.show_dock_icon, agents.warp_agent.other.long_running_command_submission_mode, warpify.ssh.reuse_existing_control_master, and cloud_platform.third_party_api_keys.gemini_enterprise_credentials_enabled (and current flag/CLI/API state).
* `show_code_review_diff_stats` — Whether to show lines added/removed counts on the code review button. Type: boolean. Default: `true`.
* `show_project_explorer` — Whether the project explorer is shown in the tools panel. Type: boolean. Default: `true`.
* `show_global_search` — Whether global file search is shown in the tools panel. Type: boolean. Default: `true`.
* `show_hidden_files` — Whether hidden files (dotfiles) are shown in the project explorer. Type: boolean. Default: `false`.
`show_hidden_files` defaults to `true` in the client setting definition, so the reference now documents the wrong default.
Suggested change, old: * `show_hidden_files` — Whether hidden files (dotfiles) are shown in the project explorer. Type: boolean. Default: `false`.
Suggested change, new: * `show_hidden_files` — Whether hidden files (dotfiles) are shown in the project explorer. Type: boolean. Default: `true`.
Fixed in 5863b8c — corrected the default to true to match the client setting definition (code.editor.show_hidden_files defaults to true).
Add scripts/suggest_reviewers.py, which maps the source surface behind each drift-watch finding to its owning engineer using the CODEOWNERS-format ownership files already maintained in the code repos:
- warp client: .github/STAKEHOLDERS
- warp-server: .github/STAKEHOLDERS (advisory) + .github/CODEOWNERS (enforced)

It resolves with standard last-match-wins precedence, dedupes owners into users/teams, and prints a ready-to-run `gh pr edit --add-reviewer` command. Unresolved paths are non-fatal so a scheduled run is never blocked.

Wire it into SKILL.md: a new "Reviewer routing" section (per-category source-file hints), a drift-watch "Route reviewers" step before opening the PR, an updated scheduled-agent prompt, and a References entry. Ownership stays sourced from STAKEHOLDERS/CODEOWNERS (kept fresh by sync-stakeholders), never hardcoded.

Co-Authored-By: Oz <oz-agent@warp.dev>
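Last-match-wins resolution over a CODEOWNERS-format file can be sketched as below. This is a simplified illustration, not the logic of `suggest_reviewers.py`: the function names are hypothetical, the owners in the usage are made up, and pattern matching here covers only anchored directory prefixes, anchored files, and simple globs via `fnmatch`, which is narrower than full CODEOWNERS semantics.

```python
from fnmatch import fnmatch


def parse_owners(text):
    """Parse CODEOWNERS-style lines into (pattern, owners) rules, in file order."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def matches(pattern, path):
    """Simplified matching: anchored dir prefix, anchored file, or glob."""
    if pattern.startswith("/"):
        anchored = pattern.lstrip("/")
        if anchored.endswith("/"):
            return path.startswith(anchored)
        return fnmatch(path, anchored)
    return fnmatch(path, pattern) or fnmatch(path.rsplit("/", 1)[-1], pattern)


def owners_for(rules, path):
    """Later rules override earlier ones (last match wins)."""
    owners = []
    for pattern, rule_owners in rules:
        if matches(pattern, path):
            owners = rule_owners
    return owners


def split_owners(owners):
    """Dedupe and split into (users, teams); teams contain an org/ prefix."""
    users = sorted({o for o in owners if "/" not in o})
    teams = sorted({o for o in owners if "/" in o})
    return users, teams
```

Returning an empty list for an unmatched path, rather than raising, mirrors the stated goal that unresolved paths stay non-fatal.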
- SKILL.md: Phase 3 step 7 now points to src/sidebar.ts (the real sidebar source the structure audit checks); astro.config.mjs only for a new top-level topic.
- all-settings.mdx: correct code.editor.show_hidden_files default to `true` (matches the client setting definition).
- surface_snapshot.json: regenerate so it includes the newly documented settings (code.editor.format_on_save, appearance.icon.show_dock_icon, agents.warp_agent.other.long_running_command_submission_mode, warpify.ssh.reuse_existing_control_master, cloud_platform.third_party_api_keys.gemini_enterprise_credentials_enabled) and current flag/CLI/API state.

Audit: exit 0, audits_skipped none, unaccounted none.

Co-Authored-By: Oz <oz-agent@warp.dev>
…secret) (#278)

Sample output of one missing_docs drift-watch pass over the CLI backlog, to evaluate the skill's efficacy. Resolves 14 of 31 undocumented CLI commands:
- Draft: document the `oz api-key list/create/expire` subcommands in the CLI reference (reference/cli/api-keys.mdx), with flags/args extracted from crates/warp_cli/src/api_key.rs.
- Map (no duplication): `oz schedule *` and `oz secret *` subcommands are already documented in their feature pages, so add surface-map entries pointing there instead of re-drafting.

Reviewer routing (suggest_reviewers.py): crates/warp_cli ownership -> @bnavetta, @ianhodge.

Audit after: CLI findings 31 -> 17, exit 0, unaccounted none.

Co-authored-by: Oz <oz-agent@warp.dev>
- test_suggest_reviewers.py: 15 unit tests for reviewer resolution (CODEOWNERS matching incl. anchored dir-prefix / exact-file / glob / default rule, last-match-wins precedence, user vs team split, dedup, unresolved paths, warp-internal alias, stdin), all via temp ownership files.
- test_audit_docs.py: 6 behavioral integration tests that run audit_docs.py against the sibling code repos: clean exit + completeness accounting (unaccounted empty), --category scoping, --severity filtering, fail-loud (exit 2) on a missing repo, committed-snapshot currency, and that --update-snapshot honors --snapshot without mutating the committed snapshot. Skips gracefully when the code repos aren't checked out.

Both suites use only the Python stdlib (unittest), with no third-party deps. 21/21 pass. Documented under a new "## Tests" section in SKILL.md.

Co-Authored-By: Oz <oz-agent@warp.dev>
Make explicit that the skill must only document publicly released surfaces:
- New "Public vs. private surfaces" section in SKILL.md: the OSS warp client repo is public; warp-server is a PRIVATE repo whose only public surface is the released Oz Agent API already in the OpenAPI spec. Two gates (source/exposure + GA rollout); never document private or unreleased surfaces.
- Woven into the API audit description, Phase 3 API-gap research, and Resolution patterns: warp-server endpoints not in the released spec are not auto-documentable; route released ones via sync-openapi-spec, else `-> internal`/defer.

Apply the rule to detected drift: Agent Memory is research preview, so its `oz memory*` / `oz memory-store*` CLI and `/memory_stores/*` REST API are mapped `-> internal` with comments. Added a public/private POLICY note to the surface map's API section.

Audit after: CLI 17->15, API 29->18, audits_skipped none, unaccounted none.

Co-Authored-By: Oz <oz-agent@warp.dev>
…n CI

- test_audit_docs.py: add test_research_preview_surfaces_are_deferred, a regression guard asserting Agent Memory's CLI (`oz memory*`) and REST API (`/memory_stores/*`) are never flagged for documentation (public/private boundary). 22 tests total, all green.
- ci.yml: run the skill's stdlib test suites on every PR. The reviewer-resolver unit tests run fully; the audit integration tests skip gracefully since the warp/warp-server code repos aren't checked out in docs CI.
- SKILL.md: note the new boundary test in the Tests section.

Co-Authored-By: Oz <oz-agent@warp.dev>
Fixes the limitation that the CLI and API audits flagged commands/routes
regardless of whether their feature had shipped (so non-GA surfaces like Agent
Memory needed a permanent `-> internal`).
audit_docs.py:
- New `gated:<Flag>` surface-map target for CLI commands and API routes. The
audit resolves the gating flag's rollout status (the same machinery used for
feature flags + settings):
* non-GA (preview/dogfood/other) -> deferred (new `gated_non_ga` accounting
bucket), not a finding;
* GA -> falls through to normal coverage so it auto-surfaces as a finding;
* unknown flag -> conservative (still a finding) + map-hygiene error so the
annotation can't silently rot.
- audit_cli / audit_api now take flag_statuses; main computes it for API runs.
- Completeness accounting keeps totality (gated_non_ga counted; unaccounted none).
feature_surface_map.md: migrate Agent Memory's `oz memory*` / `oz memory-store*`
CLI and `/memory_stores/*` API from `-> internal` to `-> gated:AIMemories`, so
they auto-surface for docs when AIMemories goes GA. Documented the `gated:`
sentinel in the header.
SKILL.md: document `gated:<Flag>` in Public vs. private surfaces + Resolution
patterns.
tests: add TestGatedLogic (helper, non-GA deferral, GA auto-surface, unknown-flag
conservatism, map-hygiene validation). 27 tests pass; audit exit 0, unaccounted
none.
Co-Authored-By: Oz <oz-agent@warp.dev>
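The three-way `gated:<Flag>` resolution described in this commit can be sketched as a small function. The name `resolve_gated`, the returned labels other than `gated_non_ga`, and the status string `"ga"` are assumptions for illustration; the commit message only describes the behavior.

```python
def resolve_gated(target: str, flag_statuses: dict[str, str]) -> tuple[str, bool]:
    """Return (disposition, hygiene_error) for a `gated:<Flag>` surface-map target."""
    flag = target.removeprefix("gated:")
    status = flag_statuses.get(flag)
    if status is None:
        # Unknown flag: stay conservative (still a finding) and flag the map entry.
        return "finding", True
    if status == "ga":
        # GA: fall through to the normal coverage check so it can auto-surface.
        return "coverage", False
    # preview / dogfood / other: deferred into the gated_non_ga bucket.
    return "gated_non_ga", False
```

Treating an unknown flag as a finding plus a hygiene error is what keeps a stale annotation from silently hiding a surface.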
The gated:<Flag> work added a `gated_non_ga` bucket to the CLI and API completeness accounting, but SKILL.md's bucket list still omitted it. Document `gated_non_ga` for CLI commands and API routes, and note the `gated:<Flag>` sentinel alongside `internal` in the References section so the documented accounting matches what the audit actually emits. Co-Authored-By: Oz <oz-agent@warp.dev>
…, oz provider) (#279)

Demonstrates the missing_docs drift-watch loop end to end on real drift. The audit flagged 15 undocumented `oz` subcommands; this resolves all of them according to each command's GA rollout status:
- GA (NamedAgents): document the `oz agent` named-agent management group (list/get/create/update/delete + `oz agent skills`) and fix the existing `oz agent list` entry, which incorrectly described skill listing.
- GA (ConversationApi): document `oz run conversation get` and the `oz run message` inbox commands (list/read/watch/send/mark-delivered).
- Non-GA (ProviderCommand = dogfood): defer the whole `oz provider` group via `gated:ProviderCommand` so it auto-surfaces for docs when it goes GA.

All command flags drafted from crates/warp_cli source (agent.rs, task.rs, provider.rs). CLI audit now reports 0 gaps; cli_commands gated_non_ga = 14.

Co-authored-by: Oz <oz-agent@warp.dev>
The new "Managing named agents" section says agents are run with `oz agent run-cloud --agent <UID>`, but that flag was absent from the run-cloud key-flags list. Document `--agent <UID>` (from RunCloudArgs in crates/warp_cli/src/agent.rs) and cross-link the two sections so readers can get from creating a named agent to running one. Co-Authored-By: Oz <oz-agent@warp.dev>
Why
The recurring docs-coverage agent was missing shipped features. Root causes (all reproduced):
- … `warp-internal`/`warp-server` under `src/content/` (wrong since the Astro migration). Without explicit flags, every code audit was skipped with only a stderr warning, and the report said `Total gaps found: 0` with exit 0, which a scheduled agent reads as success.
- … `app/Cargo.toml`, missing the real cargo-feature→flag bridge in `app/src/features.rs` (e.g. `am_workflows` → `AgentModeWorkflows`) and ignoring `RELEASE_FLAGS`/`PREVIEW_FLAGS` entirely.
- … `router/router.go` (0 findings = false comfort), CLI subcommands were never parsed, and slash commands, settings, keybindings, the Oz web app, server tools, bundled skills, sidebar structure, and the docs changelog were all unaudited.
- … `oz agent` matching legitimate CLI examples.

What changed
`scripts/audit_docs.py` (rewritten: 9 coverage audits + change detection + completeness accounting)
- … `audits_skipped` when a repo is missing, an extraction sanity guard trips (implausibly low parse counts = source layout changed), or `integrity:accounting` fires (see below).
- … `features.rs` bridge + `RELEASE_FLAGS`; brace-safe enum parsing.
- … `hide = true` skipped; per-module `--flag` tracking.
- … `sync-openapi-spec`.
- … `define_setting!` registry vs all-settings reference, flag-status aware, object-typed settings handled), stale doc references (reverse checks: renamed settings like `agents.oz.*` → `agents.warp_agent.*`, dead keybinding actions), docs structure (pages missing from `src/sidebar.ts`), staleness, and map hygiene.

Completeness accounting (the no-slip guarantee)
- … `integrity:accounting` + exit 2 (only possible via audit-logic regression).

Change detection (`--diff` + snapshot v2)
- … `SuperGrok` promoted dogfood→ga and a new `/rename-conversation` slash command; both are now standing coverage findings and the snapshot was re-baselined.

`references/feature_surface_map.md`
- … `AgentHarness`, `SshRemoteServer`, `ArtifactCommand`, `OzIdentityFederation`, image-context pair, `OzPlatformSkills`, `WorkflowAliases`, `ShellSelector`, `KittyImages`, `UndoClosedPanes`, `RevertDiffHunk`); surfaced `FullScreenZenMode` as a visible undocumented-GA finding; new slash/settings/unlisted-pages sections + API `internal` sentinels.

`SKILL.md`
- … `validate_ui_refs`, `sync-error-docs`, `check_for_broken_links`/`weekly-404-monitor`, `style_lint`).

Public repo retarget
- … `warp` (falling back to `warp-internal` for transitional environments), and `--warp` is the primary CLI flag with `--warp-internal` kept as a deprecated alias. All script messages and SKILL.md guidance (including the scheduled-agent prompt) reference the public repo.

Validation
- `py_compile` clean; full audit ~1.2s; snapshot regeneration byte-identical; fresh `--diff` reports 0.
- … `FullScreenZenMode`, `GroupedTabs`, `SuperGrok`), 1 unlisted page, 29 stale-terminology pages.

No docs-site content changes, so no site build needed.
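For readers unfamiliar with the change-detection idea in this description, a snapshot diff that reports added, removed, and promoted surfaces could be sketched as follows. The function name `diff_snapshots` and the flat name-to-status mapping are assumptions; the real `surface_snapshot.json` layout is not shown on this page.

```python
def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {surface_name: rollout_status} snapshots.

    Assumes a flat mapping for illustration; the real snapshot schema may be nested.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    promoted = sorted(
        name for name in old.keys() & new.keys()
        if old[name] != "ga" and new[name] == "ga"
    )
    return {"added": added, "removed": removed, "promoted": promoted}
```

Sorting each list keeps the report stable from run to run, which matters when the output is compared against a committed baseline.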
Conversation: https://staging.warp.dev/conversation/a706b73b-186b-456c-ad18-28390e36cb89
Run: https://oz.staging.warp.dev/runs/019eb51e-4d35-7d62-a494-331f39efbd67
Plans:
This PR was generated with Oz.