Skip to content

feat(mem): CBM_MAX_MEMORY_MB explicit memory-budget override (#580)#586

Closed
yangsec888 wants to merge 1 commit into
DeusData:mainfrom
yangsec888:feat/max-memory-budget-580
Closed

feat(mem): CBM_MAX_MEMORY_MB explicit memory-budget override (#580)#586
yangsec888 wants to merge 1 commit into
DeusData:mainfrom
yangsec888:feat/max-memory-budget-580

Conversation

@yangsec888

Copy link
Copy Markdown
Contributor

Summary

Adds a CBM_MAX_MEMORY_MB env var that overrides the ram_fraction × total_RAM memory budget with an explicit cap in MiB. Closes the --max-memory request in #580 and gives #563 a workaround on hosts without cgroup limits (macOS / bare metal).

Motivation

The knowledge graph is built fully in memory and the budget is sized to 50% of detected RAM. Two gaps remain even after cgroup-aware detection landed:

  1. No cgroup on the host (macOS in Feature request: --max-memory flag for RAM-constrained hosts #580/Index failed in large repo #563, Windows in Memory leak: process grows to 50+ GB virtual memory over hours/days, crashes Windows #581) → budget is 50% of physical RAM, which can still swap-thrash an 8 GB machine.
  2. Wanting a budget below the cgroup limit. In a container the parent process and any sibling/subprocesses share the cgroup; 0.5 × cgroup can still be too high. There was no way to pin a lower, explicit budget.

CBM_MAX_MEMORY_MB lets the operator set the in-memory budget directly, with the same precedence shape as the existing CBM_WORKERS override: explicit override > implicit detection.

Changes

  • src/foundation/mem.c
    • New pure helper cbm_mem_resolve_budget(total_ram, ram_fraction, max_memory_mb) — single source of truth for the budget number (no globals/env), testable.
    • cbm_mem_init reads CBM_MAX_MEMORY_MB via cbm_safe_getenv, applies it, logs source=env|ram_fraction on the mem.init line, and warns on mem.max.invalid (non-numeric/≤0) / mem.max.clamped (above total RAM).
  • src/foundation/mem.h — declare cbm_mem_resolve_budget; document the env var.
  • README.md — add CBM_MAX_MEMORY_MB to the env-var table.
  • tests/test_mem.c — 6 cases over the pure resolver.

Semantics

CBM_MAX_MEMORY_MB Resulting budget
unset / empty ram_fraction × total_ram (unchanged default)
positive integer ≤ total exactly that many MiB
positive integer > total clamped to total_ram (+ mem.max.clamped warn)
0, negative, non-numeric ignored, fall back to fraction (+ mem.max.invalid warn)

mem.init now logs the chosen source, e.g.:

level=info msg=mem.init budget_mb=5632 total_ram_mb=12288 source=env

Design notes

  • The override is clamped to total_ram when known so it can never claim more than physical/cgroup RAM, but is honored as-is when detection returns 0 (so it also rescues hosts where RAM detection fails).
  • Budget math is extracted into a pure function because cbm_mem_init is one-shot per process (atomic CAS) and can't be re-exercised in-process.

Testing

  • tests/test_mem.c resolver cases.
  • Verified standalone against the compiled cbm_mem_resolve_budget (12/12 checks pass).
  • clang-format --dry-run --Werror clean on all changed files.
  • Manual: CBM_MAX_MEMORY_MB=512 codebase-memory-mcp …mem.init … source=env.

Add a CBM_MAX_MEMORY_MB env var that overrides the ram_fraction × total_RAM
budget with an explicit cap in MiB. Lets RAM-constrained hosts (no cgroup) cap
the in-memory graph, and lets containers pin a budget below the cgroup limit so
headroom is left for sibling processes. Same precedence as CBM_WORKERS:
explicit override > implicit detection.

Budget math is extracted into a pure, testable cbm_mem_resolve_budget() since
cbm_mem_init is one-shot per process. Logs source=env|ram_fraction on mem.init
and warns on invalid/clamped values.

Closes DeusData#580.

Signed-off-by: Sam Li <yangsec888@gmail.com>
@DeusData

Copy link
Copy Markdown
Owner

Huge thanks for opening this PR and for the work you put into it.

The maintainer shop is currently full, so this may sit for a bit before it gets a proper review. We will come back to this as soon as possible with real feedback; I wanted to make sure it did not sit unacknowledged in the meantime.

@DeusData DeusData added enhancement New feature or request stability/performance Server crashes, OOM, hangs, high CPU/memory priority/high Needs near-term maintainer attention; high-impact bug, regression, safety issue, or release blocker. labels Jun 29, 2026
@DeusData

DeusData commented Jul 4, 2026

Copy link
Copy Markdown
Owner

Quick status note: this PR is one of four open memory/RAM-policy changes (#833, #752, #586, #685) that we've reviewed individually and found genuinely complementary — so rather than merging them piecemeal, we're doing a combined design pass over the whole memory policy (explicit override, host-tiered defaults, retention bounds, post-index release, and the Windows auto-sync driver in #841) and will respond here with a concrete direction shortly. Your work is very much part of that plan — thanks for your patience!

@DeusData

DeusData commented Jul 4, 2026

Copy link
Copy Markdown
Owner

Thank you for this — the engineering was genuinely good (a pure, testable budget resolver with clamp-to-total-RAM and source logging). The reason we're closing it: main already has an equivalent explicit override, CBM_MEM_BUDGET_MB (checked first in cbm_mem_init, before any RAM-fraction logic, so it wins over the tiered default). Adding CBM_MAX_MEMORY_MB alongside it would give two env vars for one purpose with subtly different clamp semantics. If you'd like, your clamp + source= logging ideas would be a welcome small hardening of the existing CBM_MEM_BUDGET_MB path (extract its inline logic into your pure resolver, add the clamp + a README row it currently lacks) — happy to review that. Closing in favor of the existing knob; thanks for the solid work.

@DeusData DeusData closed this Jul 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request priority/high Needs near-term maintainer attention; high-impact bug, regression, safety issue, or release blocker. stability/performance Server crashes, OOM, hangs, high CPU/memory

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants