Skip to content

gh-151292: _remote_debugging: Do not corrupt the binary file when hitting OverflowError#152892

Open
maurycy wants to merge 5 commits into
python:mainfrom
maurycy:graceful-binary-overflow
Open

gh-151292: _remote_debugging: Do not corrupt the binary file when hitting OverflowError#152892
maurycy wants to merge 5 commits into
python:mainfrom
maurycy:graceful-binary-overflow

Conversation

@maurycy

@maurycy maurycy commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

The #151292 issue focuses on total_samples:u32 but a big problem is that we're leaving the file corrupted on any OverflowError.

So, for example, someone ran recording for an hour on the production - hit some limit and boom, it's unreadable because the header wasn't finalized:

2026-06-11T02:03:58.920689000+0200 maurycy@gimel /Users/maurycy/src/github.com/maurycy/cpython (vmremap 0bdde7f?) % head -c 128 /tmp/overflow.bin | xxd
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 28b5 2ffd 0058 5c1d 013a d500 1621 8025  (./..X\..:...!.%
00000050: 49d2 0133 3003 331a 3330 3d32 35a9 ccc0  I..30.3.30=25...
00000060: 3aa9 049c b5d6 b69d d296 3122 4488 0c6d  :.........1"D..m
00000070: 0148 0150 014f 0946 32bd b931 318c b6de  .H.P.O.F2..11...

The PR fixes this by catching the exception and stopping the sampler gracefully and letting the export finalize. As a result, the finalizer cannot raise the OverflowError either.

There are many overflow errors:

if (entry->pending_rle_count > UINT32_MAX - writer->total_samples) {
PyErr_SetString(PyExc_OverflowError,
"too many samples for binary format");
return -1;
}

if (writer->total_samples == UINT32_MAX) {
PyErr_SetString(PyExc_OverflowError,
"too many samples for binary format");
return -1;
}

if (writer->string_count >= UINT32_MAX) {
PyErr_SetString(PyExc_OverflowError,
"too many strings for binary format");
return -1;
}

if ((uintmax_t)str_len > UINT32_MAX) {
PyErr_Format(PyExc_OverflowError,
"string length %zd exceeds binary format maximum %u",
str_len, UINT32_MAX);
return -1;

if (writer->frame_count >= UINT32_MAX) {
PyErr_SetString(PyExc_OverflowError,
"too many frames for binary format");
return -1;
}

if (writer->thread_count >= UINT32_MAX) {
PyErr_SetString(PyExc_OverflowError,
"too many threads for binary format");
return NULL;
}

if (interp_id_long > UINT32_MAX) {
PyErr_Format(PyExc_OverflowError,
"interpreter_id %lu exceeds maximum value %lu",
interp_id_long, (unsigned long)UINT32_MAX);
return -1;
}

if (old_cap > SIZE_MAX / 2) {
PyErr_SetString(PyExc_OverflowError, "Array capacity overflow");
return -1;
}
size_t new_cap = old_cap * 2;
if (new_cap > SIZE_MAX / elem_size1 || new_cap > SIZE_MAX / elem_size2) {
PyErr_SetString(PyExc_OverflowError, "Array allocation size overflow");
return -1;

Given the limits (u32 or u64) isn't easy to write a test for it, so I just settled for the interpreter.

self._writer.write_sample(stack_frames, timestamp_us)
try:
self._writer.write_sample(stack_frames, timestamp_us)
except OverflowError as e:

@maurycy maurycy Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pablogsal Truth be told, I'm not 100% sure what's the best layer for handling "finalizable" exceptions like OverflowError.

The promise of "finalizing the file" is in the BinaryWriter:

/*[clinic input]
_remote_debugging.BinaryWriter.__exit__
exc_type: object = None
exc_val: object = None
exc_tb: object = None
Exit context manager, finalizing the file.
[clinic start generated code]*/

This is not the case:

def __exit__(self, exc_type, exc_val, exc_tb):
"""Context manager exit - finalize unless there was an error."""
if exc_type is None:
self._writer.finalize()
else:
self._writer.close()

/* Only finalize on normal exit (no exception) */

Perhaps it's something as simple as weakening exc_type == Py_None, instead of doing a round-trip since the module should contain all the knowledge (ie: which exceptions are finalizable for it.)

This issue presents itself there since we've got both the unwinder and the collector in the bloc :

except (RuntimeError, UnicodeDecodeError, MemoryError, OSError):

...and the meaning of MemoryError, RuntimeError or OSError in the unwinder is non-fatal, but for the binary collector it might mean non-finalizable exception.

Of course - the very proper way to approach this is to maintain the finalizable state is in the writer itself, instead of relying on the exceptions... but that's way too much for this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant