gh-151292: _remote_debugging: Do not corrupt the binary file when hitting OverflowError#152892
gh-151292: _remote_debugging: Do not corrupt the binary file when hitting OverflowError#152892maurycy wants to merge 5 commits into
_remote_debugging: Do not corrupt the binary file when hitting OverflowError#152892Conversation
| self._writer.write_sample(stack_frames, timestamp_us) | ||
| try: | ||
| self._writer.write_sample(stack_frames, timestamp_us) | ||
| except OverflowError as e: |
There was a problem hiding this comment.
@pablogsal Truth be told, I'm not 100% sure what's the best layer for handling "finalizable" exceptions like OverflowError.
The promise of "finalizing the file" is in the BinaryWriter:
cpython/Modules/_remote_debugging/module.c
Lines 1857 to 1864 in 3428959
This is not the case:
cpython/Lib/profiling/sampling/binary_collector.py
Lines 115 to 120 in 3428959
cpython/Modules/_remote_debugging/module.c
Line 1874 in 3428959
Perhaps it's something as simple as weakening exc_type == Py_None, instead of doing a round-trip since the module should contain all the knowledge (ie: which exceptions are finalizable for it.)
This issue presents itself there since we've got both the unwinder and the collector in the bloc :
cpython/Lib/profiling/sampling/sample.py
Line 180 in 3428959
...and the meaning of MemoryError, RuntimeError or OSError in the unwinder is non-fatal, but for the binary collector it might mean non-finalizable exception.
Of course - the very proper way to approach this is to maintain the finalizable state is in the writer itself, instead of relying on the exceptions... but that's way too much for this PR.
The #151292 issue focuses on
total_samples:u32but a big problem is that we're leaving the file corrupted on anyOverflowError.So, for example, someone ran recording for an hour on the production - hit some limit and boom, it's unreadable because the header wasn't finalized:
The PR fixes this by catching the exception and stopping the sampler gracefully and letting the
exportfinalize. As a result, the finalizer cannot raise theOverflowErroreither.There are many overflow errors:
cpython/Modules/_remote_debugging/binary_io_writer.c
Lines 632 to 636 in f8514dc
cpython/Modules/_remote_debugging/binary_io_writer.c
Lines 681 to 685 in f8514dc
cpython/Modules/_remote_debugging/binary_io_writer.c
Lines 377 to 381 in f8514dc
cpython/Modules/_remote_debugging/binary_io_writer.c
Lines 396 to 400 in f8514dc
cpython/Modules/_remote_debugging/binary_io_writer.c
Lines 444 to 448 in f8514dc
cpython/Modules/_remote_debugging/binary_io_writer.c
Lines 493 to 497 in f8514dc
cpython/Modules/_remote_debugging/binary_io_writer.c
Lines 1054 to 1059 in f8514dc
cpython/Modules/_remote_debugging/binary_io_writer.c
Lines 55 to 63 in f8514dc
Given the limits (
u32oru64) isn't easy to write a test for it, so I just settled for the interpreter._remote_debugging: binary format onlytotal_samples:u32#151292