Python Debug Tools¶
Spoiler: Python 3.6 and newer provide a much better debugging experience!
See also Debug CPython with gdb.
Command to enable debug tools:
- Python 3.7 and newer:
python3 -X dev
- Python 3.6 and newer:
PYTHONMALLOC=debug python3 -Wd -X faulthandler
Get a traceback on a crash¶
Example of a crash, crash.py:

    import ctypes

    def bug():
        ctypes.string_at(0)

    bug()

Output:

    $ python3 crash.py
    Segmentation fault (core dumped)
… not very helpful :-( Enable faulthandler to get the Python traceback where
the crash occurred.
python3 -X dev (Python 3.7 and newer) enables faulthandler automatically.
faulthandler provides a traceback on a crash:

    $ python3 -X faulthandler crash.py
    Fatal Python error: Segmentation fault

    Current thread 0x00007f3bb3998700 (most recent call first):
      File "/usr/lib64/python3.6/ctypes/__init__.py", line 487 in string_at
      File "crash.py", line 4 in bug
      File "crash.py", line 6 in <module>
    Segmentation fault (core dumped)
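The -X faulthandler option has the same effect as calling faulthandler.enable() at startup. As a minimal sketch, the snippet below uses faulthandler.dump_traceback() to produce the same kind of report on demand, without crashing the process. The helper name dump_current_traceback() and the temporary file are illustrative choices, not part of the example above.

```python
import faulthandler
import tempfile


def dump_current_traceback():
    """Return the report faulthandler writes for all running threads."""
    # faulthandler writes straight to a file descriptor, so it needs a real
    # file (an io.StringIO object would not work here).
    with tempfile.TemporaryFile(mode="w+") as log:
        faulthandler.dump_traceback(file=log, all_threads=True)
        log.seek(0)
        return log.read()


report = dump_current_traceback()
print(report)
```

The report uses the same "most recent call first" layout as the crash output, which makes it easy to recognize in logs.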
To debug a deadlock,
faulthandler.dump_traceback_later() can be used
to implement a “watchdog”: if the Python main code is blocked for longer than
N seconds, dump the traceback where Python is stuck, and exit Python.
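The watchdog described above could be sketched as follows. The names run_with_watchdog() and stuck() are hypothetical, and the timeout is kept tiny so the example finishes quickly. This sketch passes exit=False so the process survives; a real watchdog would likely pass exit=True to also exit Python.

```python
import faulthandler
import tempfile
import time


def run_with_watchdog(func, timeout, log, exit_on_timeout=False):
    """Call func(); if it runs longer than `timeout` seconds, dump tracebacks."""
    # Arm the watchdog: a helper thread dumps the traceback of all threads to
    # `log` once the timeout expires. With exit=True, Python would also exit.
    faulthandler.dump_traceback_later(timeout, exit=exit_on_timeout, file=log)
    try:
        return func()
    finally:
        # Disarm the watchdog (harmless if it has already fired).
        faulthandler.cancel_dump_traceback_later()


def stuck():
    # Stand-in for code blocked on a lock or a socket.
    time.sleep(0.5)


with tempfile.TemporaryFile(mode="w+") as log:
    run_with_watchdog(stuck, timeout=0.1, log=log)
    log.seek(0)
    watchdog_report = log.read()

print(watchdog_report)
```

The dump starts with a "Timeout" line followed by the stack of each thread, so the frame inside stuck() shows where the code was blocked.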
See also: Crash reporting in desktop Python applications by Nikhil Marathe and Max Bélanger (November 2018).
Example which doesn’t explicitly close a file, filebug.py:

    def func():
        f = open(__file__)
        f = None

    func()
Output (or lack of output):
$ python3 filebug.py
… ResourceWarning warnings are hidden by default:
    $ python3.6 -c 'import pprint, warnings; pprint.pprint(warnings.filters)'
    [('ignore', None, <class 'DeprecationWarning'>, None, 0),
     ...
     ('ignore', None, <class 'ResourceWarning'>, None, 0)]
Use python3 -X dev (Python 3.7 and newer) or python3 -Wd (Python 3.6
and older) to display ResourceWarning warnings:

    $ python3 -Wd filebug.py
    filebug.py:3: ResourceWarning: unclosed file <_io.TextIOWrapper name='filebug.py' mode='r' encoding='UTF-8'>
      f = None
On Python 3.6 and newer, enabling tracemalloc shows where the resource (file in this example) has been created:
    $ python3 -Wd -X tracemalloc=5 filebug.py
    filebug.py:3: ResourceWarning: unclosed file <_io.TextIOWrapper name='filebug.py' mode='r' encoding='UTF-8'>
      f = None
    Object allocated at (most recent call first):
      File "filebug.py", lineno 2
        f = open(__file__)
      File "filebug.py", lineno 5
        func()
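ResourceWarning can also be made visible from code, for example in a test. The sketch below is an assumption-laden variant of filebug.py: it opens os.devnull instead of __file__ so it runs anywhere, and it records warnings with warnings.catch_warnings() rather than printing them to stderr.

```python
import gc
import os
import warnings


def func():
    f = open(os.devnull)  # never closed explicitly
    f = None              # last reference dropped: the file is finalized


with warnings.catch_warnings(record=True) as caught:
    # Similar to -Wd, but restricted to ResourceWarning and to this block.
    warnings.simplefilter("always", ResourceWarning)
    func()
    gc.collect()  # only needed on implementations without reference counting

unclosed = [w for w in caught if issubclass(w.category, ResourceWarning)]
for w in unclosed:
    print(f"{w.filename}:{w.lineno}: {w.message}")
```

Recording the warnings this way lets a test suite fail when a file is left unclosed, instead of relying on someone reading stderr.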
Debug memory errors¶
Memory management in C is complex and error-prone.
Python has multiple allocators which are more or less compatible, but not
always. For example, PyMem_Malloc() uses
malloc() in Python 3.5 and older, but
pymalloc in Python 3.6 and newer.
Releasing memory allocated by PyMem_Malloc() using free()
worked until Python 3.5, but “can” crash on Python 3.6 (depending on whether
the memory block is larger than 512 bytes or not…).
Since Python 3.6, the new PYTHONMALLOC environment variable makes it possible to change the memory allocator at runtime (when starting Python).
PYTHONMALLOC=debug enables the Python builtin memory debugger.
python3 -X dev (Python 3.7 and newer) enables it automatically.
Example of a buffer overflow, membug.py:

    import _testcapi

    def main():
        _testcapi.pymem_buffer_overflow()

    main()

Output:

    $ PYTHONMALLOC=debug ./python membug.py
    Debug memory block at address p=0x7f7c0ed9f160: API 'm'
        16 bytes originally requested
        The 7 pad bytes at p-7 are FORBIDDENBYTE, as expected.
        The 8 pad bytes at tail=0x7f7c0ed9f170 are not all FORBIDDENBYTE (0xfb):
            at tail+0: 0x78 *** OUCH
            at tail+1: 0xfb
            at tail+2: 0xfb
            at tail+3: 0xfb
            at tail+4: 0xfb
            at tail+5: 0xfb
            at tail+6: 0xfb
            at tail+7: 0xfb
        The block was made by call #28431 to debug malloc/realloc.
        Data at p: cb cb cb cb cb cb cb cb cb cb cb cb cb cb cb cb

    Fatal Python error: bad trailing pad byte

    Current thread 0x00007f7c0ee875c0 (most recent call first):
      File "membug.py", line 4 in main
      File "membug.py", line 6 in <module>
    Aborted (core dumped)
Python dumps the current traceback where the bug has been detected, but it can be “too late”.
On Python 3.6 and newer, enabling tracemalloc makes it possible to find where the memory block has been allocated, which can help to investigate the bug (truncated output to highlight the difference):
    $ PYTHONMALLOC=debug ./python -X tracemalloc=5 membug.py
    (...)
    Memory block allocated at (most recent call first):
      File "membug.py", line 4
      File "membug.py", line 6
    (...)

Traceback with source code recreated manually:

    Memory block allocated at (most recent call first):
      File "membug.py", line 4
        _testcapi.pymem_buffer_overflow()
      File "membug.py", line 6
        main()
On this artificial example, the current Python traceback and memory block allocation traceback are the same, but usually they are different.
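The same "where was this allocated?" question can be asked from Python code with tracemalloc.get_object_traceback(). The following is only a sketch for Python objects (it does not detect buffer overflows), and make_buffer() is a hypothetical stand-in for the code that allocates the suspicious object.

```python
import tracemalloc

# Store up to 5 frames per allocation, like -X tracemalloc=5.
tracemalloc.start(5)


def make_buffer():
    return bytearray(1024)


buf = make_buffer()

# Returns None if tracemalloc was not tracing when the object was allocated.
allocated_at = tracemalloc.get_object_traceback(buf)
if allocated_at is not None:
    print("Memory block allocated at (most recent call first):")
    for line in allocated_at.format(most_recent_first=True):
        print(line)

tracemalloc.stop()
```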
Sadly, on Python 3.5 and older, the only way to get the Python builtin memory
debugger is to recompile Python in debug mode (which changes the ABI…).
PYTHONMALLOC=malloc valgrind python3 script.py can also be used to debug
C extensions which directly use
malloc()/free(), and not the Python memory allocators.
Use the suppression file which can be found in Misc/valgrind-python.supp.
You might want to call these functions in a running process from gdb:
- _PyUnicode_Dump(obj): dump properties of the Unicode object, not its content
- PyErr_Occurred(): get the current exception, NULL if no exception has been raised
- if the py-bt command is broken, try to call
_Py_DumpTracebackThreads(2, interp, tstate), where
2 is the file descriptor 2:
- Python 3.8: get
Write core dumps in the current directory:

    $ ulimit -c unlimited
    $ sudo bash -c 'echo "coredump-%e.%p" > /proc/sys/kernel/core_pattern'
Check that it works:
    $ ./python -c 'import _testcapi, signal; _testcapi.raise_signal(signal.SIGABRT)'
    Aborted (core dumped)
    $ ls coredump*
    coredump-python.23861
    $ gdb ./python -c coredump-python.23861
    GNU gdb (GDB) Fedora 8.0.1-36.fc27
    (...)
    Core was generated by `./python -c import _testcapi, signal; _testcapi.raise_signal(signal.SIGABRT)'.
    Program terminated with signal SIGABRT, Aborted.
    (gdb) where
    #0 0x00007fb0cb3ad050 in raise () from /lib64/libpthread.so.0
    #1 0x00007fb0c3a53006 in test_raise_signal (self=<module at remote 0x7fb0cb624758>,
Ok, Python crashes generate coredump files and gdb is able to load them.
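When the shell cannot be configured (for example, a service started by a supervisor), the core size limit may also be raised from inside the process. This is a Unix-only sketch using the resource module; the helper name enable_core_dumps() is hypothetical. It only matches ulimit -c unlimited when the hard limit is itself unlimited, and it does not touch /proc/sys/kernel/core_pattern.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit (Unix only)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


soft, hard = enable_core_dumps()
print("core size limit:", "unlimited" if soft == resource.RLIM_INFINITY else soft)
```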
Put a breakpoint:
- hit ‘m’, search ‘test_api’ to open glance.tests.unit.test_api
The tracemalloc module traces Python memory allocations. It can be used to find memory leaks, or just to have an accurate measure of the memory allocated by Python.
- Write a scenario to reproduce the memory leak. The ideal is a scenario taking only a few minutes
- Enable tracemalloc and replay the scenario
- Take tracemalloc snapshots regularly
- Compare snapshots
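The steps above can be sketched with the tracemalloc API. The leak here is artificial: scenario() and the leaked list are hypothetical stand-ins for the real application code, and the number of stored frames (10) is an arbitrary choice.

```python
import tracemalloc

leaked = []  # hypothetical leak: objects accumulate here and are never freed


def scenario():
    # Stand-in for one run of the real scenario.
    leaked.extend(bytearray(1024) for _ in range(100))


tracemalloc.start(10)  # keep up to 10 frames per allocation

scenario()  # warm-up run, so one-time allocations are not counted as a leak
before = tracemalloc.take_snapshot()
for _ in range(5):  # replay the scenario
    scenario()
after = tracemalloc.take_snapshot()

tracemalloc.stop()

# Group allocations by source line and show the biggest differences first.
top_stats = after.compare_to(before, "lineno")
for stat in top_stats[:3]:
    print(stat)
```

A line whose size keeps growing between snapshots, like the bytearray line here, is a good candidate for the leak. Grouping by "traceback" instead of "lineno" shows the full call chain of each allocation.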
If your application only uses Python memory allocators, tracemalloc should show you the exact memory usage, counting every single byte.
If a C extension uses other memory allocators like malloc(), tracemalloc
is unable to trace these allocations.
If the application allocates a lot of memory to process some data (memory peak) and then releases almost all memory, except a few small objects, the memory may become fragmented. For example, the application only uses 20 MB whereas the operating system sees 24 or 30 MB.
See pytracemalloc: backport to Python 2.7 (need to patch and compile Python manually).
Debug crash in garbage collection (visit_decref)¶
It’s really hard to investigate such a crash. Usually a crash in the GC is only the symptom that something corrupted a Python object, and the crash can occur long after the object has been corrupted.
You might attempt:
- Try python3 -X dev.
- Try Python compiled in debug mode.
- Try a more recent Python version. Maybe it’s a bug in Python which is already fixed?
- List third-party C extensions and look at them first. Usually, if you are the only one to see a crash, the bug is in your own code or environment. Maybe a C extension doesn’t update reference counters correctly.
- Change GC thresholds:
gc.set_threshold(5). See [Python-Dev] Idea: reduce GC threshold in development mode (-X dev).
- Disable the GC completely:
gc.disable(). It helped the reporter of bpo-2546 to find his bug.
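The threshold change from the list above could be wrapped in a small context manager, so that only the suspicious part of the program runs with very frequent collections. This is a sketch: aggressive_gc() and workload() are hypothetical names, and the idea is that collecting more often may make the crash occur closer to the code that corrupted the object.

```python
import contextlib
import gc


@contextlib.contextmanager
def aggressive_gc(threshold0=5):
    """Temporarily lower the first GC threshold so collections run very often."""
    previous = gc.get_threshold()
    gc.set_threshold(threshold0)
    try:
        yield
    finally:
        gc.set_threshold(*previous)


def workload():
    # Stand-in for the code which crashes; it creates many container objects.
    return [[i] for i in range(1000)]


original = gc.get_threshold()
with aggressive_gc():
    threshold_in_block = gc.get_threshold()[0]
    workload()

print("threshold inside the block:", threshold_in_block)
print("threshold restored to:", gc.get_threshold())
```

The opposite experiment, calling gc.disable() before the workload and gc.enable() after it, can be written the same way.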
Python issues related to visit_decref():
- 2019-10-07: Python segfaults when configured with --with-pydebug --with-trace-refs.
_PyObject_IsFreed() regression, it detected
PyObject._ob_next=NULL as a bug (when Python is built using
./configure --with-pydebug --with-trace-refs). For example, the
Py_None object is not tracked by the
sys.getobjects() circular list. Parent object type:
- 2019-10-07: Ensure that objects entering the GC are valid
- 2019-09-09: visit_decref(): add an assertion to check that the object is not freed
- 2019-09-05: reference counter issue in signal module. Missing
Py_INCREF() in the initialization of the
_signal module. Crash occurs while
_PyImport_Cleanup() is clearing the
_signal module. Parent object type:
- 2019-03-21: Add gc.enable_object_debugger(): detect corrupted Python objects in the GC
- 2018-07-10: int(s), float(s) and others may cause segmentation fault. Buffer overflow in an
int object. Parent object type:
- 2018-06-07: contextvars: hamt_alloc() must initialize h_root and h_count fields.
hamt_alloc() tracks an object in the GC before it is fully initialized:
h_root which is not initialized yet. Parent object type:
- 2017-08-10: Segfault in gcmodule.c:360 visit_decref (PyObject_IS_GC(op)). Crash occurs in 3rd party project pyETL: no reproducer has been provided.
- 2016-07-17: Segfault in gcmodule.c:360 visit_decref. Related to pip, wheel and cffi.
Parent object type:
- 2014-02-06: python: Modules/gcmodule.c:379: visit_decref: Assertion
‘((gc)->gc.gc_refs >> (1)) != 0’ failed. Crash at Python exit related to
daemon threads spawned by asyncio. Also someone reported a bug in cx_Oracle,
likely a corrupted exception: crash in visit_decref() called by
Parent object type:
traceback, visited object type:
- 2013-02-19: python-2.7.3-r3: crash in visit_decref(). Application using numpy, matplotlib,
expat, and cElementTree. Parent object type:
- 2012-07-01: SEGFAULT in visit_decref. Reference counting issue.
Parent object type:
- 2008-04-04: Python-2.5.2: crash in visit_decref () at Modules/gcmodule.c:270.
Bug in a C extension,
char* string passed as a