Unstable tests¶
The multiprocessing tests leaked a lot of resources. Victor Stinner and others fixed dozens of bugs in these tests.
See also: Enable tracemalloc to get ResourceWarning traceback.
How to write reliable tests¶
Don’t use sleep as synchronization¶
Don’t use a sleep as a synchronization primitive between two threads or two processes. It will fail sooner or later.
Threads: use threading.Event
Processes: use a pipe (os.pipe()), write a byte when ready, read to wait
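The two recommendations above can be sketched as follows. This is a minimal illustration, not code from the CPython test suite: the function names and the child program are hypothetical, and the process variant assumes a POSIX platform because it relies on the pass_fds parameter of subprocess.Popen.

```python
import os
import subprocess
import sys
import threading


def wait_for_thread_ready():
    """Threads: block on a threading.Event instead of sleeping."""
    ready = threading.Event()
    log = []

    def worker():
        log.append("worker initialized")
        ready.set()  # signal the waiting thread

    thread = threading.Thread(target=worker)
    thread.start()
    ready.wait()  # returns as soon as worker() has called ready.set()
    thread.join()
    return log


# Hypothetical child program: write one byte to the inherited pipe when ready.
CHILD_CODE = """
import os, sys
wfd = int(sys.argv[1])
os.write(wfd, b"x")
os.close(wfd)
"""


def wait_for_process_ready():
    """Processes (POSIX only): the child writes a byte, the parent reads it."""
    rfd, wfd = os.pipe()
    try:
        try:
            proc = subprocess.Popen(
                [sys.executable, "-c", CHILD_CODE, str(wfd)],
                pass_fds=[wfd],
            )
        finally:
            os.close(wfd)  # the parent keeps only the read end
        with proc:  # waits for the child when the block exits
            return os.read(rfd, 1)  # blocks until the byte (or EOF) arrives
    finally:
        os.close(rfd)
```

Because the parent closes its copy of the write end, os.read() returns an empty bytes object if the child dies before writing, so the parent does not block forever.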
Don’t limit the maximum duration¶
Don’t make a test fail if it takes longer than a specified number of seconds. Example:
t1 = time.monotonic()
func()
t2 = time.monotonic()
self.assertLess(t2 - t1, 60.0) # cannot happen
Python has very slow buildbot workers where “cannot happen” does happen. In most cases, exceeding the maximum duration is not a bug in Python, so the test must not fail.
For example, test_time had a test to ensure that time.sleep(0.5) takes less than 0.7 seconds. The test started to fail on slow buildbots where it took 0.8 seconds, so the maximum was extended to 1 second. The test was later modified to no longer check the maximum duration.
Another example, a sleep of 100 ms took 2 seconds on “AMD64 OpenIndiana 3.x” buildbot: https://bugs.python.org/issue20336
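One way to keep such a timing test useful is to check only the minimum duration. The sketch below is an illustration, not the actual test_time code: the test name and the tolerance for coarse clock resolution are assumptions.

```python
import time
import unittest


class SleepTests(unittest.TestCase):
    def test_sleep_minimum_duration(self):
        duration = 0.05
        # Hypothetical tolerance for clocks with a coarse resolution.
        tolerance = 0.02
        t1 = time.monotonic()
        time.sleep(duration)
        dt = time.monotonic() - t1
        # Only check the lower bound: a slow or loaded machine can make
        # the sleep arbitrarily longer, and that is not a bug.
        self.assertGreaterEqual(dt, duration - tolerance)
```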
Debug race conditions¶
Debug test relying on time.sleep() or asyncio.sleep()¶
For example, test_asyncio: test_run_coroutine_threadsafe_with_timeout() has a race condition caused by the await asyncio.sleep(0.05) used in the test.
To reproduce the race condition, just use the smallest possible sleep of 1 nanosecond:
diff --git a/Lib/test/test_asyncio/test_tasks.py b/Lib/test/test_asyncio/test_tasks.py
index dde84b84b1..c94113712a 100644
--- a/Lib/test/test_asyncio/test_tasks.py
+++ b/Lib/test/test_asyncio/test_tasks.py
@@ -3160,7 +3160,7 @@ class RunCoroutineThreadsafeTests(test_utils.TestCase):
     async def add(self, a, b, fail=False, cancel=False):
         """Wait 0.05 second and return a + b."""
-        await asyncio.sleep(0.05)
+        await asyncio.sleep(1e-9)
         if fail:
             raise RuntimeError("Fail!")
         if cancel:
And run the test in a loop until it fails:
./python -m test test_asyncio -m test_run_coroutine_threadsafe_with_timeout -v -F
Debug Dangling process¶
For example, debug test_multiprocessing_spawn which logs:
Warning -- Dangling processes: {<SpawnProcess(QueueManager-1576, stopped)>}
https://bugs.python.org/issue38447
Get cases:
./python -m test test_multiprocessing_spawn --list-cases > cases
Bisect:
./python -m test.bisect_cmd -i cases -o bisect1 -n 5 -N 500 test_multiprocessing_spawn -R 3:3 --fail-env-changed
Debug reap_children() warning¶
For example, test_concurrent_futures logs a warning like this:
0:27:13 load avg: 4.88 [416/419/1] test_concurrent_futures failed (env changed) (17 min 11 sec) -- running: test_capi (7 min 28 sec), test_gdb (8 min 49 sec), test_asyncio (23 min 23 sec)
beginning 6 repetitions
123456
.Warning -- reap_children() reaped child process 26487
.....
Warning -- multiprocessing.process._dangling was modified by test_concurrent_futures
Before: set()
After: {<weakref at 0x7fdc08f44e30; to 'SpawnProcess' at 0x7fdc0a467c30>}
https://bugs.python.org/issue38448
Run the test in a loop until it fails:
./python -m test test_concurrent_futures --fail-env-changed -F
If it’s not enough, spawn more jobs in parallel, example with 10 processes:
./python -m test test_concurrent_futures --fail-env-changed -F -j10
If it’s not enough, use the previous commands, but also inject some workload. For example, run in a different terminal:
./python -m test -u all -r -F -j4
To detect more issues, hack reap_children() to sleep 100 ms before calling waitpid(WNOHANG):
diff --git a/Lib/test/support/__init__.py b/Lib/test/support/__init__.py
index 0f294c5b0f..d938ae6b16 100644
--- a/Lib/test/support/__init__.py
+++ b/Lib/test/support/__init__.py
@@ -2320,6 +2320,8 @@ def reap_children():
     if not (hasattr(os, 'waitpid') and hasattr(os, 'WNOHANG')):
         return
+    time.sleep(0.1)
+
     # Reap all our dead child processes so we don't leave zombies around.
     # These hog resources and might be causing some of the buildbots to die.
     while True:
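The reason the extra sleep helps can be seen in a simplified reaping loop. The sketch below is not the code of support.reap_children(): the reap_dead_children() name and its delay parameter are hypothetical. os.waitpid(-1, os.WNOHANG) returns a pid of 0 when children exist but none has exited yet, so a leaked child that is still shutting down goes unnoticed unless it gets a little more time.

```python
import os
import time


def reap_dead_children(delay=0.0):
    """Return the PIDs of already-exited child processes, without blocking."""
    if not (hasattr(os, "waitpid") and hasattr(os, "WNOHANG")):
        return []
    if delay:
        # Give leaked children more time to exit so that they are reported.
        time.sleep(delay)
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except OSError:
            break  # no child process at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```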
An untested function that might help by counting the child processes of a process on Linux: Add support.get_child_processes().
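As an illustration of the idea only (this is not the proposed support.get_child_processes() implementation), child processes can be listed on Linux by scanning /proc for processes whose parent PID matches. The get_child_pids() name is hypothetical and the sketch assumes a mounted /proc filesystem.

```python
import os


def get_child_pids(parent_pid):
    """Return the sorted PIDs of the direct children of parent_pid (Linux only)."""
    children = []
    for name in os.listdir("/proc"):
        if not name.isdigit():
            continue
        try:
            with open(f"/proc/{name}/stat", "rb") as f:
                stat = f.read()
        except OSError:
            continue  # the process exited while /proc was being scanned
        # The command name is enclosed in parentheses and may itself contain
        # spaces or parentheses, so split only after the last ')'.
        fields = stat[stat.rfind(b")") + 2:].split()
        # fields[0] is the process state, fields[1] is the parent PID.
        if int(fields[1]) == parent_pid:
            children.append(int(name))
    return sorted(children)
```

A test could call len(get_child_pids(os.getpid())) before and after running to detect processes that were left behind.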
Coredump in multiprocessing¶
FreeBSD buildbot workers were useful to detect crashes at Python exit, bugs
related to dangling threads. It helps to add a random sleep at Python exit, in
Modules/main.c.
Multiprocessing issues¶
Open¶
2018-07-20: multiprocessing.Pool and ThreadPool leak resources after being deleted
2017-07-19: Missing multiprocessing.queues.SimpleQueue.close() method (OPEN).
Fixed, rejected, out of date¶
2018-12-05, multiprocessing: test_multiprocessing_fork: test_del_pool() leaks dangling threads and processes on AMD64 FreeBSD CURRENT Shared 3.x
2018-07-18: test_multiprocessing_spawn: Dangling processes leaked on AMD64 FreeBSD 10.x Shared 3.x
2018-07-03: asyncio: BaseEventLoop.close() shutdowns the executor without waiting causing leak of dangling threads (FIXED in Python 3.9).
2018-05-28, test_multiprocessing: test_multiprocessing_fork: dangling threads warning (commit: call Pool.join)
2017-07-28: test_multiprocessing_spawn and test_multiprocessing_forkserver leak dangling processes (commit: remove Process.daemon=True, call Process.join)
2017-07-24, multiprocessing: multiprocessing.Pool should join “dead” processes (commit)
2017-07-09, multiprocessing: multiprocessing.Queue.join_thread() does nothing if created and use in the same process (commit)
2017-06-08, multiprocessing: Add close() to multiprocessing.Process
2017-05-03: Emit a ResourceWarning in concurrent.futures executor destructors (OUT OF DATE).
2017-04-26: Emit ResourceWarning in multiprocessing Queue destructor (REJECTED).
2016-04-15, multiprocessing: test_multiprocessing_spawn leaves processes running in background. Add more checks to _test_multiprocessing to detect dangling processes and threads.
2015-11-18, multiprocessing: test_multiprocessing_spawn ResourceWarning with -Werror (commit: use closefd=False)
2011-08-18: Warning – multiprocessing.process._dangling was modified by test_multiprocessing (commit: test_multiprocessing.py calls the terminate() method of all classes).
Python issues¶
Open issues¶
Search for test_asyncio, multiprocessing tests.
2019-06-11: test__xxsubinterpreters fails randomly
Fixed issues¶
2018-05-16, socketserver: socketserver: Add an opt-in option to get Python 3.6 behavior on server_close()
2017-08-18, support: Make support.threading_cleanup() stricter (big issue with many fixes)
2017-08-18, test_logging: test_logging: ResourceWarning: unclosed socket
2017-08-18, socketserver: socketserver.ThreadingMixIn leaks running threads after server_close()
2017-08-09, socketserver: socketserver.ForkingMixIn.server_close() leaks zombie processes
Rejected, Not a Bug, Out of Date¶
Windows handles¶
Abandoned attempt to hunt for leaks of Windows handles:
Unlimited recursion¶
Some specific unit tests rely on the exact C stack size and how Python detects stack overflow. These tests are fragile because each platform uses a different stack size and behaves differently on stack overflow. For example, the stack size can depend on whether Python is compiled using PGO or not (it depends on function inlining).
The support.infinite_recursion() context manager reduces the risk of stack
overflow. Example of tests using it:
test_ast
test_exceptions
test_isinstance
test_json
test_pickle
test_traceback
test_tomllib: issue gh-108851
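The general idea behind such a context manager can be sketched as follows. This is a self-contained illustration and not the test.support code: the lowered_recursion_limit() name and the limit of 250 are assumptions. It temporarily lowers the Python recursion limit so that runaway recursion raises RecursionError long before the C stack is exhausted.

```python
import contextlib
import sys


@contextlib.contextmanager
def lowered_recursion_limit(limit=250):
    """Temporarily lower the recursion limit, then restore the original one."""
    original_limit = sys.getrecursionlimit()
    try:
        sys.setrecursionlimit(limit)
        yield
    finally:
        sys.setrecursionlimit(original_limit)


def recurse():
    """Recurse forever; only the recursion limit stops it."""
    return recurse()


def hits_recursion_error():
    """Return True if recurse() is stopped by RecursionError."""
    with lowered_recursion_limit():
        try:
            recurse()
        except RecursionError:
            return True
    return False
```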
_Py_CheckRecursiveCall() is a portable but not reliable test: basic counter
using sys.getrecursionlimit().
MSVC allows implementing PyOS_CheckStack() (USE_STACKCHECK macro is
defined) using alloca() and catching the STATUS_STACK_OVERFLOW error.
It uses _resetstkoflw() to reset the stack overflow flag.
See also the Py_C_RECURSION_LIMIT constant.
WASI explicitly sets the stack memory in configure.ac:
dnl gh-117645: Set the memory size to 20 MiB, the stack size to 8 MiB,
dnl and move the stack first.
dnl https://github.com/WebAssembly/wasi-libc/issues/233
AS_VAR_APPEND([LDFLAGS_NODIST], [" -z stack-size=8388608 -Wl,--stack-first -Wl,--initial-memory=20971520"])
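The byte values passed to the linker match the sizes stated in the comment, which a short calculation confirms. The variable names below are only for illustration.

```python
MIB = 1024 * 1024

stack_size = 8 * MIB       # value of -z stack-size
initial_memory = 20 * MIB  # value of --initial-memory

print(stack_size)      # 8388608
print(initial_memory)  # 20971520
```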
Tests¶
test_pickle: test_bad_getattr()
test_marshal: test_recursion_limit()
History¶
2019-04-29: macOS no longer specifies the stack size. Previously, it was set to 8 MiB (-Wl,-stack_size,1000000).
2018-07-05: test_marshal: “Improve tests for the stack overflow in marshal.loads()”
2018-06-04: test_marshal: “Reduces maximum marshal recursion depth on release builds” on Windows
2014-11-01: MAX_MARSHAL_STACK_DEPTH set to 1000 instead of 1500 on Windows
2013-07-07: Visual Studio project (PCbuild) now uses 4.2 MiB stack, instead of 2 MiB
2013-05-30: macOS sets the stack size to 8 MiB
2007-08-29: test_marshal: MAX_MARSHAL_STACK_DEPTH set to 1500 instead of 2000 on Windows for debug build
Notes¶
On FreeBSD, the sudo sysctl -w 'kern.corefile=%N.%P.core' command can be used
to include the pid in coredump filenames, since 2 processes can crash at the
same time.