Python Continuous Integration: Travis CI, AppVeyor, VSTS, Buildbots

The Night’s Watch is Fixing the CIs in the Darkness for You

Buildbot Watch: The watchers of CPython’s Continuous Integration

Buildbot Watch


  • Pablo Galindo Salgado

  • Victor Stinner

The Night’s Watch is Fixing the CIs in the Darkness for You by Pablo Galindo Salgado.

Python Test Suite

The Python test suite lives in the Lib/test/ directory.

Python uses the libregrtest test runner, which is based on unittest and adds extra features:

  • -R 3:3 option to check for reference and memory leaks (it does not detect PyMem_RawMalloc leaks, only PyMem_Malloc and PyObject_Malloc leaks)

  • -u resources to enable extra tests. Examples:

    • -u cpu enables tests which use more CPU

    • -u largefile enables tests which use a lot of disk space

  • -M 4G enables tests which use a lot of memory and specifies that Python tests are allowed to allocate up to 4 GiB.

  • -w to re-run failed tests in verbose mode

  • -m, --fromfile, --matchfile to select tests
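The options above can be combined on one command line. The sketch below assembles such a command as an argument list; the helper name build_regrtest_command and its parameters are hypothetical, and only the flags themselves come from regrtest. Note that the memory limit option is spelled -M (uppercase), while -m (lowercase) selects tests by pattern.

```python
import sys


def build_regrtest_command(tests=(), *, leaks=False, resources=(),
                           memlimit=None, rerun=False, match=None):
    """Return an argv list that runs the test suite through 'python -m test'.

    Hypothetical helper: it only assembles arguments, it runs nothing.
    """
    cmd = [sys.executable, "-m", "test"]
    if leaks:
        # -R 3:3 checks for reference and memory leaks.
        cmd += ["-R", "3:3"]
    if resources:
        # -u takes a comma-separated list of resources, e.g. "cpu,largefile".
        cmd += ["-u", ",".join(resources)]
    if memlimit:
        # -M (uppercase) sets how much memory the tests may allocate.
        cmd += ["-M", memlimit]
    if rerun:
        # -w re-runs failed tests in verbose mode.
        cmd.append("-w")
    if match:
        # -m (lowercase) selects tests whose name matches a pattern.
        cmd += ["-m", match]
    cmd.extend(tests)
    return cmd


if __name__ == "__main__":
    print(build_regrtest_command(["test_os"], resources=["cpu"], rerun=True))
```

The resulting list can be passed to subprocess.run() from a CPython checkout; keeping it as a list avoids shell quoting issues.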

See devguide documentation:

Revert on fail


CPython buildbots:

GitHub Actions

Azure Pipelines PR

Buildbots notifications

Address Sanitizer (ASAN)

  • Buildbots set the ASAN_OPTIONS environment variable to detect_leaks=0:allocator_may_return_null=1:handle_segv=0; see the master/custom/ config.



  • Skipped tests:

    • test___all__

    • test_capi

    • test_concurrent_futures

    • test_crypt

    • test_ctypes

    • test_decimal

    • test_faulthandler

    • test_hashlib

    • test_idle

    • test_interpreters

    • test_multiprocessing_fork

    • test_multiprocessing_forkserver

    • test_multiprocessing_spawn

    • test_peg_generator

    • test_tix

    • test_tk

    • test_tools

    • test_ttk_guionly

    • test_ttk_textonly

  • C macros:




  • test_decimal and test_io detect sanitizer builds using:

        '-fsanitize=memory' in _cflags or
        '--with-memory-sanitizer' in _config_args or
        '-fsanitize=address' in _cflags

  • bpo-45200: Address Sanitizer: libasan dead lock in pthread_create() (test_multiprocessing_fork.test_get() hangs)

  • bpo-42985: AMD64 Arch Linux Asan 3.x fails: command timed out: 1200 seconds without output
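To reproduce the ASAN setup described above outside of a buildbot, something like the following sketch may help. It sets ASAN_OPTIONS to the value quoted above and checks the build flags for sanitizers. The function names are hypothetical, and reading CFLAGS and CONFIG_ARGS through sysconfig is an assumption about where _cflags and _config_args come from.

```python
import os
import sysconfig

# Value the buildbots use, as quoted in the list above.
ASAN_OPTIONS = "detect_leaks=0:allocator_may_return_null=1:handle_segv=0"


def parse_sanitizer_options(value):
    """Split a colon-separated 'key=value' option string into a dict."""
    return dict(item.split("=", 1) for item in value.split(":") if item)


def asan_environment(base_env=None):
    """Return a copy of the environment with ASAN_OPTIONS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["ASAN_OPTIONS"] = ASAN_OPTIONS
    return env


def detect_sanitizers(cflags=None, config_args=None):
    """Return (memory_sanitizer, address_sanitizer) for a build.

    Assumption: the flags are read from sysconfig when not given.
    """
    if cflags is None:
        cflags = sysconfig.get_config_var("CFLAGS") or ""
    if config_args is None:
        config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    memory = ("-fsanitize=memory" in cflags
              or "--with-memory-sanitizer" in config_args)
    address = "-fsanitize=address" in cflags
    return memory, address


if __name__ == "__main__":
    print(parse_sanitizer_options(ASAN_OPTIONS))
    print(detect_sanitizers())
```

The dictionary returned by asan_environment() can be passed as env= to subprocess.run() when launching the test suite on an ASAN build.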


How to watch buildbots?

Email: [Python-Dev] How to watch buildbots?

Report a failure

When a buildbot fails, I look at the test logs and check whether an issue has already been reported. For example, search for the test method in the title (ex: “test_complex” for the test_complex() method). If there is no result, search using the test filename (ex: “test_os” for Lib/test/). If there is still no result, repeat with full text searches (“All Text”). If you cannot find any open bug, create a new one:

  • The title should contain the test name, the test method and the buildbot name. Example: “test_posix: TestPosixSpawn fails on PPC64 Fedora 3.x”.

  • The description should contain the link to the buildbot failure. Try to identify useful parts of tests log and copy them in the description.

  • Fill in the Python version field (ex: “3.8” for 3.x buildbots).

  • Select at least the “Tests” Component. You may select additional Components depending on the bug.
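As a rough illustration of the search order and the title convention described above, here is a small sketch. Both helpers are hypothetical and do not talk to any bug tracker; they only build the strings you would type.

```python
def search_terms(test_id):
    """Return search terms in the suggested order: test method, then test file.

    test_id is a dotted test identifier such as
    "test.test_os.FileTests.test_access" (format assumed for this sketch).
    """
    parts = test_id.split(".")
    method = parts[-1]
    # The test module is the first component whose name starts with "test_".
    module = next((p for p in parts if p.startswith("test_")), None)
    terms = [method]
    if module and module != method:
        terms.append(module)
    return terms


def issue_title(test_name, test_method, buildbot):
    """Build a title containing the test name, test method and buildbot name."""
    return f"{test_name}: {test_method} fails on {buildbot}"


if __name__ == "__main__":
    print(search_terms("test.test_os.FileTests.test_access"))
    print(issue_title("test_posix", "TestPosixSpawn", "PPC64 Fedora 3.x"))
```

Searching for the most specific term first keeps the result list short; the filename is only a fallback.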

If a bug is already open, you may add a comment to mention that there is a new failure: include at least the buildbot name and a link to the failure.

And that’s all! Simple, isn’t it? At this stage, there is no need to investigate the test failure.

To finish, reply to the failure notification on the mailing list with a very short email: add a link to the existing or the freshly created issue, maybe copy one line of the failure and/or the issue title.

Bug example: issue33630.

Analyze a failure

Later, you may want to analyze these failures, but I consider that it’s a different job (different “maintenance task”). If you don’t feel able to analyze the bug, you may try to find someone who knows more than you about the failure.

For better bug reports, you can look at the [Changes] tab of a build failure and try to identify which recent change introduced the regression. This task requires following recent commits, because sometimes the failure is old and the test simply fails randomly depending on network issues, system load, or anything else. Sometimes, previous tests have side effects. Or the buildbot owner made a change on the system. There are many different explanations, and it’s hard to write a complete list. It’s really on a case by case basis.

Hopefully, it’s now more common that a buildbot failure is obvious and caused by a very specific recent change which can be found in the [Changes] tab.
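One way to separate a fresh regression from a random failure is to look at the recent build history of the same buildbot. The sketch below assumes you have collected the results by hand as (build number, passed) pairs; this data format and both function names are hypothetical and are not part of any buildbot API.

```python
def first_failure_of_current_streak(history):
    """Return the build number where the current run of failures began.

    history is a list of (build_number, passed) pairs sorted by build number.
    Returns None if the most recent build passed or the history is empty.
    """
    streak_start = None
    for number, passed in history:
        if passed:
            streak_start = None
        elif streak_start is None:
            streak_start = number
    return streak_start


def looks_intermittent(history):
    """Return True if a build passed after an earlier build had failed.

    A pass following a failure hints at a random failure rather than a
    clean regression, although a fix landing in between looks the same.
    """
    seen_failure = False
    for _, passed in history:
        if not passed:
            seen_failure = True
        elif seen_failure:
            return True
    return False


if __name__ == "__main__":
    # Made-up history, for illustration only.
    history = [(101, True), (102, False), (103, True), (104, False), (105, False)]
    print(first_failure_of_current_streak(history), looks_intermittent(history))
```

If the history does not look intermittent, the [Changes] tab of the build returned by first_failure_of_current_streak() is a reasonable place to start looking.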

OLD: AppVeyor

It is no longer used by Python.

OLD: Travis CI

Travis CI was removed from Python in December 2021 (commit).