Misc notes

datetime strptime

Documentation: https://docs.python.org/dev/library/datetime.html#strftime-and-strptime-behavior

Code:

date = datetime.datetime.strptime(date, '%Y-%m-%d %H:%M:%S %z')

year-month-day hour:minute:second timezone:

%Y-%m-%d %H:%M:%S %z
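A minimal, self-contained sketch of the call above. The sample timestamp string is an assumption chosen for illustration, not a value from these notes.

```python
import datetime

# Hypothetical input: year-month-day hour:minute:second timezone
text = "2024-03-15 14:30:00 +0100"

# %z parses a UTC offset such as +0100 and yields a timezone-aware datetime
parsed = datetime.datetime.strptime(text, "%Y-%m-%d %H:%M:%S %z")

print(parsed.isoformat())
print(parsed.utcoffset())
```

Because the format contains %z, the result carries a tzinfo and can be compared safely with other timezone-aware datetimes.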

Python builtin types

Builtin scalar types:

  • bool, int, float, complex, bytes, str

Builtin container types:

  • Sequence: tuple, list

  • Mapping: dict, frozendict

  • Set: frozenset, set

Singletons:

  • None, Ellipsis (...), False (bool(0)), True (bool(1))

  • Small integers: [-5; 1024], or [-5; 256] on Python 3.14 and older

  • Empty bytes and Unicode strings: b'' and ''

  • Latin-1 single letter, examples: '\0', 'a', '\xe9'

  • Empty tuple

  • Empty frozenset: frozenset() (singleton removed in Python 3.10)
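The singletons listed above can be probed with identity checks. This is only a sketch: the helper name same_object is hypothetical, and only the first group is guaranteed by the language. The second group reflects CPython implementation details that may differ between versions, so its results are printed rather than relied upon.

```python
def same_object(factory):
    """Return True if two calls to factory() yield the very same object."""
    return factory() is factory()

# Guaranteed by the language: there is exactly one of each of these objects.
guaranteed = {
    "None": same_object(type(None)),
    "Ellipsis": same_object(type(...)),
    "False": same_object(lambda: bool(0)),
    "True": same_object(lambda: bool(1)),
}

# CPython implementation details: may vary with the interpreter and version.
cpython_details = {
    "small int": same_object(lambda: int("100")),
    "empty tuple": same_object(tuple),
    "empty bytes": same_object(bytes),
    "empty str": same_object(str),
}

for name, shared in {**guaranteed, **cpython_details}.items():
    print(f"{name}: {'shared' if shared else 'distinct objects'}")
```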

Python developer mode

=> implemented in Python 3.7 as: “python3 -X dev” or PYTHONDEVMODE=1!

https://mail.python.org/pipermail/python-ideas/2016-March/039314.html

Strict developer mode:

PYTHONMALLOC=debug python3.6 -Werror -bb -X faulthandler script.py

Developer mode:

PYTHONMALLOC=debug python3.6 -Wd -b -X faulthandler script.py

  • Show DeprecationWarning and ResourceWarning warnings: python -Wd

  • Show BytesWarning warning: python -b

  • Enable faulthandler to get a Python traceback on segfault and fatal errors: python -X faulthandler

  • Debug hooks on Python memory allocators: PYTHONMALLOC=debug

  • Enable Python assertions (assert) and set __debug__ to True: remove (or just ignore) -O or -OO command line arguments

See also PYTHONASYNCIODEBUG=1 for asyncio.
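To see what "-X dev" switches on, one option is to start a child interpreter and ask it. The sketch below assumes Python 3.7 or newer; the names CHILD and run_child are hypothetical helpers, not part of any API.

```python
import subprocess
import sys

# Child program: report whether developer mode and faulthandler are active.
CHILD = (
    "import faulthandler, sys; "
    "print(sys.flags.dev_mode, faulthandler.is_enabled())"
)

def run_child(*options):
    """Run CHILD in a fresh interpreter with the given command line options."""
    cmd = [sys.executable, *options, "-c", CHILD]
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return proc.stdout.split()

dev = run_child("-X", "dev")
print("with -X dev:", dev)
```

Setting PYTHONDEVMODE=1 in the environment of the child process should give the same result as the command line option.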

pyupgrade

https://github.com/asottile/pyupgrade

AST changes

Things to do in Python 3.10

Debug ref leak: play with GC

gc.get_referrers().

Common referrers of a type:

  • globals(): current namespace

  • module.__dict__

  • type.__dict__

  • type methods: C function with a reference to the type instance

  • type.__mro__

Example with _multibytecodec.MultibyteCodec type:

import _codecs_jp
codec = _codecs_jp.getcodec('cp932')
t = type(codec)

>>> import gc
>>> refs=gc.get_referrers(t)
>>> refs[0]
{'__name__': '__main__', '__doc__': None, ...
>>> sorted(refs[0].keys())
['__builtins__', ..., '_codecs_jp', 'codec', 'encodings', 'gc', 'refs', 't']

# "t" variable sounds like the current namespace
>>> refs[0] is globals()
True

>>> refs[1]
<module '_multibytecodec' from '/home/vstinner/python/master/build/lib.linux-x86_64-3.10-pydebug/_multibytecodec.cpython-310d-x86_64-linux-gnu.so'>
>>> import _multibytecodec
>>> refs[1] is _multibytecodec
True

>>> refs[2]
<method 'encode' of '_multibytecodec.MultibyteCodec' objects>
>>> refs[2] is t.encode
True

>>> refs[3]
<method 'decode' of '_multibytecodec.MultibyteCodec' objects>
>>> refs[3] is t.decode
True

>>> refs[4]
(<class '_multibytecodec.MultibyteCodec'>, <class 'object'>)
>>> refs[4] is t.__mro__
True

>>> refs[5]
<slot wrapper '__getattribute__' of '_multibytecodec.MultibyteCodec' objects>
>>> refs[5] is t.__getattribute__
True

There are 6 objects which have a reference to t:

  • globals() namespace

  • _multibytecodec: in the module state (see the C implementation)

  • t.encode, t.decode and t.__getattribute__ methods

  • t.__mro__ tuple
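The same investigation can be tried on a pure Python class. This is a sketch under assumptions: Widget, namespace and classify are hypothetical names, and the exact set of referrers depends on the Python version and on how the code is run.

```python
import gc

class Widget:
    """Hypothetical type used only for this illustration."""

# Hold the type in an explicit dict so that one referrer is easy to identify.
namespace = {"Widget": Widget}

def classify(ref):
    """Give a short label to one referrer of Widget."""
    if ref is namespace:
        return "namespace dict"
    if ref is Widget.__mro__:
        return "__mro__ tuple"
    if isinstance(ref, dict):
        return "other dict (for example a module namespace)"
    return type(ref).__name__

# Deduplicate the labels: several referrers can share the same type name.
labels = sorted({classify(ref) for ref in gc.get_referrers(Widget)})
for label in labels:
    print(label)
```

Comparing each referrer by identity against known candidates (a namespace, the __mro__ tuple, methods) is the same approach as the interactive session above.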

Bug only reproduced on Windows

Vectorcall

  • To implement a vectorcall function:

    • nargs = PyVectorcall_NARGS(nargs) gives you the number of positional arguments.

    • Positional arguments for function myfunc:

      if (!_PyArg_CheckPositional("myfunc", nargs, 0, 0)) {
          return NULL;
      }
      
    • Keyword arguments:

      Py_ssize_t nkwargs = 0;
      if (kwnames != NULL) {
          nkwargs = PyTuple_GET_SIZE(kwnames);
      }
      
    • No keyword arguments for function myfunc:

      if (!_PyArg_NoKwnames("myfunc", kwnames)) {
          return NULL;
      }
      
  • The PY_VECTORCALL_ARGUMENTS_OFFSET flag can be set in nargs (bitwise OR); PyVectorcall_NARGS() strips it. When it is set, the callee may temporarily modify args[-1], which a bound method uses to store self.

  • Optimize calling an object: result = obj()

    • Add vectorcallfunc vectorcall to the PyCFunctionObject structure

    • Add tp_vectorcall_offset = offsetof(PyCFunctionObject, vectorcall) to the type

    • Add Py_TPFLAGS_HAVE_VECTORCALL to the type tp_flags

  • Optimize the type creation: obj = MyType()

    • Add .tp_vectorcall = (vectorcallfunc)enumerate_vectorcall to the type

    • Keep .tp_new = enum_new in the type

  • PEP 590 – Vectorcall: a fast calling protocol for CPython

  • METH_FASTCALL calling convention

  • Get vectorcall of a callable object: PyVectorcall_Function(). Return NULL if the type has no Py_TPFLAGS_HAVE_VECTORCALL flag.

  • Call a function in C:

    • PyObject_Vectorcall()

    • PyObject_VectorcallDict()

    • _PyObject_FastCall()

    • PyObject_VectorcallMethod()

Build Python in 32-bit on Fedora

Install Python dependencies in 32-bit:

sudo dnf install glibc-devel.i686 libatomic.i686 bzip2-devel.i686 libffi-devel.i686 libuuid-devel.i686 ncurses-devel.i686 openssl-devel.i686 readline-devel.i686 xz-devel.i686 zlib-ng-compat-devel.i686

Build Python:

./configure CFLAGS="-m32" LDFLAGS="-m32"
sed -i -e 's!#define PY_HAVE_PERF_TRAMPOLINE 1!#undef PY_HAVE_PERF_TRAMPOLINE!g' pyconfig.h
make

I have to disable PY_HAVE_PERF_TRAMPOLINE to work around a build issue.
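Once the build finishes, a quick way to confirm that the resulting interpreter is really 32-bit is to check the pointer size. A small sketch (the helper name pointer_bits is hypothetical):

```python
import struct
import sys

def pointer_bits():
    """Size of a C pointer (void *) in bits for the running interpreter."""
    return struct.calcsize("P") * 8

bits = pointer_bits()
# sys.maxsize is 2**31 - 1 on a 32-bit build and 2**63 - 1 on a 64-bit build
print(f"{bits}-bit build, sys.maxsize = {sys.maxsize:#x}")
```

Run it with the freshly built binary, for example ./python check_bits.py, where check_bits.py is an assumed file name.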

C global variables checker

make check-c-globals runs Tools/c-analyzer/check-c-globals.py --format summary --traceback.

New global variables can be ignored by modifying Tools/c-analyzer/cpython/ignored.tsv.

Frozen modules

  • Python/frozen.c

  • Tools/build/freeze_modules.py

  • Makefile.pre.in:

    • make regen-frozen

    • FROZEN_FILES_IN

    • FROZEN_FILES_OUT

Old deepfreeze. No longer used in Python 3.13:

  • PYTHON_FOR_FREEZE

  • DEEPFREEZE_C

  • DEEPFREEZE_DEPS
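At runtime, frozen modules can be spotted through their import spec. The sketch below assumes that a frozen module reports the origin "frozen" in its __spec__; the function name loaded_frozen_modules is hypothetical, and the resulting list depends on the Python version and build options.

```python
import sys

def loaded_frozen_modules():
    """Return the sorted names of already imported modules that are frozen."""
    names = []
    # Copy the items: sys.modules may change while we iterate.
    for name, module in list(sys.modules.items()):
        spec = getattr(module, "__spec__", None)
        if getattr(spec, "origin", None) == "frozen":
            names.append(name)
    return sorted(names)

frozen = loaded_frozen_modules()
print(len(frozen), "frozen modules loaded")
for name in frozen:
    print(" ", name)
```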

Build Python in an Ubuntu container

Create an Ubuntu 22.04 container:

podman run --rm --name ubuntu-dev --hostname ubuntu-dev --interactive --tty ubuntu:22.04

In the container:

apt update && apt install --yes git make clang libssl-dev libreadline-dev
git clone https://github.com/python/cpython --depth=1
cd cpython
./configure --with-pydebug

make check-c-globals

If the test fails, edit Tools/c-analyzer/cpython/ignored.tsv.

See also MAX_SIZES in Tools/c-analyzer/cpython/_parser.py.

Free Threading

Dashboard

Alpine Linux

Install dependencies:

apk add gcc make musl-dev git

Emscripten

Documentation: https://devguide.python.org/getting-started/setup-building/#emscripten

Install Emscripten:

cd ~/dev/
git clone https://github.com/emscripten-core/emsdk
./emsdk/emsdk install 4.0.12
./emsdk/emsdk activate 4.0.12

Build Python (commands based on .github/workflows/reusable-emscripten.yml):

cd ~/python/main

# Configure build Python
python3 Platforms/emscripten configure-build-python -- --config-cache --with-pydebug
# Make build Python
python3 Platforms/emscripten make-build-python

# Make dependencies
python3 Platforms/emscripten make-dependencies
# Configure host Python
source ~/dev/emsdk/emsdk_env.sh
python3 Platforms/emscripten configure-host --host-runner node -- --config-cache
# Make host Python
python3 Platforms/emscripten make-host

Run Python with a script:

python3 Platforms/emscripten run script.py
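As an illustration, script.py could be as small as the sketch below, which reports where it is running. The file content and the helper name describe_platform are assumptions made for this example.

```python
# script.py (hypothetical content)
import platform
import sys

def describe_platform():
    """Collect a few values identifying the running interpreter."""
    return {
        "platform": sys.platform,
        "machine": platform.machine(),
        "version": sys.version.split()[0],
    }

info = describe_platform()
print(info)
# sys.platform is "emscripten" on an Emscripten build of Python
print("running under Emscripten:", info["platform"] == "emscripten")
```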

Run the test suite:

python3 Platforms/emscripten run --test

Free Threading internals

  • Articles:

  • PEPs:

    • PEP 683 “Immortal Objects, Using a Fixed Refcount”

    • PEP 703 “Making the Global Interpreter Lock Optional in CPython” design document

    • PEP 779 “Criteria for supported status for free-threaded Python”

    • PEP 803: “abi3t”: Stable ABI for Free-Threaded Builds

  • Quiescent-State Based Reclamation (QSBR)

  • list uses QSBR with _PyListArray type

    • _PyListArray:

      typedef struct {
          Py_ssize_t allocated;
          PyObject *ob_item[];
      } _PyListArray;
      
    • _Py_CONTAINER_OF() gets _PyListArray from PyListObject.ob_item

  • set uses QSBR: set_table_resize() delays PyMem_Free(oldtable) using _PyMem_FreeDelayed().

  • dict uses QSBR for lock-less lookup

    • PyDict_GetItemRef() calls _Py_dict_lookup_threadsafe()

    • PyDict_SetItem() calls insertdict()

    • PyDict_SetItem() calls insertion_resize() if the dict needs to grow. Use _PyMem_FreeDelayed() on old keys and values if the dict is shared.

    • use QSBR (_PyObject_GC_SET_SHARED())

  • Biased reference counting

    • Biased reference counting (BRC) inter-thread queue: Python/brc.c

    • PyInterpreterState.brc state

  • PyThreadState

    • PyThreadState.state

      • _Py_THREAD_DETACHED

      • _Py_THREAD_ATTACHED

      • _Py_THREAD_SUSPENDED

      • _Py_THREAD_SHUTTING_DOWN

    • _PyThreadStateImpl.refcount

  • PyMutex: use a single byte

  • Critical section

    • Py_BEGIN_CRITICAL_SECTION()/Py_END_CRITICAL_SECTION()

    • Py_BEGIN_CRITICAL_SECTION2()/Py_END_CRITICAL_SECTION2()

    • Use PyThreadState.critical_section tagged pointer

    • _Py_CRITICAL_SECTION_INACTIVE

    • _Py_CRITICAL_SECTION_TWO_MUTEXES

    • _Py_CRITICAL_SECTION_MASK

    • Use PyObject.ob_mutex (PyMutex)

  • Atomic operations: Include/cpython/pyatomic.h (GCC/Clang, MSVC, std)

    • gcc: use GCC built-in functions such as __atomic_load_n()

    • msc: use MSVC intrinsics such as _InterlockedCompareExchange()

    • std: use C11 or C++11 atomics such as atomic_load()

  • Ref counting

    • PyUnstable_Object_IsUniquelyReferenced()

    • PyUnstable_Object_IsUniqueReferencedTemporary()

    • Py_SET_REFCNT()

    • _Py_TryXGetRef()

  • Stop the world

    • _PyEval_StopTheWorldAll(), _PyEval_StartTheWorldAll()

    • _PyEval_StopTheWorld(), _PyEval_StartTheWorld()

    • Examples of operations which have to stop/start the world:

      • PyRefTracer_SetTracer()

      • set __bases__ or __abstractmethods__ of a type

      • set __class__ of an object

      • faulthandler.dump_traceback()

      • gc.get_referents()

      • PyOS_BeforeFork()

      • _Py_ClearUnusedTLBC()

      • _PyFunction_ClearVersion()

  • mimalloc black magic

    • _PyThreadStateImpl.mimalloc state

    • mi_heap_visit_blocks(): visit all blocks in a heap

    • _mi_abandoned_pool_visit_blocks(): visit blocks in the per-interpreter abandoned pool

    • Objects/mimalloc/: C code

    • Include/internal/mimalloc/ header files

  • Thread-local bytecode (TLBC)

  • Reenable the GIL if a C extension is not compatible with Free Threading

    • _PyEval_IsGILEnabled()

    • _PyEval_EnableGILTransient()

    • _PyEval_EnableGILPermanent()

    • _PyEval_DisableGIL()

  • Garbage collector

    • Python/gc_free_threading.c

    • GC bits

      • _PyGC_BITS_TRACKED

      • _PyGC_BITS_FINALIZED

      • _PyGC_BITS_UNREACHABLE

      • _PyGC_BITS_FROZEN

      • _PyGC_BITS_SHARED

      • _PyGC_BITS_ALIVE

      • _PyGC_BITS_DEFERRED

  • Python Thread state

    • _PyThreadState_GET()

    • _Py_thread_local PyThreadState *_Py_tss_tstate

    • _Py_thread_local: thread_local, _Thread_local (C11), __declspec(thread), __thread (GCC).

    • _PyThreadState_GET() is implemented as a read of the _Py_thread_local variable in the common case.

    • Function call for stdlib extension modules built with Py_BUILD_CORE_MODULE

    • Free lists.

      • _Py_freelists_GET() gets from tstate with Free Threading

      • _Py_freelists_GET() gets from interp otherwise

  • PyInterpreterState

    • atexit: add ll_callbacks_lock

    • GC: add freeze_active

    • _import_state: add lazy_mutex

    • codecs_state: add search_path_mutex

    • _py_func_state: add mutex

    • type_cache_entry: add sequence

    • _Py_interp_cached_objects: add interned_mutex and descriptor_mutex

    • add mimalloc

    • add brc

    • add unique_ids

    • add weakref_locks

    • add tlbc_indices

    • (Py_STATS) add pystats_mutex

  • _PyThreadStateImpl

    • add c_stack_refs

    • add gc

    • add mimalloc

    • add freelists

    • add brc

    • add refcounts

    • add tlbc_index

    • add suppress_co_const_immortalization

    • add (Py_STATS) pystats_struct

    • add __padding

  • Python C API

  • Limited C API

    • WIP in Python 3.15

    • PEP 803: “abi3t”: Stable ABI for Free-Threaded Builds

    • Opaque PyObject

    • Early work to abstract access to PyObject: Py_REFCNT(), Py_SET_REFCNT(), Py_TYPE(), Py_SET_TYPE().

    • abi3t ABI tag

  • Gilectomy

    • Issues with obmalloc

    • GC challenge

    • _PyThreadState_GET(): need for Thread Local Storage (TLS)
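From Python code, the build type and the current GIL state can be queried as sketched below. It assumes that sys._is_gil_enabled() is only available on Python 3.13 and newer, and falls back to "GIL enabled" elsewhere; the function name free_threading_status is hypothetical.

```python
import sys
import sysconfig

def free_threading_status():
    """Return (free_threaded_build, gil_enabled) for the running interpreter."""
    # Py_GIL_DISABLED is 1 in the build configuration of a free-threaded build
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # Older versions lack sys._is_gil_enabled(): assume the GIL is enabled
    is_gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)
    return free_threaded_build, bool(is_gil_enabled())

build, gil = free_threading_status()
print("free-threaded build:", build)
print("GIL enabled:", gil)
```

On a free-threaded build the second value can still be True, for example when a C extension that is not compatible with Free Threading caused the GIL to be reenabled.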

PyPy project status (December 2025)

Matti Picus wrote in the numpy bug tracker:

PyPy is no longer under active development, and has not released a Python3.12 version.

Rebuild configure

On FreeBSD, install dependencies:

sudo pkg install autoconf automake autoconf-archive libtool pkgconf

Run:

autoreconf -ivf -Werror