Python’s Global Interpreter Lock (GIL) is the mechanism that ensures only one thread is executing (C)Python bytecode at any one time. While it appears to be a major limitation, its practical effect is often small if the following steps are followed.

What do I need to do about the GIL?

  0. Do not panic! Python is routinely used in efficient, scalable programs.

  1. Whenever possible, use existing libraries (numpy, tensorflow, dask, etc.) that have already thought about all this for you.

  2. If you are writing a Python extension, and a function in it may spend a long (continuous) period not executing Python instructions, then release the GIL. A few milliseconds probably counts as “long”; 15 milliseconds is definitely “long” (see below).

    If you do release the GIL, you must re-acquire it before calls into the Python C API.

  3. If you need to improve the performance of a Python program that is executing a lot of Python instructions (as opposed to, e.g., doing numpy computations or waiting for I/O), then use multiprocessing.
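
As a rough illustration of step 3, here is a minimal sketch using the standard-library concurrent.futures.ProcessPoolExecutor. The function count_primes and the chosen limits are hypothetical stand-ins for whatever pure-Python, CPU-bound work your program does; the point is only that each worker process has its own interpreter and therefore its own GIL.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """Hypothetical CPU-bound, pure-Python task: count primes below `limit`."""
    count = 0
    for n in range(2, limit):
        # Trial division up to the integer square root of n.
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [20_000, 20_000, 20_000, 20_000]  # illustrative workload
    # Each task runs in a separate process, so the tasks do not
    # compete with each other for a single GIL.
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(count_primes, limits))
    print(results)
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the main module. Arguments and results are pickled between processes, so this approach pays off when each task does substantial work relative to the data it exchanges.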

What do I need to know about the GIL?

  1. Holding the GIL is not a guarantee that thread-unsafe C libraries will be safe to use (for example, a thread that does not need the GIL can continue to access these libraries). See also below.

  2. A thread holding the GIL may be interrupted any time a Python interpreter instruction is executed in that thread, in which case its execution is suspended and it temporarily loses the GIL.

    Hence the GIL is never a substitute for any type of locking at the Python level.
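
To make the last point concrete, here is a minimal sketch of explicit locking at the Python level using threading.Lock. The Counter class and the run helper are made up for this illustration. The increment is a read-modify-write sequence spanning several interpreter instructions, so without the lock a thread switch in the middle of it could lose updates.

```python
import threading


class Counter:
    """Illustrative shared counter protected by an explicit lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # `self.value += 1` reads, adds, and stores in separate steps;
        # the lock makes the whole sequence atomic with respect to
        # other threads that use the same lock.
        with self._lock:
            self.value += 1


def run(n_threads=4, n_increments=50_000):
    """Increment one shared counter from several threads; return the total."""
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run())  # prints 200000
```

With the lock in place the result is always n_threads * n_increments. Whether the unlocked version visibly loses updates depends on the interpreter version and on timing, which is exactly why it should not be relied upon.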

How to find out all about the GIL yourself

Sometimes reading descriptions does not help; you need to experiment to convince yourself of what is going on. Fortunately, it is easy to experiment with the GIL using Cython, which allows the creation of compiled functions that can optionally release the GIL.

Here are the ingredients that will allow us to experiment with the GIL.

Releasing the GIL

The following Cython function demonstrates a section of code for which the GIL has been released. Note that the C-language printf is used so that there is no need to call back into Python.

from libc.stdio cimport printf

def fn1():
    with nogil:
        printf("No GIL!\n")        

Sleeping to simulate work in the thread

We use the POSIX C library function usleep to block the thread, simulating what in a real-world situation would be some useful computation:

from posix.unistd cimport usleep

def fn2():
    usleep(1_000_000)

Calling back into Python

If a section of Cython code has not released the GIL it can call back into Python. We will do this simply by calling print instead of printf. Note that since print will execute Python instructions, control may switch to another thread!

def workCA():
    printf("A - start\n")
    usleep(1_000_000)
    print("A - mid")  # Python call; print adds its own newline
    usleep(1_000_000)
    printf("A - end\n")

A threaded Python program

In Python we can easily start multiple threads using the threading library as follows:

import threading
import time

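def SA():
    for i in range(5):
        pass  # placeholder: call the function under test here, e.g. workNA()

def SB():
    for i in range(5):
        pass  # placeholder: call the function under test here, e.g. workNB()

t1 = threading.Thread(target=SA)
t2 = threading.Thread(target=SB)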

t1.start()
time.sleep(0.05)
t2.start()

t1.join()
t2.join()

Easy use of Cython during development

I use the following header so that Cython modules can be imported straight into a Python module without a separate compilation step. This makes it easy to develop Cython programs iteratively.

import pyximport; pyximport.install()

Putting it together

Here is the simplest example, which shows that if you release the GIL, multiple threads will run in parallel:

# cimports needed when this is compiled as its own .pyx file
from libc.stdio cimport printf
from posix.unistd cimport usleep

def workNA():
    with nogil:
        printf("A - start\n")
        usleep(1_000_000)
        printf("A - mid\n")
        usleep(1_000_000)
        printf("A - end\n")


def workNB():
    with nogil:
        printf("B - start\n")
        usleep(1_000_000)
        printf("B - mid\n")
        usleep(1_000_000)
        printf("B - end\n")

Five iterations of each function, run in two threads, will complete in about 10 seconds, as opposed to about 20 seconds had the GIL not been released!
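
If you do not have Cython at hand, the same timing effect can be sketched in pure Python, because time.sleep releases the GIL while the thread is blocked. The helpers work and timed_pair below are hypothetical stand-ins for workNA/workNB and the threaded driver, with the pauses shortened from one second to 50 milliseconds so the experiment finishes quickly.

```python
import threading
import time


def work(label, done, iterations=5, pause=0.05):
    """Pure-Python stand-in for workNA/workNB: two blocking pauses per iteration."""
    for i in range(iterations):
        time.sleep(pause)  # time.sleep releases the GIL while blocked
        time.sleep(pause)
        done[label] = i + 1  # each thread writes only its own key


def timed_pair(iterations=5, pause=0.05):
    """Run two workers in separate threads; return (elapsed seconds, progress)."""
    done = {}
    threads = [
        threading.Thread(target=work, args=(label, done, iterations, pause))
        for label in ("A", "B")
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, done


elapsed, done = timed_pair()
print(f"elapsed: {elapsed:.2f} s, iterations completed: {done}")
```

Each thread blocks for 5 * 2 * 0.05 = 0.5 seconds in total, so the elapsed time should be close to 0.5 seconds rather than the 1.0 second that strictly serial execution would take. This mirrors the 10 versus 20 second comparison above at one twentieth of the scale.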

Some production examples to look at

  • Numpy: see e.g. https://numpy.org/doc/stable/reference/c-api/array.html#threading-support

  • The built-in _sre module, which does not release the GIL (see e.g. https://bugs.python.org/issue1366311)