Thread states and the global interpreter lock

Unless on a free-threaded build of CPython, the Python interpreter is not fully thread-safe. To support multi-threaded Python programs, there is a global lock, called the global interpreter lock or GIL, that must be held by the current thread before it can safely access Python objects. Without the lock, even the simplest operations could cause problems in a multi-threaded program: for example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice.

Therefore, the rule exists that only the thread that has acquired the GIL may operate on Python objects or call Python/C API functions. To emulate concurrency of execution, the interpreter regularly tries to switch threads (see sys.setswitchinterval()). The lock is also released around potentially blocking I/O operations, such as reading or writing a file, so that other Python threads can run in the meantime.
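The lost-update hazard and the single-lock remedy described above can be sketched without Python at all. The following is only an illustration, assuming POSIX threads; refcount, big_lock, incref_many() and run_two_incrementers() are hypothetical names standing in for an object's reference count and the GIL, and are not part of CPython.

```c
#include <pthread.h>
#include <stddef.h>

#define INCREMENTS 100000

/* Hypothetical stand-ins: a "reference count" and the one lock guarding it. */
static long refcount = 0;
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

static void *
incref_many(void *unused)
{
    (void)unused;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&big_lock);
        refcount++;   /* read-modify-write: only safe while the lock is held */
        pthread_mutex_unlock(&big_lock);
    }
    return NULL;
}

/* Run two incrementing threads and return the final count (-1 on error). */
long
run_two_incrementers(void)
{
    pthread_t a, b;

    refcount = 0;
    if (pthread_create(&a, NULL, incref_many, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&b, NULL, incref_many, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return refcount;
}
```

Because every increment happens while the lock is held, run_two_incrementers() is expected to return 2 * INCREMENTS. Without the lock calls, some increments could be lost and the result would be unpredictable.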

The Python interpreter keeps some thread-specific bookkeeping information inside a data structure called PyThreadState, known as a thread state. Each OS thread has a thread-local pointer to a PyThreadState; a thread state referenced by this pointer is considered to be attached.

A thread can only have one attached thread state at a time. Having an attached thread state is typically analogous to holding the GIL, except on free-threaded builds. On builds with the GIL enabled, attaching a thread state will block until the GIL can be acquired. However, even on builds with the GIL disabled, an attached thread state is still required to call most of the C API.

In general, there will always be an attached thread state when using Python’s C API. Only in some specific cases (such as inside a Py_BEGIN_ALLOW_THREADS block) will the thread not have an attached thread state. If uncertain, check whether PyThreadState_GetUnchecked() returns NULL.

Detaching the thread state from extension code

Most extension code manipulating the thread state has the following simple structure:

Save the thread state in a local variable.
... Do some blocking I/O operation ...
Restore the thread state from the local variable.

This is so common that a pair of macros exists to simplify it:

Py_BEGIN_ALLOW_THREADS
... Do some blocking I/O operation ...
Py_END_ALLOW_THREADS

The Py_BEGIN_ALLOW_THREADS macro opens a new block and declares a hidden local variable; the Py_END_ALLOW_THREADS macro closes the block.

The block above expands to the following code:

PyThreadState *_save;

_save = PyEval_SaveThread();
... Do some blocking I/O operation ...
PyEval_RestoreThread(_save);

Here is how these functions work:

The attached thread state holds the GIL for the entire interpreter. When the attached thread state is detached, the GIL is released, allowing other threads to attach a thread state to their own thread, acquire the GIL, and start executing. The pointer to the previously attached thread state is stored in a local variable. Upon reaching Py_END_ALLOW_THREADS, the thread state that was previously attached is passed to PyEval_RestoreThread(). This function will block until another thread releases its thread state, allowing the old thread state to be re-attached so that the C API can be called again.

On free-threaded builds, the GIL is normally out of the picture, but detaching the thread state is still required for blocking I/O and long operations. The difference is that threads do not have to wait for the GIL to be released before attaching their thread state, which allows true multi-core parallelism.

Note

Calling system I/O functions is the most common use case for detaching the thread state, but it can also be useful before calling long-running computations which don’t need access to Python objects, such as compression or cryptographic functions operating over memory buffers. For example, the standard zlib and hashlib modules detach the thread state when compressing or hashing data.

APIs

The following macros are normally used without a trailing semicolon; look for example usage in the Python source distribution.

Note

These macros are still necessary on the free-threaded build to prevent deadlocks.

Py_BEGIN_ALLOW_THREADS
Part of the Stable ABI.

This macro expands to { PyThreadState *_save; _save = PyEval_SaveThread();. Note that it contains an opening brace; it must be matched with a following Py_END_ALLOW_THREADS macro. See above for further discussion of this macro.

Py_END_ALLOW_THREADS
Part of the Stable ABI.

This macro expands to PyEval_RestoreThread(_save); }. Note that it contains a closing brace; it must be matched with an earlier Py_BEGIN_ALLOW_THREADS macro. See above for further discussion of this macro.

Py_BLOCK_THREADS
Part of the Stable ABI.

This macro expands to PyEval_RestoreThread(_save);: it is equivalent to Py_END_ALLOW_THREADS without the closing brace.

Py_UNBLOCK_THREADS
Part of the Stable ABI.

This macro expands to _save = PyEval_SaveThread();: it is equivalent to Py_BEGIN_ALLOW_THREADS without the opening brace and variable declaration.

Non-Python created threads

When threads are created using the dedicated Python APIs (such as the threading module), a thread state is automatically associated with them and the code shown above is therefore correct. However, when threads are created from C (for example by a third-party library with its own thread management), they don’t hold the GIL, because they don’t have an attached thread state.

If you need to call Python code from these threads (often this will be part of a callback API provided by the aforementioned third-party library), you must first register these threads with the interpreter by creating an attached thread state before you can start using the Python/C API. When you are done, you should detach the thread state, and finally free it.

The PyGILState_Ensure() and PyGILState_Release() functions do all of the above automatically. The typical idiom for calling into Python from a C thread is:

PyGILState_STATE gstate;
gstate = PyGILState_Ensure();

/* Perform Python actions here. */
result = CallSomeFunction();
/* evaluate result or handle exception */

/* Release the thread. No Python API allowed beyond this point. */
PyGILState_Release(gstate);

Note that the PyGILState_* functions assume there is only one global interpreter (created automatically by Py_Initialize()). Python supports the creation of additional interpreters (using Py_NewInterpreter()), but mixing multiple interpreters and the PyGILState_* API is unsupported. This is because PyGILState_Ensure() and similar functions default to attaching a thread state for the main interpreter, meaning that the thread can’t safely interact with the calling subinterpreter.

Supporting subinterpreters in non-Python threads

If you would like to support subinterpreters with non-Python created threads, you must use the PyThreadState_* API instead of the traditional PyGILState_* API.

In particular, you must store the interpreter state from the calling function and pass it to PyThreadState_New(), which will ensure that the thread state is targeting the correct interpreter:

/* The return value of PyInterpreterState_Get() from the
   function that created this thread. */
PyInterpreterState *interp = ThreadData->interp;
PyThreadState *tstate = PyThreadState_New(interp);
PyThreadState_Swap(tstate);

/* The GIL of the subinterpreter is now held.
   Perform Python actions here. */
result = CallSomeFunction();
/* evaluate result or handle exception */

/* Destroy the thread state. No Python API allowed beyond this point. */
PyThreadState_Clear(tstate);
PyThreadState_DeleteCurrent();

Cautions about fork()

Another important thing to note about threads is their behaviour in the face of the C fork() call. On most systems with fork(), after a process forks only the thread that issued the fork will exist. This has a concrete impact both on how locks must be handled and on all stored state in CPython’s runtime.

The fact that only the “current” thread remains means any locks held by other threads will never be released. Python solves this for os.fork() by acquiring the locks it uses internally before the fork, and releasing them afterwards. In addition, it resets any Lock objects in the child. When extending or embedding Python, there is no way to inform Python of additional (non-Python) locks that need to be acquired before or reset after a fork. OS facilities such as pthread_atfork() would need to be used to accomplish the same thing. Additionally, when extending or embedding Python, calling fork() directly rather than through os.fork() (and returning to or calling into Python) may result in a deadlock by one of Python’s internal locks being held by a thread that is defunct after the fork. PyOS_AfterFork_Child() tries to reset the necessary locks, but is not always able to.
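For a lock owned by an extension or an embedding application, the pthread_atfork() approach mentioned above might look like the following sketch. It assumes POSIX threads; lib_lock and the handler names are hypothetical. It takes the lock before the fork and releases it on both sides afterwards, so that the child never inherits a lock held by a thread that no longer exists.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical non-Python lock owned by a library. */
static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;

static void before_fork(void)       { pthread_mutex_lock(&lib_lock); }
static void after_fork_parent(void) { pthread_mutex_unlock(&lib_lock); }
static void after_fork_child(void)  { pthread_mutex_unlock(&lib_lock); }

/* Register the handlers once, e.g. at library initialization.
   Returns 0 on success. */
int
install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Fork, and let the child report whether it can take the lock.
   Returns the child's exit status (0 means the lock was usable),
   or -1 on error. */
int
fork_and_check_lock(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        int ok = (pthread_mutex_trylock(&lib_lock) == 0);
        _exit(ok ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

This only covers locks the library itself knows about; it does nothing for Python’s internal locks, which are handled by os.fork() as described above.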

The fact that all other threads go away also means that CPython’s runtime state there must be cleaned up properly, which os.fork() does. This means finalizing all other PyThreadState objects belonging to the current interpreter and all other PyInterpreterState objects. Due to this and the special nature of the “main” interpreter, fork() should only be called in that interpreter’s “main” thread, where the CPython global runtime was originally initialized. The only exception is if exec() will be called immediately after.

High-level APIs

These are the most commonly used types and functions when writing multi-threaded C extensions.

type PyThreadState
Part of the Limited API (as an opaque struct).

This data structure represents the state of a single thread. The only public data member is:

PyInterpreterState *interp

This thread’s interpreter state.

void PyEval_InitThreads()
Part of the Stable ABI.

Deprecated function which does nothing.

In Python 3.6 and older, this function created the GIL if it didn’t exist.

Changed in version 3.9: The function now does nothing.

Changed in version 3.7: This function is now called by Py_Initialize(), so you don’t have to call it yourself anymore.

Changed in version 3.2: This function cannot be called before Py_Initialize() anymore.

Deprecated since version 3.9.

PyThreadState *PyEval_SaveThread()
Part of the Stable ABI.

Detach the attached thread state and return it. The thread will have no thread state upon returning.

void PyEval_RestoreThread(PyThreadState *tstate)
Part of the Stable ABI.

Set the attached thread state to tstate. The passed thread state should not be attached, otherwise deadlock ensues. tstate will be attached upon returning.

Note

Calling this function from a thread when the runtime is finalizing will hang the thread until the program exits, even if the thread was not created by Python. Refer to Cautions regarding runtime finalization for more details.

Changed in version 3.14: Hangs the current thread, rather than terminating it, if called while the interpreter is finalizing.

PyThreadState *PyThreadState_Get()
Part of the Stable ABI.

Return the attached thread state. If the thread has no attached thread state (such as when inside a Py_BEGIN_ALLOW_THREADS block), this issues a fatal error (so that the caller needn’t check for NULL).

See also PyThreadState_GetUnchecked().

PyThreadState *PyThreadState_GetUnchecked()

Similar to PyThreadState_Get(), but does not kill the process with a fatal error if there is no attached thread state; it returns NULL instead. The caller is responsible for checking whether the result is NULL.

Added in version 3.13: In Python 3.5 to 3.12, the function was private and known as _PyThreadState_UncheckedGet().

PyThreadState *PyThreadState_Swap(PyThreadState *tstate)
Part of the Stable ABI.

Set the attached thread state to tstate, and return the thread state that was attached prior to calling.

This function is safe to call without an attached thread state; it will simply return NULL indicating that there was no prior thread state.

Note

Similar to PyGILState_Ensure(), this function will hang the thread if the runtime is finalizing.

GIL-state APIs

The following functions use thread-local storage, and are not compatible with sub-interpreters:

type PyGILState_STATE
Part of the Stable ABI.

The type of the value returned by PyGILState_Ensure() and passed to PyGILState_Release().

enumerator PyGILState_LOCKED

The GIL was already held when PyGILState_Ensure() was called.

enumerator PyGILState_UNLOCKED

The GIL was not held when PyGILState_Ensure() was called.

PyGILState_STATE PyGILState_Ensure()
Part of the Stable ABI.

Ensure that the current thread is ready to call the Python C API regardless of the current state of Python, or of the attached thread state. This may be called as many times as desired by a thread as long as each call is matched with a call to PyGILState_Release(). In general, other thread-related APIs may be used between PyGILState_Ensure() and PyGILState_Release() calls as long as the thread state is restored to its previous state before the Release(). For example, normal usage of the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros is acceptable.

The return value is an opaque “handle” to the attached thread state when PyGILState_Ensure() was called, and must be passed to PyGILState_Release() to ensure Python is left in the same state. Even though recursive calls are allowed, these handles cannot be shared - each unique call to PyGILState_Ensure() must save the handle for its call to PyGILState_Release().

When the function returns, there will be an attached thread state and the thread will be able to call arbitrary Python code. Failure is a fatal error.

Warning

Calling this function when the runtime is finalizing is unsafe. Doing so will either hang the thread until the program ends, or fully crash the interpreter in rare cases. Refer to Cautions regarding runtime finalization for more details.

Changed in version 3.14: Hangs the current thread, rather than terminating it, if called while the interpreter is finalizing.

void PyGILState_Release(PyGILState_STATE)
Part of the Stable ABI.

Release any resources previously acquired. After this call, Python’s state will be the same as it was prior to the corresponding PyGILState_Ensure() call (but generally this state will be unknown to the caller, hence the use of the GILState API).

Every call to PyGILState_Ensure() must be matched by a call to PyGILState_Release() on the same thread.

PyThreadState *PyGILState_GetThisThreadState()
Part of the Stable ABI.

Get the attached thread state for this thread. May return NULL if no GILState API has been used on the current thread. Note that the main thread always has such a thread-state, even if no auto-thread-state call has been made on the main thread. This is mainly a helper/diagnostic function.

Note

This function may return non-NULL even when the thread state is detached. Prefer PyThreadState_Get() or PyThreadState_GetUnchecked() for most cases.

See also

PyThreadState_Get()

int PyGILState_Check()

Return 1 if the current thread is holding the GIL and 0 otherwise. This function can be called from any thread at any time. Only if it has had its thread state initialized via PyGILState_Ensure() will it return 1. This is mainly a helper/diagnostic function. It can be useful for example in callback contexts or memory allocation functions when knowing that the GIL is locked can allow the caller to perform sensitive actions or otherwise behave differently.

Note

If the current Python process has ever created a subinterpreter, this function will always return 1. Prefer PyThreadState_GetUnchecked() for most cases.

Added in version 3.4.

Low-level APIs

PyThreadState *PyThreadState_New(PyInterpreterState *interp)
Part of the Stable ABI.

Create a new thread state object belonging to the given interpreter object. An attached thread state is not needed.

void PyThreadState_Clear(PyThreadState *tstate)
Part of the Stable ABI.

Reset all information in a thread state object. tstate must be attached.

Changed in version 3.9: This function now calls the PyThreadState.on_delete callback. Previously, that happened in PyThreadState_Delete().

Changed in version 3.13: The PyThreadState.on_delete callback was removed.

void PyThreadState_Delete(PyThreadState *tstate)
Part of the Stable ABI.

Destroy a thread state object. tstate should not be attached to any thread. tstate must have been reset with a previous call to PyThreadState_Clear().

void PyThreadState_DeleteCurrent(void)

Detach the attached thread state (which must have been reset with a previous call to PyThreadState_Clear()) and then destroy it.

No thread state will be attached upon returning.

PyFrameObject *PyThreadState_GetFrame(PyThreadState *tstate)
Part of the Stable ABI since version 3.10.

Get the current frame of the Python thread state tstate.

Return a strong reference. Return NULL if no frame is currently executing.

See also PyEval_GetFrame().

tstate must not be NULL, and must be attached.

Added in version 3.9.

uint64_t PyThreadState_GetID(PyThreadState *tstate)
Part of the Stable ABI since version 3.10.

Get the unique thread state identifier of the Python thread state tstate.

tstate must not be NULL, and must be attached.

Added in version 3.9.

PyInterpreterState *PyThreadState_GetInterpreter(PyThreadState *tstate)
Part of the Stable ABI since version 3.10.

Get the interpreter of the Python thread state tstate.

tstate must not be NULL, and must be attached.

Added in version 3.9.

void PyThreadState_EnterTracing(PyThreadState *tstate)

Suspend tracing and profiling in the Python thread state tstate.

Resume them using the PyThreadState_LeaveTracing() function.

Added in version 3.11.

void PyThreadState_LeaveTracing(PyThreadState *tstate)

Resume tracing and profiling in the Python thread state tstate suspended by the PyThreadState_EnterTracing() function.

See also PyEval_SetTrace() and PyEval_SetProfile() functions.

Added in version 3.11.

int PyUnstable_ThreadState_SetStackProtection(PyThreadState *tstate, void *stack_start_addr, size_t stack_size)
This is Unstable API. It may change without warning in minor releases.

Set the stack protection start address and stack protection size of a Python thread state.

On success, return 0. On failure, set an exception and return -1.

CPython implements recursion control for C code by raising RecursionError when it notices that the machine execution stack is close to overflow. See for example the Py_EnterRecursiveCall() function. For this, it needs to know the location of the current thread’s stack, which it normally gets from the operating system. When the stack is changed, for example using context switching techniques like the Boost library’s boost::context, you must call PyUnstable_ThreadState_SetStackProtection() to inform CPython of the change.

Call PyUnstable_ThreadState_SetStackProtection() either before or after changing the stack. Do not call any other Python C API between the call and the stack change.

See PyUnstable_ThreadState_ResetStackProtection() for undoing this operation.

Added in version 3.15.

void PyUnstable_ThreadState_ResetStackProtection(PyThreadState *tstate)
This is Unstable API. It may change without warning in minor releases.

Reset the stack protection start address and stack protection size of a Python thread state to the operating system defaults.

See PyUnstable_ThreadState_SetStackProtection() for an explanation.

Added in version 3.15.

PyObject *PyThreadState_GetDict()
Return value: Borrowed reference. Part of the Stable ABI.

Return a dictionary in which extensions can store thread-specific state information. Each extension should use a unique key to store its state in the dictionary. It is okay to call this function when no thread state is attached. If this function returns NULL, no exception has been raised and the caller should assume no thread state is attached.

void PyEval_AcquireThread(PyThreadState *tstate)
Part of the Stable ABI.

Attach tstate to the current thread, which must not be NULL or already attached.

The calling thread must not already have an attached thread state.

Note

Calling this function from a thread when the runtime is finalizing will hang the thread until the program exits, even if the thread was not created by Python. Refer to Cautions regarding runtime finalization for more details.

Changed in version 3.8: Updated to be consistent with PyEval_RestoreThread(), Py_END_ALLOW_THREADS(), and PyGILState_Ensure(), and terminate the current thread if called while the interpreter is finalizing.

Changed in version 3.14: Hangs the current thread, rather than terminating it, if called while the interpreter is finalizing.

PyEval_RestoreThread() is a higher-level function which is always available (even when threads have not been initialized).

void PyEval_ReleaseThread(PyThreadState *tstate)
Part of the Stable ABI.

Detach the attached thread state. The tstate argument, which must not be NULL, is only used to check that it represents the attached thread state — if it isn’t, a fatal error is reported.

PyEval_SaveThread() is a higher-level function which is always available (even when threads have not been initialized).

Asynchronous notifications

A mechanism is provided to make asynchronous notifications to the main interpreter thread. These notifications take the form of a function pointer and a void pointer argument.

int Py_AddPendingCall(int (*func)(void*), void *arg)
Part of the Stable ABI.

Schedule a function to be called from the main interpreter thread. On success, 0 is returned and func is queued for being called in the main thread. On failure, -1 is returned without setting any exception.

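When successfully queued, func will eventually be called from the main interpreter thread with the argument arg. It will be called asynchronously with respect to normally running Python code, but with both these conditions met:

on a bytecode boundary;
with the main thread holding an attached thread state (func can therefore use the full C API).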

func must return 0 on success, or -1 on failure with an exception set. func won’t be interrupted to perform another asynchronous notification recursively, but it can still be interrupted to switch threads if the thread state is detached.

This function doesn’t need an attached thread state. However, to call this function in a subinterpreter, the caller must have an attached thread state. Otherwise, the function func can be scheduled to be called from the wrong interpreter.

Warning

This is a low-level function, only useful for very special cases. There is no guarantee that func will be called as quickly as possible. If the main thread is busy executing a system call, func won’t be called before the system call returns. This function is generally not suitable for calling Python code from arbitrary C threads. Instead, use the PyGILState API.

Added in version 3.1.

Changed in version 3.9: If this function is called in a subinterpreter, the function func is now scheduled to be called from the subinterpreter, rather than being called from the main interpreter. Each subinterpreter now has its own list of scheduled calls.

Changed in version 3.12: This function now always schedules func to be run in the main interpreter.

int Py_MakePendingCalls(void)
Part of the Stable ABI.

Execute all pending calls. This is usually executed automatically by the interpreter.

This function returns 0 on success, and returns -1 with an exception set on failure.

If this is not called in the main thread of the main interpreter, this function does nothing and returns 0. The caller must hold an attached thread state.

Added in version 3.1.

Changed in version 3.12: This function only runs pending calls in the main interpreter.

int PyThreadState_SetAsyncExc(unsigned long id, PyObject *exc)
Part of the Stable ABI.

Asynchronously raise an exception in a thread. The id argument is the thread id of the target thread; exc is the exception object to be raised. This function does not steal any references to exc. To prevent naive misuse, you must write your own C extension to call this. Must be called with an attached thread state. Returns the number of thread states modified; this is normally one, but will be zero if the thread id isn’t found. If exc is NULL, the pending exception (if any) for the thread is cleared. This raises no exceptions.

Changed in version 3.7: The type of the id parameter changed from long to unsigned long.

Operating system thread APIs

PYTHREAD_INVALID_THREAD_ID

Sentinel value for an invalid thread ID.

This is currently equivalent to (unsigned long)-1.

unsigned long PyThread_start_new_thread(void (*func)(void*), void *arg)
Part of the Stable ABI.

Start function func in a new thread with argument arg. The resulting thread is not intended to be joined.

func must not be NULL, but arg may be NULL.

On success, this function returns the identifier of the new thread; on failure, this returns PYTHREAD_INVALID_THREAD_ID.

The caller does not need to hold an attached thread state.

unsigned long PyThread_get_thread_ident(void)
Part of the Stable ABI.

Return the identifier of the current thread, which will never be zero.

This function cannot fail, and the caller does not need to hold an attached thread state.

See also

threading.get_ident()

PyObject *PyThread_GetInfo(void)
Part of the Stable ABI since version 3.3.

Get general information about the current thread in the form of a struct sequence object. This information is accessible as sys.thread_info in Python.

On success, this returns a new strong reference to the thread information; on failure, this returns NULL with an exception set.

The caller must hold an attached thread state.

PY_HAVE_THREAD_NATIVE_ID

This macro is defined when the system supports native thread IDs.

unsigned long PyThread_get_thread_native_id(void)
Part of the Stable ABI on platforms with native thread IDs.

Get the native identifier of the current thread as it was assigned by the operating system’s kernel, which will never be less than zero.

This function is only available when PY_HAVE_THREAD_NATIVE_ID is defined.

This function cannot fail, and the caller does not need to hold an attached thread state.

void PyThread_exit_thread(void)
Part of the Stable ABI.

Terminate the current thread. This function is generally considered unsafe and should be avoided. It is kept solely for backwards compatibility.

This function is only safe to call if all functions in the full call stack are written to safely allow it.

Warning

If the current system uses POSIX threads (also known as “pthreads”), this calls pthread_exit(3), which attempts to unwind the stack and call C++ destructors on some libc implementations. However, if a noexcept function is reached, it may terminate the process. Other systems, such as macOS, do unwinding.

On Windows, this function calls _endthreadex(), which kills the thread without calling C++ destructors.

In any case, there is a risk of corruption on the thread’s stack.

Deprecated since version 3.14.

void PyThread_init_thread(void)
Part of the Stable ABI.

Initialize PyThread* APIs. Python executes this function automatically, so there’s little need to call it from an extension module.

int PyThread_set_stacksize(size_t size)
Part of the Stable ABI.

Set the stack size of the current thread to size bytes.

This function returns 0 on success, -1 if size is invalid, or -2 if the system does not support changing the stack size. This function does not set exceptions.

The caller does not need to hold an attached thread state.

size_t PyThread_get_stacksize(void)
Part of the Stable ABI.

Return the stack size of the current thread in bytes, or 0 if the system’s default stack size is in use.

The caller does not need to hold an attached thread state.