From c5d0db796f846403200529c63c88286608525681 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 7 Aug 2023 15:40:59 -0600 Subject: [PATCH] Clarify Usage of "Reference Count" In the Docs PEP 683 (immortal objects) revealed some ways in which the Python documentation has been unnecessarily coupled to the implementation details of reference counts. In the end users should focus on reference ownership, including taking references and releasing them, rather than on how many reference counts an object has. This change updates the documentation to reflect that perspective. It also updates the docs relative to immortal objects in a handful of places. --- Doc/c-api/allocation.rst | 13 +++++---- Doc/c-api/arg.rst | 17 +++++++---- Doc/c-api/buffer.rst | 7 +++-- Doc/c-api/bytes.rst | 4 +-- Doc/c-api/exceptions.rst | 3 +- Doc/c-api/intro.rst | 59 ++++++++++++++++++++++----------------- Doc/c-api/module.rst | 2 +- Doc/c-api/object.rst | 13 +++++---- Doc/c-api/refcounting.rst | 47 ++++++++++++++++++++----------- Doc/c-api/sys.rst | 5 ++-- Doc/c-api/typeobj.rst | 10 ++++--- Doc/c-api/unicode.rst | 12 ++++---- Doc/glossary.rst | 11 +++++--- Doc/library/sys.rst | 3 ++ 14 files changed, 123 insertions(+), 83 deletions(-) diff --git a/Doc/c-api/allocation.rst b/Doc/c-api/allocation.rst index 44747e29643661..b3609c233156b6 100644 --- a/Doc/c-api/allocation.rst +++ b/Doc/c-api/allocation.rst @@ -29,12 +29,13 @@ Allocating Objects on the Heap .. c:macro:: PyObject_New(TYPE, typeobj) - Allocate a new Python object using the C structure type *TYPE* and the - Python type object *typeobj* (``PyTypeObject*``). - Fields not defined by the Python object header - are not initialized; the object's reference count will be one. The size of - the memory allocation is determined from the :c:member:`~PyTypeObject.tp_basicsize` field of - the type object. + Allocate a new Python object using the C structure type *TYPE* + and the Python type object *typeobj* (``PyTypeObject*``). + Fields not defined by the Python object header are not initialized. + The caller will own the only reference to the object + (i.e. its reference count will be one). + The size of the memory allocation is determined from the + :c:member:`~PyTypeObject.tp_basicsize` field of the type object. .. c:macro:: PyObject_NewVar(TYPE, typeobj, size) diff --git a/Doc/c-api/arg.rst b/Doc/c-api/arg.rst index dfbec82c457bc3..08d6cf788e299e 100644 --- a/Doc/c-api/arg.rst +++ b/Doc/c-api/arg.rst @@ -330,8 +330,10 @@ Other objects ``O`` (object) [PyObject \*] Store a Python object (without any conversion) in a C object pointer. The C - program thus receives the actual object that was passed. The object's reference - count is not increased. The pointer stored is not ``NULL``. + program thus receives the actual object that was passed. A new + :term:`strong reference` to the object is not created + (i.e. its reference count is not increased). + The pointer stored is not ``NULL``. ``O!`` (object) [*typeobject*, PyObject \*] Store a Python object in a C object pointer. This is similar to ``O``, but @@ -415,7 +417,8 @@ inside nested parentheses. They are: mutually exclude each other. Note that any Python object references which are provided to the caller are -*borrowed* references; do not decrement their reference count! +*borrowed* references; do not release them +(i.e. do not decrement their reference count)! Additional arguments passed to these functions must be addresses of variables whose type is determined by the format string; these are used to store values @@ -650,8 +653,10 @@ Building values Convert a C :c:type:`Py_complex` structure to a Python complex number. ``O`` (object) [PyObject \*] - Pass a Python object untouched (except for its reference count, which is - incremented by one). If the object passed in is a ``NULL`` pointer, it is assumed + Pass a Python object untouched but create a new + :term:`strong reference` to it + (i.e. its reference count is incremented by one). + If the object passed in is a ``NULL`` pointer, it is assumed that this was caused because the call producing the argument found an error and set an exception. Therefore, :c:func:`Py_BuildValue` will return ``NULL`` but won't raise an exception. If no exception has been raised yet, :exc:`SystemError` is @@ -661,7 +666,7 @@ Building values Same as ``O``. ``N`` (object) [PyObject \*] - Same as ``O``, except it doesn't increment the reference count on the object. + Same as ``O``, except it doesn't create a new :term:`strong reference`. Useful when the object is created by a call to an object constructor in the argument list. diff --git a/Doc/c-api/buffer.rst b/Doc/c-api/buffer.rst index 02b53ec149c733..8ca1c190dab9a9 100644 --- a/Doc/c-api/buffer.rst +++ b/Doc/c-api/buffer.rst @@ -102,7 +102,9 @@ a buffer, see :c:func:`PyObject_GetBuffer`. .. c:member:: PyObject *obj A new reference to the exporting object. The reference is owned by - the consumer and automatically decremented and set to ``NULL`` by + the consumer and automatically released + (i.e. reference count decremented) + and set to ``NULL`` by :c:func:`PyBuffer_Release`. The field is the equivalent of the return value of any standard C-API function. @@ -454,7 +456,8 @@ Buffer-related functions .. c:function:: void PyBuffer_Release(Py_buffer *view) - Release the buffer *view* and decrement the reference count for + Release the buffer *view* and release the :term:`strong reference` + (i.e. decrement the reference count) to the view's supporting object, ``view->obj``. This function MUST be called when the buffer is no longer being used, otherwise reference leaks may occur. diff --git a/Doc/c-api/bytes.rst b/Doc/c-api/bytes.rst index 110a98e19d8237..33b7d23ff85a3d 100644 --- a/Doc/c-api/bytes.rst +++ b/Doc/c-api/bytes.rst @@ -187,8 +187,8 @@ called with a non-bytes parameter. .. c:function:: void PyBytes_ConcatAndDel(PyObject **bytes, PyObject *newpart) Create a new bytes object in *\*bytes* containing the contents of *newpart* - appended to *bytes*. This version decrements the reference count of - *newpart*. + appended to *bytes*. This version releases the :term:`strong reference` + to *newpart* (i.e. decrements its reference count). .. c:function:: int _PyBytes_Resize(PyObject **bytes, Py_ssize_t newsize) diff --git a/Doc/c-api/exceptions.rst b/Doc/c-api/exceptions.rst index 8d8b0ac7dd9a93..ae45aaef238fec 100644 --- a/Doc/c-api/exceptions.rst +++ b/Doc/c-api/exceptions.rst @@ -99,7 +99,8 @@ For convenience, some of these functions will always return a This is the most common way to set the error indicator. The first argument specifies the exception type; it is normally one of the standard exceptions, - e.g. :c:data:`PyExc_RuntimeError`. You need not increment its reference count. + e.g. :c:data:`PyExc_RuntimeError`. You need not create a new + :term:`strong reference` to it (e.g. with :c:func:`Py_INCREF`). The second argument is an error message; it is decoded from ``'utf-8'``. diff --git a/Doc/c-api/intro.rst b/Doc/c-api/intro.rst index a3ccfa08baf706..ec2d8bc0ef02ef 100644 --- a/Doc/c-api/intro.rst +++ b/Doc/c-api/intro.rst @@ -287,52 +287,58 @@ true if (and only if) the object pointed to by *a* is a Python list. Reference Counts ---------------- -The reference count is important because today's computers have a finite (and -often severely limited) memory size; it counts how many different places there -are that have a reference to an object. Such a place could be another object, -or a global (or static) C variable, or a local variable in some C function. -When an object's reference count becomes zero, the object is deallocated. If -it contains references to other objects, their reference count is decremented. -Those other objects may be deallocated in turn, if this decrement makes their -reference count become zero, and so on. (There's an obvious problem with -objects that reference each other here; for now, the solution is "don't do -that.") +The reference count is important because today's computers have a finite +(and often severely limited) memory size; it counts how many different +places there are that have a :term:`strong reference` to an object. +Such a place could be another object, or a global (or static) C variable, +or a local variable in some C function. +When the last :term:`strong reference` to an object is released +(i.e. its reference count becomes zero), the object is deallocated. +If it contains references to other objects, those references are released. +Those other objects may be deallocated in turn, if there are no more +references to them, and so on. (There's an obvious problem with +objects that reference each other here; for now, the solution +is "don't do that.") .. index:: single: Py_INCREF() single: Py_DECREF() -Reference counts are always manipulated explicitly. The normal way is to use -the macro :c:func:`Py_INCREF` to increment an object's reference count by one, -and :c:func:`Py_DECREF` to decrement it by one. The :c:func:`Py_DECREF` macro +Reference counts are always manipulated explicitly. The normal way is +to use the macro :c:func:`Py_INCREF` to take a new reference to an +object (i.e. increment its reference count by one), +and :c:func:`Py_DECREF` to release that reference (i.e. decrement the +reference count by one). The :c:func:`Py_DECREF` macro is considerably more complex than the incref one, since it must check whether the reference count becomes zero and then cause the object's deallocator to be -called. The deallocator is a function pointer contained in the object's type -structure. The type-specific deallocator takes care of decrementing the -reference counts for other objects contained in the object if this is a compound +called. The deallocator is a function pointer contained in the object's type +structure. The type-specific deallocator takes care of releasing references +for other objects contained in the object if this is a compound object type, such as a list, as well as performing any additional finalization that's needed. There's no chance that the reference count can overflow; at least as many bits are used to hold the reference count as there are distinct memory locations in virtual memory (assuming ``sizeof(Py_ssize_t) >= sizeof(void*)``). Thus, the reference count increment is a simple operation. -It is not necessary to increment an object's reference count for every local -variable that contains a pointer to an object. In theory, the object's +It is not necessary to hold a :term:`strong reference` (i.e. increment +the reference count) for every local variable that contains a pointer +to an object. In theory, the object's reference count goes up by one when the variable is made to point to it and it goes down by one when the variable goes out of scope. However, these two cancel each other out, so at the end the reference count hasn't changed. The only real reason to use the reference count is to prevent the object from being deallocated as long as our variable is pointing to it. If we know that there is at least one other reference to the object that lives at least as long as -our variable, there is no need to increment the reference count temporarily. +our variable, there is no need to take a new :term:`strong reference` +(i.e. increment the reference count) temporarily. An important situation where this arises is in objects that are passed as arguments to C functions in an extension module that are called from Python; the call mechanism guarantees to hold a reference to every argument for the duration of the call. However, a common pitfall is to extract an object from a list and hold on to it -for a while without incrementing its reference count. Some other operation might -conceivably remove the object from the list, decrementing its reference count +for a while without taking a new reference. Some other operation might +conceivably remove the object from the list, releasing that reference, and possibly deallocating it. The real danger is that innocent-looking operations may invoke arbitrary Python code which could do this; there is a code path which allows control to flow back to the user from a :c:func:`Py_DECREF`, so @@ -340,7 +346,8 @@ almost any operation is potentially dangerous. A safe approach is to always use the generic operations (functions whose name begins with ``PyObject_``, ``PyNumber_``, ``PySequence_`` or ``PyMapping_``). -These operations always increment the reference count of the object they return. +These operations always create a new :term:`strong reference` +(i.e. increment the reference count) of the object they return. This leaves the caller with the responsibility to call :c:func:`Py_DECREF` when they are done with the result; this soon becomes second nature. @@ -356,7 +363,7 @@ to objects (objects are not owned: they are always shared). "Owning a reference" means being responsible for calling Py_DECREF on it when the reference is no longer needed. Ownership can also be transferred, meaning that the code that receives ownership of the reference then becomes responsible for -eventually decref'ing it by calling :c:func:`Py_DECREF` or :c:func:`Py_XDECREF` +eventually releasing it by calling :c:func:`Py_DECREF` or :c:func:`Py_XDECREF` when it's no longer needed---or passing on this responsibility (usually to its caller). When a function passes ownership of a reference on to its caller, the caller is said to receive a *new* reference. When no ownership is transferred, @@ -414,9 +421,9 @@ For example, the above two blocks of code could be replaced by the following It is much more common to use :c:func:`PyObject_SetItem` and friends with items whose references you are only borrowing, like arguments that were passed in to -the function you are writing. In that case, their behaviour regarding reference -counts is much saner, since you don't have to increment a reference count so you -can give a reference away ("have it be stolen"). For example, this function +the function you are writing. In that case, their behaviour regarding references +is much saner, since you don't have to take a new reference just so you +can give that reference away ("have it be stolen"). For example, this function sets all items of a list (actually, any mutable sequence) to a given item:: int diff --git a/Doc/c-api/module.rst b/Doc/c-api/module.rst index 8ca48c852d4e6f..6ef4eea6a07f73 100644 --- a/Doc/c-api/module.rst +++ b/Doc/c-api/module.rst @@ -498,7 +498,7 @@ state: .. note:: Unlike other functions that steal references, ``PyModule_AddObject()`` - only decrements the reference count of *value* **on success**. + only releases the reference to *value* **on success**. This means that its return value must be checked, and calling code must :c:func:`Py_DECREF` *value* manually on error. diff --git a/Doc/c-api/object.rst b/Doc/c-api/object.rst index e7d80dbf8c5c13..bc67fb60dc8bec 100644 --- a/Doc/c-api/object.rst +++ b/Doc/c-api/object.rst @@ -15,8 +15,8 @@ Object Protocol .. c:macro:: Py_RETURN_NOTIMPLEMENTED Properly handle returning :c:data:`Py_NotImplemented` from within a C - function (that is, increment the reference count of NotImplemented and - return it). + function (that is, create a new :term:`strong reference` + to NotImplemented and return it). .. c:function:: int PyObject_Print(PyObject *o, FILE *fp, int flags) @@ -320,11 +320,12 @@ Object Protocol When *o* is non-``NULL``, returns a type object corresponding to the object type of object *o*. On failure, raises :exc:`SystemError` and returns ``NULL``. This - is equivalent to the Python expression ``type(o)``. This function increments the - reference count of the return value. There's really no reason to use this + is equivalent to the Python expression ``type(o)``. + This function creates a new :term:`strong reference` to the return value. + There's really no reason to use this function instead of the :c:func:`Py_TYPE()` function, which returns a - pointer of type :c:expr:`PyTypeObject*`, except when the incremented reference - count is needed. + pointer of type :c:expr:`PyTypeObject*`, except when a new + :term:`strong reference` is needed. .. c:function:: int PyObject_TypeCheck(PyObject *o, PyTypeObject *type) diff --git a/Doc/c-api/refcounting.rst b/Doc/c-api/refcounting.rst index af25e8bb913685..e1782ab27f4f66 100644 --- a/Doc/c-api/refcounting.rst +++ b/Doc/c-api/refcounting.rst @@ -13,31 +13,36 @@ objects. .. c:function:: void Py_INCREF(PyObject *o) - Increment the reference count for object *o*. + Indicate taking a new :term:`strong reference` to object *o*, + indicating it is in use and should not be destroyed. This function is usually used to convert a :term:`borrowed reference` to a :term:`strong reference` in-place. The :c:func:`Py_NewRef` function can be used to create a new :term:`strong reference`. + When done using the object, release it by calling :c:func:`Py_DECREF`. + The object must not be ``NULL``; if you aren't sure that it isn't ``NULL``, use :c:func:`Py_XINCREF`. + Do not expect this function to actually modify *o* in any way. + .. c:function:: void Py_XINCREF(PyObject *o) - Increment the reference count for object *o*. The object may be ``NULL``, in - which case the macro has no effect. + Similar to :c:func:`Py_INCREF`, but the object *o* can be ``NULL``, + in which case this has no effect. See also :c:func:`Py_XNewRef`. .. c:function:: PyObject* Py_NewRef(PyObject *o) - Create a new :term:`strong reference` to an object: increment the reference - count of the object *o* and return the object *o*. + Create a new :term:`strong reference` to an object: + call :c:func:`Py_INCREF` on *o* and return the object *o*. When the :term:`strong reference` is no longer needed, :c:func:`Py_DECREF` - should be called on it to decrement the object reference count. + should be called on it to release the reference. The object *o* must not be ``NULL``; use :c:func:`Py_XNewRef` if *o* can be ``NULL``. @@ -67,9 +72,12 @@ objects. .. c:function:: void Py_DECREF(PyObject *o) - Decrement the reference count for object *o*. + Release a :term:`strong reference` to object *o*, indicating the + reference is no longer used. - If the reference count reaches zero, the object's type's deallocation + Once the last :term:`strong reference` is released + (i.e. the object's reference count reaches 0), + the object's type's deallocation function (which must not be ``NULL``) is invoked. This function is usually used to delete a :term:`strong reference` before @@ -78,6 +86,8 @@ objects. The object must not be ``NULL``; if you aren't sure that it isn't ``NULL``, use :c:func:`Py_XDECREF`. + Do not expect this function to actually modify *o* in any way. + .. warning:: The deallocation function can cause arbitrary Python code to be invoked (e.g. @@ -92,32 +102,35 @@ objects. .. c:function:: void Py_XDECREF(PyObject *o) - Decrement the reference count for object *o*. The object may be ``NULL``, in - which case the macro has no effect; otherwise the effect is the same as for - :c:func:`Py_DECREF`, and the same warning applies. + Similar to :c:func:`Py_DECREF`, but the object *o* can be ``NULL``, + in which case this has no effect. + The same warning from :c:func:`Py_DECREF` applies here as well. .. c:function:: void Py_CLEAR(PyObject *o) - Decrement the reference count for object *o*. The object may be ``NULL``, in + Release a :term:`strong reference` for object *o*. + The object may be ``NULL``, in which case the macro has no effect; otherwise the effect is the same as for :c:func:`Py_DECREF`, except that the argument is also set to ``NULL``. The warning for :c:func:`Py_DECREF` does not apply with respect to the object passed because the macro carefully uses a temporary variable and sets the argument to ``NULL`` - before decrementing its reference count. + before releasing the reference. - It is a good idea to use this macro whenever decrementing the reference - count of an object that might be traversed during garbage collection. + It is a good idea to use this macro whenever releasing a reference + to an object that might be traversed during garbage collection. .. c:function:: void Py_IncRef(PyObject *o) - Increment the reference count for object *o*. A function version of :c:func:`Py_XINCREF`. + Indicate taking a new :term:`strong reference` to object *o*. + A function version of :c:func:`Py_XINCREF`. It can be used for runtime dynamic embedding of Python. .. c:function:: void Py_DecRef(PyObject *o) - Decrement the reference count for object *o*. A function version of :c:func:`Py_XDECREF`. + Release a :term:`strong reference` to object *o*. + A function version of :c:func:`Py_XDECREF`. It can be used for runtime dynamic embedding of Python. diff --git a/Doc/c-api/sys.rst b/Doc/c-api/sys.rst index 34028d6c87e97b..a811eb17b328ea 100644 --- a/Doc/c-api/sys.rst +++ b/Doc/c-api/sys.rst @@ -8,8 +8,9 @@ Operating System Utilities .. c:function:: PyObject* PyOS_FSPath(PyObject *path) Return the file system representation for *path*. If the object is a - :class:`str` or :class:`bytes` object, then its reference count is - incremented. If the object implements the :class:`os.PathLike` interface, + :class:`str` or :class:`bytes` object, then a new + :term:`strong reference` is returned. + If the object implements the :class:`os.PathLike` interface, then :meth:`~os.PathLike.__fspath__` is returned as long as it is a :class:`str` or :class:`bytes` object. Otherwise :exc:`TypeError` is raised and ``NULL`` is returned. diff --git a/Doc/c-api/typeobj.rst b/Doc/c-api/typeobj.rst index 11f6034b9e330f..40cbcba63d69a6 100644 --- a/Doc/c-api/typeobj.rst +++ b/Doc/c-api/typeobj.rst @@ -688,7 +688,8 @@ and :c:data:`PyType_Type` effectively act as defaults.) } Finally, if the type is heap allocated (:c:macro:`Py_TPFLAGS_HEAPTYPE`), the - deallocator should decrement the reference count for its type object after + deallocator should release the owned reference to its type object + (via :c:func:`Py_DECREF`) after calling the type deallocator. In order to avoid dangling pointers, the recommended way to achieve this is: @@ -1397,9 +1398,10 @@ and :c:data:`PyType_Type` effectively act as defaults.) } The :c:func:`Py_CLEAR` macro should be used, because clearing references is - delicate: the reference to the contained object must not be decremented until + delicate: the reference to the contained object must not be released + (via :c:func:`Py_DECREF`) until after the pointer to the contained object is set to ``NULL``. This is because - decrementing the reference count may cause the contained object to become trash, + releasing the reference may cause the contained object to become trash, triggering a chain of reclamation activity that may include invoking arbitrary Python code (due to finalizers, or weakref callbacks, associated with the contained object). If it's possible for such code to reference *self* again, @@ -1477,7 +1479,7 @@ and :c:data:`PyType_Type` effectively act as defaults.) they may be C ints or floats). The third argument specifies the requested operation, as for :c:func:`PyObject_RichCompare`. - The return value's reference count is properly incremented. + The returned value is a new :term:`strong reference`. On error, sets an exception and returns ``NULL`` from the function. diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst index 8b4252f08dd4cd..3d6b4c49fa9c83 100644 --- a/Doc/c-api/unicode.rst +++ b/Doc/c-api/unicode.rst @@ -576,7 +576,7 @@ APIs: Copy an instance of a Unicode subtype to a new true Unicode object if necessary. If *obj* is already a true Unicode object (not a subtype), - return the reference with incremented refcount. + return a new :term:`strong reference` to the object. Objects other than Unicode or its subtypes will cause a :exc:`TypeError`. @@ -1537,11 +1537,11 @@ They all return ``NULL`` or ``-1`` if an exception occurs. Intern the argument *\*string* in place. The argument must be the address of a pointer variable pointing to a Python Unicode string object. If there is an existing interned string that is the same as *\*string*, it sets *\*string* to - it (decrementing the reference count of the old string object and incrementing - the reference count of the interned string object), otherwise it leaves - *\*string* alone and interns it (incrementing its reference count). - (Clarification: even though there is a lot of talk about reference counts, think - of this function as reference-count-neutral; you own the object after the call + it (releasing the reference to the old string object and creating a new + :term:`strong reference` to the interned string object), otherwise it leaves + *\*string* alone and interns it (creating a new :term:`strong reference`). + (Clarification: even though there is a lot of talk about references, think + of this function as reference-neutral; you own the object after the call if and only if you owned it before the call.) diff --git a/Doc/glossary.rst b/Doc/glossary.rst index 5c0f0f15217a07..81599477fc9534 100644 --- a/Doc/glossary.rst +++ b/Doc/glossary.rst @@ -168,8 +168,9 @@ Glossary :class:`str` objects. borrowed reference - In Python's C API, a borrowed reference is a reference to an object. - It does not modify the object reference count. It becomes a dangling + In Python's C API, a borrowed reference is a reference to an object, + where the code using the object does not own the reference. + It becomes a dangling pointer if the object is destroyed. For example, a garbage collection can remove the last :term:`strong reference` to the object and so destroy it. @@ -1131,8 +1132,10 @@ Glossary strong reference In Python's C API, a strong reference is a reference to an object - which increments the object's reference count when it is created and - decrements the object's reference count when it is deleted. + which is owned by the code holding the reference. The strong + reference is taken by calling :c:func:`Py_INCREF` when the + reference is created and released with :c:func:`Py_DECREF` + when the reference is deleted. The :c:func:`Py_NewRef` function can be used to create a strong reference to an object. Usually, the :c:func:`Py_DECREF` function must be called on diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst index b03860603c2862..641be05a16fb3c 100644 --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -759,6 +759,9 @@ always available. higher than you might expect, because it includes the (temporary) reference as an argument to :func:`getrefcount`. + Note that the returned value may not actually reflect how many + references to the object are actually held. Consequently, do not rely + on the returned value to be accurate, other than a value of 0 or 1. .. function:: getrecursionlimit()