Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-75459: Doc: C API: Improve object life cycle documentation #125962

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

rhansen
Copy link
Contributor

@rhansen rhansen commented Oct 25, 2024

  • Add "cyclic isolate" to the glossary.
  • Add a new "Object Life Cycle" page.
    • Illustrate the order of life cycle functions.
    • Document PyObject_CallFinalizer and PyObject_CallFinalizerFromDealloc.
  • PyObject_Init does not call tp_init.
  • PyObject_New:
    • also initializes the memory
    • does not call tp_alloc, tp_new, or tp_init
    • should not be used for GC-enabled objects
    • memory must be freed by PyObject_Free
  • PyObject_GC_New memory must be freed by PyObject_GC_Del.
  • Warn that garbage collector functions can be called from any thread.
  • tp_finalize and tp_clear:
    • Only called when there's a cyclic isolate.
    • Only one object in the cyclic isolate is finalized/cleared at a time.
    • Clearly warn that they might not be called.
    • They can optionally be manually called from tp_dealloc (via PyObject_CallFinalizerFromDealloc in the case of tp_finalize).
  • tp_finalize:
    • Reference object.__del__.
    • The finalizer can resurrect the object.
    • Suggest PyErr_GetRaisedException and PyErr_SetRaisedException instead of the deprecated PyErr_Fetch and PyErr_Restore functions.
    • Add links to PyErr_GetRaisedException and PyErr_SetRaisedException.
    • Suggest using PyErr_WriteUnraisable if an exception is raised during finalization.
    • Rename the example function from local_finalize to foo_finalize for consistency with the tp_dealloc documentation and as a hint that the name isn't special.
    • Minor wording and sylistic tweaks.
    • Warn that tp_finalize can be called during shutdown.

📚 Documentation preview 📚: https://cpython-previews--125962.org.readthedocs.build/

@rhansen
Copy link
Contributor Author

rhansen commented Oct 25, 2024

I'm not familiar enough with CPython's internals to be super confident about these changes. I would appreciate it if a GC expert would carefully review this.

Thanks!

@hugovk
Copy link
Member

hugovk commented Oct 25, 2024

Hmm, I'm not sure if we can require graphviz for the docs.

We'd have to consider installing it on the main docs server in addition to Read the Docs, and also make sure the docs can still build without it, for downstream redistributors who might only want to build with "vanilla" Sphinx and no extra extensions. Plus other developers would need an easy way to build the docs on their machines.

cc @AA-Turner

@rhansen rhansen force-pushed the docs branch 4 times, most recently from be22691 to d804c6c Compare October 25, 2024 09:10
@rhansen
Copy link
Contributor Author

rhansen commented Oct 25, 2024

Hmm, I'm not sure if we can require graphviz for the docs.

Maybe I should just commit the generated .svg (and the input dot file so it can be revised easily). Would that be acceptable?

  * Add "cyclic isolate" to the glossary.
  * Add a new "Object Life Cycle" page.
    * Illustrate the order of life cycle functions.
    * Document `PyObject_CallFinalizer` and
      `PyObject_CallFinalizerFromDealloc`.
  * `PyObject_Init` does not call `tp_init`.
  * `PyObject_New`:
    * also initializes the memory
    * does not call `tp_alloc`, `tp_new`, or `tp_init`
    * should not be used for GC-enabled objects
    * memory must be freed by `PyObject_Free`
  * `PyObject_GC_New` memory must be freed by `PyObject_GC_Del`.
  * Warn that garbage collector functions can be called from any
    thread.
  * `tp_finalize` and `tp_clear`:
    * Only called when there's a cyclic isolate.
    * Only one object in the cyclic isolate is finalized/cleared at a
      time.
    * Clearly warn that they might not be called.
    * They can optionally be manually called from `tp_dealloc` (via
      `PyObject_CallFinalizerFromDealloc` in the case of
      `tp_finalize`).
  * `tp_finalize`:
    * Reference `object.__del__`.
    * The finalizer can resurrect the object.
    * Suggest `PyErr_GetRaisedException` and
      `PyErr_SetRaisedException` instead of the deprecated
      `PyErr_Fetch` and `PyErr_Restore` functions.
    * Add links to `PyErr_GetRaisedException` and
      `PyErr_SetRaisedException`.
    * Suggest using `PyErr_WriteUnraisable` if an exception is raised
      during finalization.
    * Rename the example function from `local_finalize` to
      `foo_finalize` for consistency with the `tp_dealloc`
      documentation and as a hint that the name isn't special.
    * Minor wording and sylistic tweaks.
    * Warn that `tp_finalize` can be called during shutdown.
@rhansen
Copy link
Contributor Author

rhansen commented Oct 25, 2024

I committed the generated .svg so that the substance of this PR can be reviewed while we figure out if it is acceptable to add graphviz as a dependency. (Note that sphinx.ext.graphviz is a built-in extension, so enabling it doesn't add any new sphinx dependencies.)

@AA-Turner
Copy link
Member

I think that requiring graphviz should be fine -- Debian, Fedora, Gentoo, and OpenSUSE all package it. As Richard notes, it's a built-in extension, so should be fine from the "Vanilla" perspective.

I would want to include a NEWS entry to say that graphviz is now required to build the docs, though.

A

@AA-Turner AA-Turner added the docs Documentation in the Doc dir label Oct 25, 2024
@ZeroIntensity
Copy link
Member

My main concern with documenting nitty-gritty details of the lifecycle is that we're technically documenting implementation details, which are subject to change (and we've been bad at updating these kind of things from version-to-version in past). I suggest the SVG go into the InternalDocs folder instead.

It's also worth noting here that tp_finalize isn't 100% related to garbage collection, it's supposed to be used over tp_dealloc if complicated things are being done upon finalization, even for non-GC types. And while we're here, I think it would be a good idea to document the cases that tp_clear should exist for a tracked type.

@rhansen
Copy link
Contributor Author

rhansen commented Oct 26, 2024

My main concern with documenting nitty-gritty details of the lifecycle is that we're technically documenting implementation details, which are subject to change (and we've been bad at updating these kind of things from version-to-version in past). I suggest the SVG go into the InternalDocs folder instead.

I don't want to document any implementation details here, so I'm happy to remove what isn't necessary. It's hard to tell what is and isn't necessary because the end of an object's life is especially fraught with peril. I think that it is better to err on the side of over-documenting this topic than under-documenting.

I wrote this PR because there were several things that I needed to know that the existing documentation didn't make clear:

  • The order the tp_* functions might be called (to know what invariants are possible).
  • Which threads might execute the functions (for locking correctness).
  • Details about when a function is called (or not) that are necessary for locking correctness. (e.g., if multiple objects in the same cyclic isolate are never finalized concurrently then a lock-free design might be possible)
  • Approximately how often a tp_* function might be called: maybe never, exactly once, at most once, at least once, very frequently, etc.
  • Which other objects might be in an inconsistent state.

It's also worth noting here that tp_finalize isn't 100% related to garbage collection, it's supposed to be used over tp_dealloc if complicated things are being done upon finalization, even for non-GC types.

If I understand correctly, tp_finalize is never called for non-GC types unless the class designer calls it from tp_dealloc. In that case tp_finalize is just like any other helper function that might be called from tp_dealloc. (Maybe this is only true for static types and not heap types? I don't fully understand the difference.)

And while we're here, I think it would be a good idea to document the cases that tp_clear should exist for a tracked type.

I thought that was already sufficiently explained, even before this PR. Can you explain what you think is lacking?

@ZeroIntensity
Copy link
Member

First, thanks for doing this!

I think that it is better to err on the side of over-documenting this topic than under-documenting.

I don't, that limits our ability to modify the lifecycle in the future (especially because there's no good way to deprecate things here). I'll point this out when doing a more in-depth review though, I don't see anything particularly bad right now.

If I understand correctly, tp_finalize is never called for non-GC types unless the class designer calls it from tp_dealloc.

You're right, it's not, but I don't think we should limit ourselves to that in the future. It might be possible someday to automatically do this for untracked types as well. We should just document that all types, even GC, require PyObject_CallFinalizerFromDealloc in the destructor if they want tp_finalize to get eventually called--we can note that it could happen automatically, though.

I thought that was already sufficiently explained, even before this PR. Can you explain what you think is lacking?

Basically, it's not documented which types need to have a tp_clear, because not all GC types have it. I'm not even sure which cases require it. I think it's only needed if the type can have a direct reference cycle to itself? (As in, running its finalizer will try to Py_DECREF itself.)

Also, I don't think it should be documented that tp_clear is related to tp_dealloc by making an "optional call", that's sort of incidental. They tend to do the same thing, and the destructor can utilize the clear function for convenience, but they're for different purposes.


A few other notes:

  • It's fine to document what the specific allocators (e.g. PyObject_GC_New and PyObject_GC_Del) do, but we should point users to using tp_alloc and tp_free instead.
  • That said, I don't see the need to mention that functions like PyObject_New don't call tp_init. Those are strictly allocators, and currently documented as such.
  • This isn't exhaustive--static objects and some immortal objects don't follow this lifecycle (single-phase modules don't either, I think). Other objects might not follow this in the future either.
  • If I read "called from any thread" in the docs, I would be worried about holding the GIL. Maybe mention that while it can be called from any thread, it will still hold the GIL.

@rhansen
Copy link
Contributor Author

rhansen commented Nov 7, 2024

I did a major rewrite to hopefully address all of the feedback (thanks for reviewing!). Please take another look.

@ZeroIntensity ZeroIntensity self-requested a review November 8, 2024 12:33
@encukou
Copy link
Member

encukou commented Nov 8, 2024

Thank you very much for taking this on, @rhansen. I often wanted this documented better, to answer questions like:

  • Which functions can be called more than once?
  • Which can be skipped/bypassed?
  • Which should be called by Python itself, and which by the user?
  • Which functions can “see” a partially constructed/finalized object?
  • What's guaranteed, and what's an accident of the implementation that might change?

As you know, the API grew organically -- into quite a maze.

I don't think anyone has all of this in their head, so I think we need a two-stage review process here:

  • First, decide on the “shape” of the docs -- what kind of notes go here, what tooling/markup to use.
  • Then, do thorough fact-checking.

I'm in stage 1 now: what I write here might not necessarily be correct, I'll need to re-check it all later.


Maybe I should just commit the generated .svg (and the input dot file so it can be revised easily). Would that be acceptable?

I'd prefer that. Even if Graphviz is easy to install (and it not always is), it's an additional obstacle, unless you already have it.
And if the graph is checked in, we get to see it in diffs. We want that! (To check intended changes, but also to prevent a future Graphviz version, or an OS upgrade that uninstalls a font, making the graph unreadable.)

I didn't use .. image:: because that breaks the clickable parts of the image.

IMO, we can do without the clickable parts, especially since the explanation below should have the same links. Would you be OK with leaving this, as a HTML-specific enhancement, to a later PR?


tp_finalize should get a note that the interpreter never calls it more than once.
Then, PyObject_CallFinalizer* might get a note that this is why you should call them, rather than tp_finalize directly. (Correspondingly: if your tp_finalize is idempotent, you can just call it directly.)


Here's my take on the graph.

  • IMO, it needs a “direct call” arrow type in addition to a “happens after” one.
  • The PyObject_CallFinalizerFromDealloc recommendation would IMO work better in text, with a reference.

z

(click for source)
digraph G {
   compound=true
   graph [
     fontname="svg"
     fontsize=10.0
     layout="dot"
     ranksep=0.25
   ]
   node [
     fontname="Monospace"
     fontsize=10.0
   ]
   edge [
     fontsize=10.0
   ]

   subgraph cluster_type_call {
      label="Type call"
      shape=box
      {
        rank="same"
        "tp_new" [href="typeobj.html#c.PyTypeObject.tp_new"]
        "tp_alloc" [href="typeobj.html#c.PyTypeObject.tp_alloc"]
        "tp_new" -> "tp_alloc" [label="  calls  " style=bold arrowhead=open]
      }
      "tp_init" [href="typeobj.html#c.PyTypeObject.tp_init"]
      "tp_new" -> "tp_init" [label="  (tp_init can be  \n  bypassed \n or repeated) "]
   }
   "reachable" [fontname="Times-Italic" shape=box label=" object \n is in use "]
   {
     rank="same"
     "gc" [fontname="Times-Italic" shape=box label=" cyclic isolate \n detection "]
     "tp_traverse" [href="typeobj.html#c.PyTypeObject.tp_traverse"]
     "gc" -> "tp_traverse" [label="  calls  " style=bold arrowhead=open]
   }
   "is_finalized" [label=" tp_finalize \n called \n before? " shape=diamond fontname="Times-Italic" ]
   "tp_finalize" [href="typeobj.html#c.PyTypeObject.tp_finalize"]
   "reachable" -> "gc"
   "gc" -> "reachable" //[label=" still reachable "]
   "gc" -> "is_finalized" [label=" unreachable "]
   "is_finalized" -> "tp_finalize" [label="no"]
   "is_finalized" -> "tp_clear" [label="yes"]
   "tp_clear" [href="typeobj.html#c.PyTypeObject.tp_clear"]
   "tp_dealloc" [href="typeobj.html#c.PyTypeObject.tp_dealloc"]
   "tp_free" [href="typeobj.html#c.PyTypeObject.tp_free"]

   "tp_init" -> "reachable" [ltail=cluster_type_call]
   "tp_finalize" -> "tp_clear"
   "tp_finalize" -> "reachable" [label=" resurrected "]
   "tp_dealloc" -> "tp_finalize" [
     style=dashed
     arrowhead=open
     href="lifecycle.html#c.PyObject_CallFinalizerFromDealloc"
     label="recommended call\n(see note)"
   ]
   "tp_clear" -> "tp_dealloc"
   "garbage" [fontname="Times-Italic" shape=box label="uncollectable\n(gc.garbage)"]
   "tp_clear" -> "garbage"
   "tp_clear" -> "reachable" [label=" resurrected "]
   "tp_finalize" -> "tp_dealloc"
   "tp_dealloc" -> "tp_free" [label="  calls  " style=bold arrowhead=open]
   "reachable" -> "tp_dealloc" [label=" unused \n (refcount \n dropped \n to 0) "]
}

@rhansen rhansen marked this pull request as draft November 10, 2024 23:25
@rhansen
Copy link
Contributor Author

rhansen commented Nov 20, 2024

Thank you very much for taking this on, @rhansen.

Thank you for the thoughtful review @encukou!

Maybe I should just commit the generated .svg (and the input dot file so it can be revised easily). Would that be acceptable?

I'd prefer that. Even if Graphviz is easy to install (and it not always is), it's an additional obstacle, unless you already have it. And if the graph is checked in, we get to see it in diffs. We want that! (To check intended changes, but also to prevent a future Graphviz version, or an OS upgrade that uninstalls a font, making the graph unreadable.)

👍

I didn't use .. image:: because that breaks the clickable parts of the image.

IMO, we can do without the clickable parts, especially since the explanation below should have the same links.
Would you be OK with leaving this, as a HTML-specific enhancement, to a later PR?

I actually kinda like the current accessibility solution—it reuses the "Explanation" text below the image as the overall image description (at least that's the intention), and hovering over each component in the image shows alt text specific to that component. I'm not sure how to recreate that if we switch to .. image::.

Is the reason you want to switch to .. image:: because it's simpler and easier to review for correctness? Or have you noticed a problem with the current .. raw:: html?

@rhansen rhansen marked this pull request as ready for review November 20, 2024 08:43
@encukou
Copy link
Member

encukou commented Nov 20, 2024

Thank you! The new graph looks great at first glance. I'll get to a more thorough review this or next week.

have you noticed a problem with the current .. raw:: html?

The main issue is that it's HTML-only, so in PDF or EPUB it won't appear at all.

@ZeroIntensity
Copy link
Member

ZeroIntensity commented Nov 20, 2024

I'll do a review sometime today :)

Edit: Still working on the review, this is quite a big PR

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great PR! I have a lot of comments, though many of them are up for debate--if you have a different idea or don't want to change something, I'm happy to yield :)

@rhansen
Copy link
Contributor Author

rhansen commented Nov 27, 2024

Thank you so much for the thoughtful and thorough review, @ZeroIntensity! I'm a firm believer in code reviews, so I appreciate you taking the time to carefully go over everything.

I'm going to mark this as draft until I have completed another pass to address your feedback. I'll mark it as ready to review once I'm satisfied that it is ready for another look. In the meantime, feel free to add more comments if you think of anything.

@rhansen rhansen marked this pull request as ready for review December 1, 2024 10:20
@ZeroIntensity
Copy link
Member

I'll start the second pass today, but it will definitely take a few days for me to finish the whole review.

Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay.

I've not been able to go through the entire PR, but I'm sending the comments I already wrote. I hope they make sense.

I'll make another pass later.

Comment on lines +1 to +41
#life_events_graph {
/*
* Unfortunately these colors don't seem to be exposed in any of the theme's
* variables, so they are manually copied here.
*/
--svg-light-fgcolor: black;
--svg-light-bgcolor: white;
--svg-dark-fgcolor: rgba(255, 255, 255, 0.87);
--svg-dark-bgcolor: #222;
--svg-fgcolor: var(--svg-light-fgcolor);
--svg-bgcolor: var(--svg-light-bgcolor);
}
@media (prefers-color-scheme: dark) {
#life_events_graph {
--svg-fgcolor: var(--svg-dark-fgcolor);
--svg-bgcolor: var(--svg-dark-bgcolor);
}
}
@media (prefers-color-scheme: light) {
#life_events_graph {
--svg-fgcolor: var(--svg-light-fgcolor);
--svg-bgcolor: var(--svg-light-bgcolor);
}
}
:root:has(#pydoctheme_dark_css[media="not all"]) #life_events_graph {
--svg-fgcolor: var(--svg-light-fgcolor);
--svg-bgcolor: var(--svg-light-bgcolor);
}
:root:has(#pydoctheme_dark_css[media="all"]) #life_events_graph {
--svg-fgcolor: var(--svg-dark-fgcolor);
--svg-bgcolor: var(--svg-dark-bgcolor);
}
#life_events_graph [stroke="black"] {
stroke: var(--svg-fgcolor);
}
#life_events_graph :is(text, [fill="black"]) {
fill: var(--svg-fgcolor);
}
#life_events_graph :is([fill="white"]) {
fill: var(--svg-bgcolor);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified using currentcolor & transparent:

Suggested change
#life_events_graph {
/*
* Unfortunately these colors don't seem to be exposed in any of the theme's
* variables, so they are manually copied here.
*/
--svg-light-fgcolor: black;
--svg-light-bgcolor: white;
--svg-dark-fgcolor: rgba(255, 255, 255, 0.87);
--svg-dark-bgcolor: #222;
--svg-fgcolor: var(--svg-light-fgcolor);
--svg-bgcolor: var(--svg-light-bgcolor);
}
@media (prefers-color-scheme: dark) {
#life_events_graph {
--svg-fgcolor: var(--svg-dark-fgcolor);
--svg-bgcolor: var(--svg-dark-bgcolor);
}
}
@media (prefers-color-scheme: light) {
#life_events_graph {
--svg-fgcolor: var(--svg-light-fgcolor);
--svg-bgcolor: var(--svg-light-bgcolor);
}
}
:root:has(#pydoctheme_dark_css[media="not all"]) #life_events_graph {
--svg-fgcolor: var(--svg-light-fgcolor);
--svg-bgcolor: var(--svg-light-bgcolor);
}
:root:has(#pydoctheme_dark_css[media="all"]) #life_events_graph {
--svg-fgcolor: var(--svg-dark-fgcolor);
--svg-bgcolor: var(--svg-dark-bgcolor);
}
#life_events_graph [stroke="black"] {
stroke: var(--svg-fgcolor);
}
#life_events_graph :is(text, [fill="black"]) {
fill: var(--svg-fgcolor);
}
#life_events_graph :is([fill="white"]) {
fill: var(--svg-bgcolor);
}
#life_events_graph {
--svg-fgcolor: currentcolor;
--svg-bgcolor: transparent;
}
#life_events_graph a {
color: inherit;
}
#life_events_graph [stroke="black"] {
stroke: var(--svg-fgcolor);
}
#life_events_graph text,
#life_events_graph [fill="black"] {
fill: var(--svg-fgcolor);
}
#life_events_graph [fill="white"] {
fill: var(--svg-bgcolor);
}
#life_events_graph [fill="none"] {
/* On links, setting fill will make the entire shape clickable */
fill: var(--svg-bgcolor);
}

Comment on lines +61 to +62
labeltooltip="start to tp_new: type call"
tooltip="start to tp_new: type call"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove labeltooltip and tooltip in the whole diagram. They don't add any information, but they do make non-link text underlined on mouse hover.

Suggested change
labeltooltip="start to tp_new: type call"
tooltip="start to tp_new: type call"

slot). Specifically, this function does **not** call the object's
:meth:`!__init__` method.

.. warning::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer downgrading these to note. These functions are not dangerous.
IMO, from the name it would be more reasonable to think that this calls tp_new or tp_init (thus perhaps setting up some important invariants) than to assume it zeroes memory.

Suggested change
.. warning::
.. note::

(And similar for all the warnings in this PR.)

object.
#. :c:member:`~PyTypeObject.tp_init` initializes the newly created object.
:c:member:`!tp_init` can be called again to re-initialize an object, if
desired.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tp_new can be skipped:

Suggested change
desired.
desired. The :c:member:`!tp_init` call can also be skipped entirely,
for example by Python code calling :py:meth:`~object.__new__`.

This also needs an arrow in the diagram.

desired.

* After :c:member:`!tp_init` completes, the object is ready to use.
* After the last reference to an object is removed:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* After the last reference to an object is removed:
* Some time after the last reference to an object is removed:

Comment on lines +105 to +108
*resurrected*, preventing its pending destruction. (Only
:c:member:`!tp_finalize` is allowed to resurrect an object;
:c:member:`~PyTypeObject.tp_clear` and
:c:member:`~PyTypeObject.tp_dealloc` cannot.) Resurrecting an object may
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would make the text contradictory: if tp_finalize may resurrect an object but tp_dealloc may not, then tp_dealloc may call PyObject_CallFinalizerFromDealloc (which calls tp_finalize).

AFAIK, any of these may resurrect the object.

cyclic isolate
A subgroup of one or more objects that reference each other in a reference
cycle, but are not referenced by objects outside the group. The goal of
the garbage collector is to identify these groups and break the reference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
the garbage collector is to identify these groups and break the reference
the :term:`cyclic garbage collector <garbage collection>` is to identify these groups and break the reference

Comment on lines +115 to +121
* :c:member:`~PyTypeObject.tp_dealloc` can optionally call
:c:member:`~PyTypeObject.tp_finalize` via
:c:func:`PyObject_CallFinalizerFromDealloc` if it wishes to reuse that
code to help with object destruction. This is recommended because it
guarantees that :c:member:`!tp_finalize` is always called before
destruction. See the :c:member:`~PyTypeObject.tp_dealloc` documentation
for example code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tp_clear documentation also suggests that tp_dealloc can call tp_clear, to avoid code duplication.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK:

For heap types (ones defined in Python code, and ones defined via PyType_From* functions), tp_dealloc defaults to a function called subtype_dealloc, which does the whole dance: calls PyObject_CallFinalizerFromDealloc and PyObject_GC_UnTrack, __dict__ and delegates to a supertype's dealloc.

For these types, we should, strongly recommend to not implement your own tp_dealloc.

tp_dealloc is what's set when a user defines __del__ in Python code. That's why it's important to call it once, and always: users expect their custom code to run!

However, it's not that important to call in on types you control, where you know users can't provide __del__ -- that is, on immutable types (Py_TPFLAGS_IMMUTABLETYPE).

Note that all static (i.e. non-heap) types are immutable.

So, the recommendation to call PyObject_CallFinalizerFromDealloc is only applicable when you're defining a heap type with custom tp_dealloc -- which should not be recommended.

If you're looking to write tp_finalize as a C function, a better recommendation would, IMO, be to instead put the clearing code in tp_clear, and call that from tp_dealloc (making sure that your tp_clear is idempotent). I see this PR already makes that recommendation.

Comment on lines +1549 to +1550
is only called when no (strong) references exist, and is thus able to
safely destroy the object itself.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact it must destroy the object (unless it was resurrected).

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for dropping the ball on my part--we're getting to a point where I'm willing to put the first checkmark on it!

I've noticed that a lot of information is repeated. It's good to be clear, but redudancy can be a bit frustrating on our end, because we need to update more places if we want to change something.

Comment on lines +19 to +22
not initialized. Despite its name, this function is unrelated to the
object's :meth:`~object.__init__` method (:c:member:`~PyTypeObject.tp_init`
slot). Specifically, this function does **not** call the object's
:meth:`!__init__` method.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably worth mentioning that you should just use tp_alloc and not touch this (say it's a "low-level routine").

:c:func:`PyObject_Init`. The caller will own the only reference to the
object (i.e. its reference count will be one).

Do not call this directly to allocate memory for an object; call the type's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, "do not" is a strong phrase. Nothing bad will happen if you do it, it's just not recommended.

@@ -136,6 +178,21 @@ rules:
Releases memory allocated to an object using :c:macro:`PyObject_GC_New` or
:c:macro:`PyObject_GC_NewVar`.

Do not call this directly to free an object's memory; call the type's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm not a fan of "do not" here again. This "do not" is not nearly the severity of the case below, in which you would get a segfault.

Comment on lines +89 to +91
:c:member:`~PyTypeObject.tp_finalize` function. Python currently does
not finalize an object when the last reference to it is deleted, but
this may change in the future.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add something like "(unless PyObject_CallFinalizerFromDealloc is used in the destructor)"

held by the object. Python currently does not clear an object in
response to the deletion of the last reference, but this may change in
the future.
#. :c:member:`~PyTypeObject.tp_dealloc` is called to destroy the object.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we want to say that it's good practice to just have tp_dealloc call into the tp_clear slot?


* The :c:member:`~PyTypeObject.tp_finalize` function is permitted to add a
reference to the object if desired. If it does, the object is
*resurrected*, preventing its pending destruction. (Only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is "resurrected" italicized?

*resurrected*, preventing its pending destruction. (Only
:c:member:`!tp_finalize` is allowed to resurrect an object;
:c:member:`~PyTypeObject.tp_clear` and
:c:member:`~PyTypeObject.tp_dealloc` cannot.) Resurrecting an object may
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, tp_dealloc can through PyObject_CallFinalizerFromDealloc.

initialization by :c:member:`~PyTypeObject.tp_init`. To construct a
fully initialized object, call *typeobj* instead. For example::

PyObject *foo = PyObject_CallNoArgs((PyObject *)&PyFoo_Type);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth using an example of a real type here, primarily because not all types are available via PySomething_Type.

Suggested change
PyObject *foo = PyObject_CallNoArgs((PyObject *)&PyFoo_Type);
PyObject *list_instance = PyObject_CallNoArgs((PyObject *)&PyList_Type);

:c:member:`~PyTypeObject.tp_traverse` to identify :term:`cyclic isolates
<cyclic isolate>`.
* When the garbage collector discovers a :term:`cyclic isolate`, it
finalizes one of the objects in the group by marking it as *finalized* and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I've already asked this, but why is this italicized?


Listed below are the stages of life of a hypothetical :term:`cyclic isolate`
that continues to exist after each member object is finalized or cleared. It
is a bug if a cyclic isolate progresses through all of these stages; it should
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"memory leak" instead of "bug" is more informative:

Suggested change
is a bug if a cyclic isolate progresses through all of these stages; it should
is a memory leak if a cyclic isolate progresses through all of these stages; it should

@ZeroIntensity
Copy link
Member

@rhansen Are you still planning on working on this? If not, I'm happy to take over.

@encukou
Copy link
Member

encukou commented Mar 11, 2025

Updating to resolve a conflict with GH-129850.

IMO, the warnings about PyObject_New[Var] not being usable with Py_TPFLAGS_HAVE_GC should have at least as strong as the ones about not initializing memory, but, I posted that out before. I didn't want to start reorganizing the PR.

@rhansen, do you have any work in progress on this?
This is stuff that can burn a person out; don't feel bad if you want to delegate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting review docs Documentation in the Doc dir skip news
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

6 participants