gh-98712: Clarify "readonly bytes-like object" semantics in C arg-parsing docs #98710
@@ -34,24 +34,39 @@ These formats allow accessing an object as a contiguous chunk of memory.
 You don't have to provide raw storage for the returned unicode or bytes
 area.

-In general, when a format sets a pointer to a buffer, the buffer is
-managed by the corresponding Python object, and the buffer shares
-the lifetime of this object. You won't have to release any memory yourself.
-The only exceptions are ``es``, ``es#``, ``et`` and ``et#``.
-
-However, when a :c:type:`Py_buffer` structure gets filled, the underlying
-buffer is locked so that the caller can subsequently use the buffer even
-inside a :c:type:`Py_BEGIN_ALLOW_THREADS` block without the risk of mutable data
-being resized or destroyed. As a result, **you have to call**
-:c:func:`PyBuffer_Release` after you have finished processing the data (or
-in any early abort case).
-
 Unless otherwise stated, buffers are not NUL-terminated.

-Some formats require a read-only :term:`bytes-like object`, and set a
-pointer instead of a buffer structure. They work by checking that
-the object's :c:member:`PyBufferProcs.bf_releasebuffer` field is ``NULL``,
-which disallows mutable objects such as :class:`bytearray`.
+There are three ways strings and buffers can be converted to C:
+
+* Formats such as ``y*`` and ``s*`` fill a :c:type:`Py_buffer` structure.
+  This locks the underlying buffer so that the caller can subsequently use
+  the buffer even inside a :c:type:`Py_BEGIN_ALLOW_THREADS`
+  block without the risk of mutable data being resized or destroyed.
+  As a result, **you have to call** :c:func:`PyBuffer_Release` after you have
+  finished processing the data (or in any early abort case).
+
+* The ``es``, ``es#``, ``et`` and ``et#`` formats allocate the result buffer.
+  **You have to call** :c:func:`PyMem_Free` after you have finished
+  processing the data (or in any early abort case).
+
+* .. _c-arg-borrowed-buffer:
+
+  Other formats take a read-only :term:`bytes-like object`, such as
+  :class:`bytes` or :class:`str`, and provide a ``const char *`` pointer to
+  its buffer.
+  In this case the buffer is "borrowed": it is managed by the corresponding
+  Python object, and shares the lifetime of this object.
+  You won't have to release any memory yourself.
+
+  To ensure that the underlying buffer may be safely borrowed, the object's
+  :c:member:`PyBufferProcs.bf_releasebuffer` field must be ``NULL``.
+  This disallows common mutable objects such as :class:`bytearray`,
+  but also some read-only objects such as :class:`memoryview` of
+  :class:`bytes`.
+
+  Besides this ``bf_releasebuffer`` requirement, the functions do not verify

Functions? What functions?

The argument-parsing functions.

+  whether the input object is immutable (e.g. whether it would honor a request
+  for a writable buffer, or whether another thread can mutate the data).

 .. note::

@@ -89,7 +104,7 @@ which disallows mutable objects such as :class:`bytearray`.
    Unicode objects are converted to C strings using ``'utf-8'`` encoding.

 ``s#`` (:class:`str`, read-only :term:`bytes-like object`) [const char \*, :c:type:`Py_ssize_t`]
-   Like ``s*``, except that it doesn't accept mutable objects.
+   Like ``s*``, except that it provides a :ref:`borrowed buffer <c-arg-borrowed-buffer>`.
    The result is stored into two C variables,
    the first one a pointer to a C string, the second one its length.
    The string may contain embedded null bytes. Unicode objects are converted

@@ -108,8 +123,9 @@ which disallows mutable objects such as :class:`bytearray`.
    pointer is set to ``NULL``.

 ``y`` (read-only :term:`bytes-like object`) [const char \*]
-   This format converts a bytes-like object to a C pointer to a character
-   string; it does not accept Unicode objects. The bytes buffer must not
+   This format converts a bytes-like object to a C pointer to a
+   :ref:`borrowed <c-arg-borrowed-buffer>` character string;
+   it does not accept Unicode objects. The bytes buffer must not
    contain embedded null bytes; if it does, a :exc:`ValueError`
    exception is raised.

Are you sure about ``str``? The description below explicitly states "it does not accept Unicode objects", and it doesn't make sense to me that a buffer code would accept ``str``.

But ``s`` and ``s#`` accept ``str``.

I think the punctuation is wrong here. I would write either

    a read-only :term:`bytes-like object`, such as :class:`bytes`, or :class:`str`

(note the second comma) or

    a read-only :term:`bytes-like object` (such as :class:`bytes`) or :class:`str`

But I do not know which is correct and looks better in English.

Or maybe change the order to avoid confusion?

    :class:`str` or a read-only :term:`bytes-like object`, such as :class:`bytes`

Ah, that makes sense. I think putting ``str`` first (your last suggestion) would make this more clear.

Thanks for the catch!
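
For readers following the diff, the locking described in the first bullet of the new text can be observed from Python without writing an extension. The sketch below is only an illustration and assumes CPython's :class:`bytearray` behaviour: :class:`memoryview` stands in for a ``y*``-style caller, because creating one takes a buffer export in the same way that filling a :c:type:`Py_buffer` does. The variable names are made up for the example. It does not exercise the argument-parsing functions themselves.

```python
# Illustration only: memoryview stands in for C code that filled a Py_buffer.
data = bytearray(b"abc")
view = memoryview(data)        # takes a buffer export on the bytearray

try:
    data.append(0x64)          # a resize is refused while the export is alive
    resize_blocked = False
except BufferError:
    resize_blocked = True

view.release()                 # Python-level counterpart of PyBuffer_Release
data.append(0x64)              # succeeds once the export has been released

# A memoryview of bytes is read-only, yet it is one of the objects that the
# proposed text says the borrowed-buffer formats still reject.
view_is_readonly = memoryview(b"abc").readonly
```

The refused ``append`` is the "locked" state that makes it safe to keep using the pointer after releasing the GIL, and it is also why forgetting the release call leaves the object unable to be resized.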