-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-99593: Add tests for Unicode C API (part 2) #99868
gh-99593: Add tests for Unicode C API (part 2) #99868
Conversation
Add tests for lower-level functions.
Lib/test/test_capi/test_unicode.py
Outdated
@@ -16,6 +16,286 @@ class Str(str): | |||
|
|||
|
|||
class CAPITest(unittest.TestCase): | |||
# TODO: Test the following function: | |||
# | |||
# PyUnicode_ClearFreeList |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyUnicode_ClearFreeList was removed in Python 3.9!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, these tests were written when it was here.
self.assertEqual(new(0, maxchar), '') | ||
self.assertEqual(new(5, maxchar), chr(maxchar)*5) | ||
self.assertEqual(new(0, 0x110000), '') | ||
self.assertRaises(SystemError, new, 5, 0x110000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this error should become a ValueError, but it can be changed outside this PR, since you seem to want to backport these news tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that SystemError is better exception type here. It is a misuse of the C API, you cannot get this error from Python code.
if (!result) { | ||
return NULL; | ||
} | ||
if (size > 0 && maxchar <= 0x10ffff && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds dangerous to return a string to the "Python space" if characters are not initialized when maxchar is greated than 0x10ffff. Can you remove maxchar <= 0x10ffff
condition? PyUnicode_Fill() must fail if the fill character is too big (greater than 0x10ffff).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a test for PyUnicode_New()
, not for PyUnicode_Fill()
. We should see exceptions raised by PyUnicode_New()
, not PyUnicode_Fill()
.
It never happens, because PyUnicode_New()
returns NULL if size > 0 and maxchar > 0x10ffff. If it will not return NULL, it is better to get a malformed string than get an exception raised by PyUnicode_Fill()
and think that it was raised by PyUnicode_New()
.
} | ||
|
||
result = PyUnicode_WriteChar(to_copy, index, (Py_UCS4)character); | ||
if (result == -1 && PyErr_Occurred()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, -1 means an error. You shouldn't have to check if an exception was raised or not. Or it should become: assert(PyErr_Occurred()).
Same remark for other test wrapper like the one for PyUnicode_Resize().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If an exception was raised, but result is not -1, then what? The code will return NULL, and we will never know that something wrong happened with PyUnicode_WriteChar()
.
I do not use the C assert() in these tests, because I want to make tests working even in non-debug build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not use the C assert() in these tests, because I want to make tests working even in non-debug build.
C assert() are enabled on release (no-debug) builds: see the following code in Modules/_testcapi/parts.h:
// Always enable assertions
#undef NDEBUG
Many _testcapi tests are only implemented with assert()
.
Modules/_testcapi/unicode.c
Outdated
@@ -186,57 +536,126 @@ unicode_asucs4(PyObject *self, PyObject *args) | |||
buffer[str_len] = 0xffffU; | |||
|
|||
if (!PyUnicode_AsUCS4(unicode, buffer, buf_len, copy_null)) { | |||
PyMem_Free(buffer); | |||
PyMem_FREE(buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyMem_FREE is a deprecated alias to PyMem_Free(). I would expect the opposite change, replace PyMem_FREE with PyMem_Free :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted. The code in my branch was written a long time ago, when PyMem_FREE and PyMem_Free were different things. I just copied it over the current code.
self.assertRaises(SystemError, append, 'abc', NULL) | ||
# TODO: Test PyUnicode_Append() with modifiable unicode | ||
# and with NULL as the address. | ||
# TODO: Check reference counts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's done automatically by Refleaks buildbots, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. But since this C API does unusual things with reference counts, it would be better to test it explicitly, if possible.
Lib/test/test_capi/test_unicode.py
Outdated
if SIZEOF_WCHAR_T == 2: | ||
if sys.byteorder == 'little': | ||
encoding = 'utf-16le' | ||
elif sys.byteorder == 'little': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition looks wrong. Maybe just use else:
? I don't think that Python supports other endianness. Same a few lines below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I though that explicit check can be better, but made an error in it.
Thank you for your review Victor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for my addressing my review.
} | ||
|
||
result = PyUnicode_WriteChar(to_copy, index, (Py_UCS4)character); | ||
if (result == -1 && PyErr_Occurred()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not use the C assert() in these tests, because I want to make tests working even in non-debug build.
C assert() are enabled on release (no-debug) builds: see the following code in Modules/_testcapi/parts.h:
// Always enable assertions
#undef NDEBUG
Many _testcapi tests are only implemented with assert()
.
@serhiy-storchaka Victor approved this, ready to merge? (I removed the 3.10 backport label, it's now security only) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The CI test "DO-NOT-MERGE / unresolved review" was blocked on "Waiting for status to be reported". I tried to update the PR on main to see if it does unblock the PR. |
Thanks @serhiy-storchaka for the PR, and @vstinner for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11. |
Sorry, @serhiy-storchaka and @vstinner, I could not cleanly backport this to |
Merged, thanks @serhiy-storchaka. @serhiy-storchaka: If you consider that this change should be backported to Python 3.11, go ahead. But the automated backport failed for an unknown reason. |
* main: (61 commits) pythongh-64595: Argument Clinic: Touch source file if any output file changed (python#104152) pythongh-64631: Test exception messages in cloned Argument Clinic funcs (python#104167) pythongh-68395: Avoid naming conflicts by mangling variable names in Argument Clinic (python#104065) pythongh-64658: Expand Argument Clinic return converter docs (python#104175) pythonGH-103092: port `_asyncio` freelist to module state (python#104196) pythongh-104051: fix crash in test_xxtestfuzz with -We (python#104052) pythongh-104190: fix ubsan crash (python#104191) pythongh-104106: Add gcc fallback of mkfifoat/mknodat for macOS (pythongh-104129) pythonGH-104142: Fix _Py_RefcntAdd to respect immortality (pythonGH-104143) pythongh-104112: link from cached_property docs to method-caching FAQ (python#104113) pythongh-68968: Correcting message display issue with assertEqual (python#103937) pythonGH-103899: Provide a hint when accidentally calling a module (pythonGH-103900) pythongh-103963: fix 'make regen-opcode' in out-of-tree builds (python#104177) pythongh-102500: Add PEP 688 and 698 to the 3.12 release highlights (python#104174) pythonGH-81079: Add case_sensitive argument to `pathlib.Path.glob()` (pythonGH-102710) pythongh-91896: Deprecate collections.abc.ByteString (python#102096) pythongh-99593: Add tests for Unicode C API (part 2) (python#99868) pythongh-102500: Document PEP 688 (python#102571) pythongh-102500: Implement PEP 688 (python#102521) pythongh-96534: socketmodule: support FreeBSD divert(4) socket (python#96536) ...
Thanks @serhiy-storchaka for the PR, and @vstinner for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12. |
Sorry, @serhiy-storchaka and @vstinner, I could not cleanly backport this to |
Add tests for lower-level functions.