-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Warn for string cmp with new string warning assumptions #34
Warn for string cmp with new string warning assumptions #34
Conversation
Objects/stringobject.c
Outdated
@@ -1209,8 +1209,11 @@ string_richcompare(PyStringObject *a, PyStringObject *b, int op) | |||
PyObject *result; | |||
|
|||
/* Make sure both arguments are strings. */ | |||
if (!(PyString_Check(a) && PyString_Check(b))) { | |||
if (!(PyString_Check(a) && PyString_Check(b)) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the extra spaces are intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spaces removed
Objects/stringobject.c
Outdated
result = Py_NotImplemented; | ||
if (PyErr_WarnPy3k("comparing unicode and byte strings has different semantics in 3.x: convert the second string to byte.", 1) < 0) { | ||
goto out; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
goto out
indented 1 level too many?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the condition to the existing if.
Objects/stringobject.c
Outdated
result = Py_NotImplemented; | ||
if (PyErr_WarnPy3k("comparing unicode and byte strings has different semantics in 3.x: convert the second string to byte.", 1) < 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure but won't this warning trigger if b
is anything but a string (i.e. not just bytes)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hope this test clears any doubt:
>>> "test str" == u"test unicode"
__main__:1: DeprecationWarning: comparing unicode and byte strings has different semantics in 3.x: convert the second string to byte.
False
>>> u"test str" == "test unicode"
__main__:1: DeprecationWarning: comparing unicode and byte strings has different semantics in 3.x: convert the first string to bytes.
False
>>> "test str" == "test unicode"
False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And just to check: things like "abc" == 123
and "abc" == object()
don't go through this code path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They do actually with this logic, me checks notes on google drive again, isnt this the whole point of our assumption on tracking byteness? Please correct me if I confused something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember what we decided :) But partly because the warning says "comparing unicode and byte strings" it might be odd if the user gets that message when they've compared unicode and (say) an integer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess (but it is a guess!) is that if comparing "unicode with non-{unicode, bytes}" we don't want to print anything out, because such comparisons don't have changed semantics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After doing some tests, I agree, applies to unicode case below too I guess.
Will modify the check for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excluded these other types:
>>> "abc" == 123
False
>>> "abc" == object()
False
>>> u"abc" == 123
False
>>> u"abc" == object()
False
>>> u"test str" == "test unicode"
__main__:1: DeprecationWarning: comparing unicode and byte strings has different semantics in 3.x: convert the first string to bytes.
False
>>> "test str" == u"test unicode"
__main__:1: DeprecationWarning: comparing unicode and byte strings has different semantics in 3.x: convert the second string to byte.
False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an indirect call to Unicode compare through string compare though. I followed this and the only lead I have is the unicodecmp slot. I dont want to investigate it now but I have put it to my todo list. Slots are fragile and brittle, so taking this as design constraint for now until I get through some urgent things. My next string PR will hopefully handle, but there is a fix I first want to get in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See test to understand this.
0bee79a
to
21b3ff7
Compare
If this is in good shape, you can help merge so that I submit another fix PR. |
One comment (#34 (comment)) and then we're probably ready to go. |
Please squash. |
39ad149
to
29ae1a9
Compare
Done |
bb5d72c
No description provided.