Warn for string cmp with new string warning assumptions #34

nanjekyejoannah · 2023-12-12T19:08:05Z

No description provided.

ltratt · 2023-12-15T22:37:21Z

Objects/stringobject.c

@@ -1209,8 +1209,11 @@ string_richcompare(PyStringObject *a, PyStringObject *b, int op)
    PyObject *result;

    /* Make sure both arguments are strings. */
-    if (!(PyString_Check(a) && PyString_Check(b))) {
+     if (!(PyString_Check(a) && PyString_Check(b)) ) {


I don't think the extra spaces are intended?

Spaces removed

ltratt · 2023-12-15T22:37:35Z

Objects/stringobject.c

        result = Py_NotImplemented;
+        if (PyErr_WarnPy3k("comparing unicode and byte strings has different semantics in 3.x: convert the second string to byte.", 1) < 0) {
+                goto out;


goto out indented 1 level too many?

I have added the condition to the existing if.

ltratt · 2023-12-15T22:38:31Z

Objects/stringobject.c

        result = Py_NotImplemented;
+        if (PyErr_WarnPy3k("comparing unicode and byte strings has different semantics in 3.x: convert the second string to byte.", 1) < 0) {


I'm not sure but won't this warning trigger if b is anything but a string (i.e. not just bytes)?

Hope this test clears any doubt:

>>> "test str" == u"test unicode"
__main__:1: DeprecationWarning: comparing unicode and byte strings has different semantics in 3.x: convert the second string to byte.
False
>>> u"test str" == "test unicode"
__main__:1: DeprecationWarning: comparing unicode and byte strings has different semantics in 3.x: convert the first string to bytes.
False
>>> "test str" == "test unicode"
False
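
For context on the "different semantics in 3.x" wording, here is a minimal Python 3 sketch of what the warning anticipates. It is an illustration only and not part of this PR: `mixed_equal` is a hypothetical helper name. In 2.x a byte string is implicitly decoded as ASCII before being compared with a unicode string, so `"abc" == u"abc"` is True; in 3.x no implicit decoding happens and the two types never compare equal.

```python
import warnings


def mixed_equal(byte_value, text_value):
    """Compare bytes with str as Python 3 does, with no implicit decoding.

    Hypothetical helper for illustration; not part of the PR.
    """
    with warnings.catch_warnings():
        # Under `python -b` this comparison emits BytesWarning; silence it here.
        warnings.simplefilter("ignore", BytesWarning)
        return byte_value == text_value


byte_value = b"abc"
text_value = "abc"

# Same characters, different types: unequal in 3.x.
print(mixed_equal(byte_value, text_value))

# An explicit conversion restores the comparison the 2.x code relied on.
print(byte_value.decode("ascii") == text_value)
```

The explicit `decode` (or `encode` on the other side) is the kind of conversion the warning text asks for.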

And just to check: things like "abc" == 123 and "abc" == object() don't go through this code path?

They do, actually, with this logic. (Checks notes on Google Drive again.) Isn't this the whole point of our assumption on tracking byteness? Please correct me if I have confused something.

I don't remember what we decided :) But partly because the warning says "comparing unicode and byte strings" it might be odd if the user gets that message when they've compared unicode and (say) an integer?

My guess (but it is a guess!) is that if comparing "unicode with non-{unicode, bytes}" we don't want to print anything out, because such comparisons don't have changed semantics?

After doing some tests, I agree. I guess this applies to the unicode case below too.
I will modify the check for both.

Excluded these other types:

>>> "abc" == 123
False
>>> "abc" == object()
False
>>> u"abc" == 123
False
>>> u"abc" == object()
False
>>> u"test str" == "test unicode"
__main__:1: DeprecationWarning: comparing unicode and byte strings has different semantics in 3.x: convert the first string to bytes.
False
>>> "test str" == u"test unicode"
__main__:1: DeprecationWarning: comparing unicode and byte strings has different semantics in 3.x: convert the second string to byte.
False
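
The rule the thread converges on can be modelled roughly as follows. This is a Python 3 sketch under stated assumptions, not the C code in Objects/stringobject.c: `py3k_warning_for` is a hypothetical name, `bytes` stands in for the 2.x byte string (`str`), and `str` stands in for 2.x `unicode`. The two messages are copied from the transcripts above.

```python
# Messages as they appear in the test output in this thread.
SECOND_TO_BYTES = ("comparing unicode and byte strings has different "
                   "semantics in 3.x: convert the second string to byte.")
FIRST_TO_BYTES = ("comparing unicode and byte strings has different "
                  "semantics in 3.x: convert the first string to bytes.")


def py3k_warning_for(left, right):
    """Return the warning text for a comparison, or None if none applies.

    Hypothetical model: `bytes` plays the 2.x byte string and `str` plays
    2.x `unicode`. Only a byte-string/unicode pair warns.
    """
    if isinstance(left, bytes) and isinstance(right, str):
        return SECOND_TO_BYTES
    if isinstance(left, str) and isinstance(right, bytes):
        return FIRST_TO_BYTES
    # Integers, arbitrary objects, and same-type pairs produce no warning.
    return None


for pair in [(b"abc", 123), ("abc", object()), (b"test str", "test unicode")]:
    print(type(pair[0]).__name__, type(pair[1]).__name__,
          py3k_warning_for(*pair) is not None)
```

Keying the decision on both operand types, rather than on "the other operand is not a byte string", is what keeps `"abc" == 123` and `"abc" == object()` silent.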

There is an indirect call to the Unicode compare through the string compare, though. I followed this and the only lead I have is the unicodecmp slot. I don't want to investigate it now, but I have put it on my todo list. Slots are fragile and brittle, so I am taking this as a design constraint for now until I get through some urgent things. My next string PR will hopefully handle it, but there is a fix I want to get in first.

See test to understand this.

nanjekyejoannah · 2023-12-25T16:46:26Z

If this is in good shape, could you help merge it so that I can submit another fix PR?

ltratt · 2023-12-25T16:49:53Z

One comment (#34 (comment)) and then we're probably ready to go.

ltratt · 2023-12-27T18:06:30Z

Please squash.

nanjekyejoannah · 2023-12-30T02:32:10Z

Done

ltratt self-assigned this Dec 15, 2023

ltratt reviewed Dec 15, 2023

nanjekyejoannah force-pushed the warn_string_cmp_with_recent_assumptions branch from 0bee79a to 21b3ff7 Compare December 25, 2023 16:41

Warn for string cmp with new string warning assumptions

29ae1a9

nanjekyejoannah force-pushed the warn_string_cmp_with_recent_assumptions branch from 39ad149 to 29ae1a9 Compare December 30, 2023 02:31

ltratt added this pull request to the merge queue Dec 30, 2023

Merged via the queue into softdevteam:regression_fix with commit bb5d72c Dec 30, 2023
2 checks passed
