Restrict Pandas merge suffixes param type to list/tuple to avoid interchange in right and left suffix order #34208
Thanks @PuneethaPai for the PR. Can you explain the changes? Why not just validate in
Hi @simonjayhawkins,
Did you mean static checking tools for Python like pylint, pyflakes, etc.?
I think both list and tuple need a length check. I haven't done length checking explicitly; I am relying on Python to throw the error. Example:
In [16]: a, b = (1,2,3)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-16-22772806e702> in <module>
----> 1 a, b = (1,2,3)
ValueError: too many values to unpack (expected 2)
In [17]: a, b = [1, 2, 3]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-17-c702cf699c19> in <module>
----> 1 a, b = [1, 2, 3]
ValueError: too many values to unpack (expected 2)
In [18]: a, b = [1]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-18-605f28859c6e> in <module>
----> 1 a, b = [1]
ValueError: not enough values to unpack (expected 2, got 1)
I have added tests for these error messages, that's all.
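The tracebacks above show that plain unpacking already rejects a wrong length, but it does not reject an unordered container such as a set. A minimal sketch of an explicit check follows, assuming a hypothetical helper named `check_suffixes`; it is not the code in this PR or in pandas, only an illustration of combining a type check with the unpacking-based length check.

```python
def check_suffixes(suffixes):
    """Hypothetical helper (not pandas API): return (left, right) suffixes."""
    # A set has no defined order and a dict unpacks to its keys, so only
    # ordered sequences are accepted in this sketch.
    if not isinstance(suffixes, (list, tuple)):
        raise TypeError(
            f"suffixes should be a tuple or list, got {type(suffixes).__name__}"
        )
    # Unpacking raises ValueError when the length is not exactly 2,
    # which is the behaviour shown in the IPython session above.
    lsuffix, rsuffix = suffixes
    return lsuffix, rsuffix
```

With this sketch, a set fails with TypeError before any unpacking happens, while a list or tuple of the wrong length still fails with the built-in ValueError.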
the docs clearly state tuple, so I would be ok with changing the tests. need to wait and see what others think.
again docs clearly state tuple, so could be regarded as a user error, but ok with validating parameters and raising exception to avoid unexpected results.
sgtm
it seems logical to me to minimise the changes and validate prior to the code removed, L670 in pandas/core/reshape/merge.py. The other changes do reduce the number of parameters passed around, but is that cleanup required here? My preference is to keep PRs atomic where possible. Again, wait and see what others think.
thinking mypy (and end users using mypy)
sgtm.
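To illustrate the mypy remark above: if the parameter is annotated as a two-element tuple, a static checker can flag a set or a list at the call site before the code runs. The sketch below uses a hypothetical function `rename_overlap` and a hypothetical alias `Suffixes`; neither is pandas API, and the function ignores join keys for simplicity.

```python
from typing import List, Optional, Tuple

# Hypothetical alias: exactly two entries, each a string or None.
Suffixes = Tuple[Optional[str], Optional[str]]


def rename_overlap(
    left: List[str], right: List[str], suffixes: Suffixes = ("_x", "_y")
) -> Tuple[List[str], List[str]]:
    """Append the left/right suffix to column names present on both sides."""
    lsuffix, rsuffix = suffixes
    overlap = set(left) & set(right)
    # A None suffix is treated as "leave the name unchanged" in this sketch.
    new_left = [c + (lsuffix or "") if c in overlap else c for c in left]
    new_right = [c + (rsuffix or "") if c in overlap else c for c in right]
    return new_left, new_right
```

A call such as `rename_overlap(cols_a, cols_b, suffixes={"_l", "_r"})` would be reported by mypy as an incompatible argument type, which is the kind of early feedback the annotation is meant to give end users.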
Hello @PuneethaPai! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-06-06 04:03:24 UTC
@simonjayhawkins or @jreback,
flake8 as you can see above asks me to add white space after But black doesn't like this.
black --diff pandas/tests/reshape/merge/test_merge.py
--- pandas/tests/reshape/merge/test_merge.py 2020-05-20 03:05:31 +0000
+++ pandas/tests/reshape/merge/test_merge.py 2020-05-20 03:14:29.050606 +0000
@@ -2054,11 +2054,11 @@
tm.assert_frame_equal(result, expected)
@pytest.mark.parametrize(
"col1, col2, suffixes",
- [("a", "a", (None, None)), ("a", "a", ("", None)), (0, 0, (None, "")), ],
+ [("a", "a", (None, None)), ("a", "a", ("", None)), (0, 0, (None, "")),],
)
def test_merge_suffix_error(col1, col2, suffixes):
# issue: 24782
a = pd.DataFrame({col1: [1, 2, 3]})
b = pd.DataFrame({col2: [3, 4, 5]})
reformatted pandas/tests/reshape/merge/test_merge.py
All done! ✨ 🍰 ✨
1 file reformatted.
As you can see, it will delete the white space. Can either of you suggest a fix for this?
Thanks @PuneethaPai for working on this. generally lgtm.
There is a CI failure in the doc build.
In doc\source\user_guide\merging.rst,
result = pd.merge(left, right, on='k', suffixes=['_l', '_r'])
needs to be changed as well.
Thanks @simonjayhawkins. I wasn't able to debug the issue as I couldn't run that CI stage locally. Thanks for identifying the bug.
@PuneethaPai I think we are going to need a note in the Backwards incompatible API changes
Co-authored-by: Simon Hawkins <simonjayhawkins@gmail.com>
Hi, @simonjayhawkins and @jreback Thanks
thanks @PuneethaPai nice!
This is breaking GeoPandas (where we are using list and not tuple).
That's not fully correct. The In general, I am certainly fine with restricting the set of accepted types (e.g. to avoid the confusing issue with sets, the original report), but this is also a change that could easily be done with a deprecation warning.
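A deprecation path of the kind suggested here could look roughly like the sketch below. It assumes a hypothetical helper `resolve_suffixes`, and the warning text is illustrative; it is not the change that was merged into pandas. Non-tuple inputs keep working for now but emit a FutureWarning, so downstream projects get notice before any hard error is introduced.

```python
import warnings


def resolve_suffixes(suffixes):
    """Hypothetical helper (not pandas API): warn on non-tuple suffixes."""
    if not isinstance(suffixes, tuple):
        # Keep accepting the old input for now, but tell callers to migrate.
        warnings.warn(
            f"Passing 'suffixes' as a {type(suffixes).__name__} is deprecated; "
            "pass a tuple instead.",
            FutureWarning,
            stacklevel=2,
        )
    # The length is still validated by unpacking, as elsewhere in this thread.
    lsuffix, rsuffix = suffixes
    return lsuffix, rsuffix
```

A stricter variant could warn only for unordered containers such as sets and accept lists silently; which inputs to warn on is a design choice left open in this discussion.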
Closes pandas-dev#34741, while retaining the spirit of pandas-dev#34208.
set can interchange right and left suffix order #33740
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff