BUG: right merge not preserve row order (#27453) #27762
Not sure how to interpret these errors. Azure is reporting a failure on
Could you add a whatsnew entry for v1.0.0?
    assert_frame_equal(expected, result)


def test_left_merge_preserves_row_order():
Is this not already accounted for in other tests?
None that I could find.
Instead of having these as separate tests can you just parametrize on how = left and right?
can you do this
also outer and inner should match left ordering
Sorry - yeah I can do this. Just started a new job so my schedule got crammed all of a sudden.
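A minimal sketch of what the parametrized test requested in this thread might look like. The frame contents and the test name `test_merge_preserves_row_order` are illustrative assumptions, not necessarily what the PR ends up with. It covers only `how="left"` and `how="right"`, and it compares column values rather than whole frames so that it does not depend on dtype details. It is expected to pass only on a pandas version that includes this fix.

```python
import pandas as pd
import pytest


@pytest.mark.parametrize("how", ["left", "right"])
def test_merge_preserves_row_order(how):
    # GH 27453: illustrative data, chosen so the two frames list keys in different orders
    left_df = pd.DataFrame({"animal": ["dog", "pig"], "max_speed": [40, 11]})
    right_df = pd.DataFrame({"animal": ["quetzal", "pig"], "max_speed": [80, 11]})

    result = left_df.merge(right_df, on=["animal", "max_speed"], how=how)

    # The frame whose row order should survive is the one that `how` names.
    preserved = right_df if how == "right" else left_df
    assert result["animal"].tolist() == preserved["animal"].tolist()
    assert result["max_speed"].tolist() == preserved["max_speed"].tolist()
```

Parametrizing on `how` keeps the setup in one place, so adding further join types later would only mean extending the parameter list and the expected-order logic.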
pandas/core/reshape/merge.py
Outdated
@@ -1276,6 +1276,11 @@ def _get_join_indexers(left_keys, right_keys, sort=False, how="inner", **kwargs)
    indexers into the left_keys, right_keys

    """
    _how = how
Hmm, this is rather difficult to grok. Could you alternatively just change `_right_outer_join` within the same module to accept a `sort` keyword? I think that would make things easier.
The sorting actually happens in `_factorize_keys` a few lines down, not downstream in `_right_outer_join`. So the left and right keys need to be swapped before that call. `_right_outer_join` just implements `_left_outer_join` with left and right swapped; no sorting happens there.
I guess what I'm saying is that sorting already happens, but it happens before a distinction is made between left and right joins. So this PR doesn't move where the sorting happens, it just moves where the distinction between left and right joins is made.
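The point above can be illustrated with a small pure-Python sketch. This is not the pandas implementation: `factorize_keys`, `left_outer_join`, and `get_join_indexers` are hypothetical stand-ins for the internals named in the thread, and sorting is left out entirely. The sketch only shows why swapping left and right before factorizing lets a right join reuse the left-join logic while following the right frame's row order.

```python
def factorize_keys(lkeys, rkeys):
    """Map keys to integer codes, numbered by first appearance in lkeys, then rkeys."""
    codes = {}
    for key in list(lkeys) + list(rkeys):
        codes.setdefault(key, len(codes))
    return [codes[k] for k in lkeys], [codes[k] for k in rkeys], len(codes)


def left_outer_join(lcodes, rcodes):
    """Return (left_indexer, right_indexer) in left row order; -1 marks no match."""
    positions = {}
    for i, code in enumerate(rcodes):
        positions.setdefault(code, []).append(i)
    lidx, ridx = [], []
    for i, code in enumerate(lcodes):
        for j in positions.get(code, [-1]):
            lidx.append(i)
            ridx.append(j)
    return lidx, ridx


def get_join_indexers(left_keys, right_keys, how="left"):
    """Hypothetical helper: only 'left' and 'right' are handled in this sketch."""
    if how == "right":
        # Swap BEFORE factorizing, so both the codes and the join follow the right keys.
        rcodes, lcodes, _ = factorize_keys(right_keys, left_keys)
        ridx, lidx = left_outer_join(rcodes, lcodes)
        return lidx, ridx
    lcodes, rcodes, _ = factorize_keys(left_keys, right_keys)
    return left_outer_join(lcodes, rcodes)


left = ["dog", "pig"]
right = ["quetzal", "pig"]
print(get_join_indexers(left, right, how="right"))  # ([-1, 1], [0, 1])
print(get_join_indexers(left, right, how="left"))   # ([0, 1], [-1, 1])
```

In the `how="right"` result the right indexer is `[0, 1]`, meaning the output rows follow the right frame's own order, and `-1` in the left indexer marks the right row that has no match on the left.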
That xfail references #7996 if it's any help. The 3.7 CI failures should be fixed if you merge master & repush.
46fc6ed to 9007839
The failing tests are unrelated to this PR.
Sorry for the delayed response as I've been on vacation but this looks to be moving in the right direction!
pandas/core/reshape/merge.py
Outdated
if how == "right":
    rkey, lkey, count = fkeys(rkey, lkey)
Is there a more appropriate variable name to use here for the left hand sides? General approach looks good but I think variable reuse can be confusing for future debugging. OK with more appropriate names even if it increases diff
3f6d119 to a97763a
I addressed all the issues that were brought up. Ready for another round of review.
implementation and tests lgtm just need some updates on whatsnew
Can you resolve the merge conflict? Also, there should only need to be one whatsnew entry.
As per @jreback's request, I have added a whatsnew entry to both the breaking API changes and the Reshaping sections.
Hello @ncernek! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-01-23 03:21:01 UTC
I think the request there was to have this show up in Backwards Incompatible API Changes with its own section, not to duplicate it. You can find examples of how to do that in prior whatsnew notes, like here:
Also looks like a few lint issues need to be addressed
Does anyone understand what the error is here? This should be good to merge.
Still need to address the comment around the whatsnew. Please take a look at #27762 (comment). The existing CI failure will probably go away on next push - we've been having intermittent timeouts the past few days that I think should be over now
I already took care of this in this commit.
@ncernek some of the CI failures look related. e.g. https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=20073
@ncernek any chance you can check CI failures and resolve conflicts?
hey, yeah I can take a look later this week.
@ncernek able to update? We're planning to release the 1.0 rc this week.
f6bf25a to 258df2e
Co-Authored-By: William Ayd <william.ayd@icloud.com>
258df2e to c238b50
Hmm I'm getting an error locally when running tests. Any ideas?
looks pretty good, can you merge master, ping on green.
doc/source/whatsnew/v1.0.0.rst
Outdated
@@ -274,6 +274,7 @@ New repr for :class:`~pandas.arrays.IntervalArray`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- :class:`pandas.arrays.IntervalArray` adopts a new ``__repr__`` in accordance with other array classes (:issue:`25022`)
- :class:`pandas.core.arrays.IntervalArray` adopts a new ``__repr__`` in accordance with other array classes (:issue:`25022`)
looks like a rebasing dupe
doc/source/whatsnew/v1.0.0.rst
Outdated
@@ -292,6 +293,32 @@ New repr for :class:`~pandas.arrays.IntervalArray`

    pd.arrays.IntervalArray.from_tuples([(0, 1), (2, 3)])

- :meth:`DataFrame.merge` now preserves right frame's row order when executing a right merge (:issue:`27453`)
make this a sub-section (need a line under the title), and put a shorter title, move what you have to the first sentence.
.. ipython:: python

    left_df = pd.DataFrame({"colors": ["blue", "red"]}, index=pd.Index([0, 1]))
show these as well, call them just left and right
doc/source/whatsnew/v1.0.0.rst
Outdated
*pandas 0.25.x*

.. ipython:: python
this needs to be a code-block
.. ipython:: python

    left_df.merge(right_df, left_index=True, right_index=True, how="right")
let this execute (don't put the results in)
doc/source/whatsnew/v1.0.0.rst
Outdated
@@ -949,6 +976,7 @@ Reshaping
- Bug in :func:`melt` where supplying mixed strings and numeric values for ``id_vars`` or ``value_vars`` would incorrectly raise a ``ValueError`` (:issue:`29718`)
- Dtypes are now preserved when transposing a ``DataFrame`` where each column is the same extension dtype (:issue:`30091`)
- Bug in :func:`merge_asof` merging on a tz-aware ``left_index`` and ``right_on`` a tz-aware column (:issue:`29864`)
- :meth:`DataFrame.merge` now preserves right frame's row order when executing a right merge (:issue:`27453`)
don't put this here, you already have a sub-section
Can you try rebuilding the C extensions?
I'm not working on this PR anymore. This dev experience is painful. I would have appreciated commits from others to help me get across the finish line a long time ago, esp. when the only issues remaining for the past few months were concerning documentation. Separately, uncertain as to why there are hundreds of test failures. These aren't caused by my contribution that I can tell. That doesn't leave me excited to resolve anything anymore. Anyone else is free to take ownership of this PR.
That's OK, thanks for letting us know :)
For a start, it looks like you're calling
Sure, will do
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff