ENH: change sort behavior in stack() so it's user-directed #35343
Comments
pls show an example that reproduces |
Output:
|
pls show versions as instructed |
I opened this request from quoting the line in code. I didn't get any instructions, I just copied other people's format that I saw. The version is quoted above, in the link to the source code. It's pretty clearly described how |
there are pretty clear instructions when u open an issue w/o showing versions |
Thanks @pmberkeley for the report. One of the properties of a DataFrame is implicit ordering. from https://arxiv.org/abs/2001.00888
so I would regard this as a bug rather than an enhancement. I think the OP is more detailed than necessary, maybe just a small example of stack changing the order of the index elements. |
@simonjayhawkins thanks for the info! I was surprised by the behavior, but didn't know that implicit ordering was the expectation. Do I need to change the title to "BUG" (or whatever it is supposed to be)? |
@jreback I recommend you go to the source code and try out the github functionality that lets you open an issue directly from a line of code. It provides you with zero instructions. |
I think this issue is covered by #15105, so closing as duplicate. lmk if I misunderstood something. |
@simonjayhawkins nope, that's exactly it. Thanks! |
pandas/pandas/core/reshape/reshape.py
Line 623 in bfac136
Is your feature request related to a problem?
I need a multiindex dataframe to stack in a specific order. I also need the columns to fail the this.columns.is_lexsorted() test (the duplicated column names are how I'm merging the data while stacking it, as a workaround to not being able to get a linspace result out of the rolling method). Currently, the sort_index method is causing the dataframe to reorder alphabetically, which is not the order I need it to stack in.
Describe the solution you'd like
One of the following (preferably not the last):
API breaking implications
Unsure about option 1, but the default suggested in option 2 should be fine.
Describe alternatives you've considered
This is already a workaround for the rolling method not working with non-scalar outputs. I'll be renaming the columns in the short term, but this seems hacky.
Additional context
This issue is closely related to the functionality of rolling. The stack method is being used/suggested as a workaround for the inability to use rolling to output linear results. MultiIndex is the pandas way of dealing with additional dimensionality of data; when the rolling method doesn't play nice with adding a dimension to the data set, the user then ends up trying to recreate a rolling method equivalent via stack (and other means). For stack and other methods to work well in this context, sorting behavior has to be explicit.
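Until the ordering is user-directed, one possible alternative to renaming columns is to record the desired order before stacking and reapply it afterwards. The following is only a rough sketch under assumptions that are not from the report: the frame, the labels, and the level names "row", "outer", and "inner" are made up for illustration, and the column labels are assumed to be unique (reindex does not accept duplicated labels, so this would not cover the duplicated-column case described above as-is).

```python
import pandas as pd

# Hypothetical data: the outer column groups are deliberately listed
# in non-alphabetical order ("b" before "a").
columns = pd.MultiIndex.from_tuples(
    [("b", "x"), ("b", "y"), ("a", "x"), ("a", "y")],
    names=["outer", "inner"],
)
df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    index=pd.Index(["r0", "r1"], name="row"),
    columns=columns,
)

# Record the order in which the outer labels first appear in the columns.
outer_order = list(dict.fromkeys(df.columns.get_level_values("outer")))

# Depending on the pandas version, stack() may or may not sort the stacked level.
stacked = df.stack("outer")

# Make the order explicit by reindexing against the desired (row, outer) pairs.
target = pd.MultiIndex.from_product(
    [df.index, outer_order], names=["row", "outer"]
)
restored = stacked.reindex(target)

print(restored)
```

Because the final order comes from the explicit reindex rather than from whatever stack does internally, the result should not depend on whether a given pandas version sorts during stacking.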