ENH: Explode multiple columns of DataFrame #28465
explode multiple columns at same time
If you pass a list of column names to .explode(), all of those columns are now exploded together, as long as the lists in those columns have the same length within each record.
ENH: DataFrame.explode() allow for multiple columns
explode() can now also take a list of columns and explode them all, provided that for every record in the DataFrame the elements of the exploded columns have the same length.
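To make the described behavior concrete, here is a minimal sketch in plain Python over a list of dict records. It is not the code from this PR: `explode_records` is a hypothetical helper, and the choice to raise `ValueError` on mismatched lengths is an assumption for illustration.

```python
def explode_records(records, columns):
    """Explode several list-valued columns together, one output row per list position.

    Hypothetical helper for illustration only; not the pandas implementation.
    """
    out = []
    for row in records:
        # All exploded columns must hold lists of the same length in this row.
        lengths = {len(row[col]) for col in columns}
        if len(lengths) != 1:
            raise ValueError("exploded columns must have matching lengths in every row")
        (n,) = lengths
        for i in range(n):
            new_row = dict(row)  # non-exploded columns are repeated unchanged
            for col in columns:
                new_row[col] = row[col][i]
            out.append(new_row)
    return out


records = [
    {"id": 1, "a": [1, 2], "b": ["x", "y"]},
    {"id": 2, "a": [3], "b": ["z"]},
]
exploded = explode_records(records, ["a", "b"])
# Row 1 becomes two rows, (a=1, b="x") and (a=2, b="y"); row 2 becomes one row.
```

Note that in this sketch a row whose lists are empty produces no output rows, whereas single-column DataFrame.explode() emits a NaN row for an empty list.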
ENH: DataFrame.explode() multiple columns
Hello @stahl085! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2019-09-16 16:48:53 UTC
always add tests first
Just a heads up - this was part of the original implementation in #27267 so you can check there for inspiration on tests and implementation. The big blocker there though was how to handle duplicate values, i.e. whether we should generate a cartesian product or not. Do you know how other similar tools would handle that?
Thanks for the info, I like the #27267 implementation much better! I haven't seen this implemented elsewhere, but to me it seems unnatural for this to return a cartesian product. Would it make sense to include that as an optional argument?
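For reference, the two behaviors under discussion can be contrasted on a single row in plain Python. This is only a sketch: `explode_row` and its `cartesian` flag are hypothetical names, not an existing or proposed pandas parameter.

```python
from itertools import product


def explode_row(row, columns, cartesian=False):
    """Explode list-valued columns of one row, element-wise or as a cartesian product.

    Hypothetical helper; the `cartesian` flag is an assumption for illustration.
    """
    values = [row[col] for col in columns]
    if cartesian:
        # Every combination of elements across the columns.
        combos = product(*values)
    else:
        # Pair elements by position; lengths must agree.
        if len({len(v) for v in values}) > 1:
            raise ValueError("exploded columns must have matching lengths")
        combos = zip(*values)
    return [{**row, **dict(zip(columns, combo))} for combo in combos]


row = {"id": 1, "a": [1, 2], "b": ["x", "y"]}
zipped = explode_row(row, ["a", "b"])                   # 2 rows: (1, "x"), (2, "y")
crossed = explode_row(row, ["a", "b"], cartesian=True)  # 4 rows: every (a, b) pair
```

The element-wise form keeps the row count equal to the list length, while the cartesian form multiplies the lengths together, which is why the default matters.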
this would need a number of tests; the implementation will be very non-performant, so it needs updating.
Now .explode() can take a list of column names and explode them all at the same time, provided that the elements across those columns have the same length in every row.
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff