Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ArrayManager] DataFrame constructors #39991
[ArrayManager] DataFrame constructors #39991
Changes from 11 commits
61983d8
1d0315f
ffc8314
46e73c8
3e108df
854bb17
8726d42
6e17183
8096665
aef4cc8
9c0a3d6
1eb5cb7
0992e67
936b290
54d36ab
c56ffa8
164387c
143b572
6166927
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rename typ -> manager?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this is the same keyword as I use for
arrays_to_mgr
,mgr_to_mgr
, etc, where I think thetyp
keyword makes sense (it's already clear from the name of the function that it is creating a manager).Alternatively, we could make the
init_dict
function name more explicit (and consistent with the others) and eg rename todict_to_mgr
. I am happy to do this in a separate PR.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok yeah +1 on the rename (followon ok ) and anything to make the keyword more obvious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> #40074
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could do this unconditionally. tradeoff: slightly less complex code here, perf of unecessary wrapping/unwrapping in DatetimeBlock/TimedeltaBlock._maybe_coerce_values (for now, until upcoming PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would then leave that until it is useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do this inside e.g. in ArrayManager constructor (or add a constructor on L127), prob should do that anyhow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean exactly with that?
We already use the main constructor
ArrayManager(arrays, [index, columns])
on L127 because that constructor already works from "arrays" (in constrast to the BlockManager, which needs to consolidate arrays etc, so therefore there is a helpercreate_blockmanager_from_arrays
, but for ArrayManager this is exactly the__init__
already)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the suggestion is to move the call on L115 to just before the
return ArrayManager
on L127. i think thatd be a small improvement, not a big dealThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intentionally put it here because, in theory, I think this wrapping only needs to be done with the default of
verify_integrity=True
(so you have a way to skip it)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean on line L127 this should call a method inside internal and not directly calling the manager. Then you can handle the special cases. on L110 inside that function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you want me to write this function?
(IMO this just splits the handling of
verify_integrity
to multiple places (as mentioned above, that's the reason that I put this wrapping where it is located now), but I don't care if this makes you happy)Note you are commenting on a file that already is inside internals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its L110 which needs to be handled in array_manager.py and not here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, moved inside
array_manager.py