PERF: avoid calling .values to know the result dtype in eval() #44791
        (i.e. calling ``mgr.as_array()`` or ``df.values``).
        """
        if len(self.arrays) == 0:
            return np.dtype(float)
can this be explicitly np.float64? i never know how this will behave on windows or 32bit builds
@@ -1549,6 +1549,22 @@ def as_array(

        return arr.transpose()

    def as_array_dtype(self):
could be shared in the base class?
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.
@jorisvandenbossche can you merge master and update
@jorisvandenbossche can you address comments and rebase
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.
prob a reasonable change but closing as stale
Currently, the `pd.eval(..)` expression parser calls `.values` several times just to know the dtype of that array. Calling `.values` requires actually constructing this array (which can be costly), but we can know the dtype without constructing it.

TODO: I need to update the `as_array_dtype` method to ensure it always returns a `np.dtype` (now it can also return an extension dtype).
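
To illustrate the idea in the description, here is a minimal sketch that derives a result dtype from the column dtypes alone, without building the 2D array. The helper `result_dtype_without_values` is hypothetical and is not the PR's `as_array_dtype`. It assumes NumPy's promotion rules via `np.result_type`, which may differ from pandas' own interleaving rules in some cases (for example bool mixed with integers), and it assumes that falling back to `object` for extension dtypes is acceptable.

```python
import numpy as np
import pandas as pd


def result_dtype_without_values(df: pd.DataFrame) -> np.dtype:
    """Hypothetical helper: infer a dtype for ``df.values`` from column dtypes only."""
    if df.shape[1] == 0:
        # Mirrors the empty case in the diff, spelled explicitly as float64.
        return np.dtype(np.float64)
    if not all(isinstance(dt, np.dtype) for dt in df.dtypes):
        # Assumption: extension dtypes map to object so the result is always a np.dtype.
        return np.dtype(object)
    # NumPy promotion over the column dtypes; no 2D array is constructed here.
    return np.result_type(*df.dtypes)


df = pd.DataFrame(
    {
        "a": np.arange(3, dtype="int64"),
        "b": np.linspace(0.0, 1.0, 3),
    }
)

# Compare the inferred dtype with the one obtained by materializing the array.
print(result_dtype_without_values(df))
print(result_dtype_without_values(df) == df.values.dtype)
```

The comparison against `df.values.dtype` is only there to check the sketch; the point of the approach is that the dtype lookup itself never needs that call.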