-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix column cleaning and general cleanup #39
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updated tests. I had a couple of questions but those shouldn't block merge.
@@ -120,15 +96,28 @@ def get_df_variable_name( | |||
""" | |||
Returns the variable name of the DataFrame object. | |||
""" | |||
logger.debug("looking for matching variables for dataframe") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious. What is the use case for this utility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was previously for the blueprintjs callout to the user after a resample request that returned instructions to pass the filtered subset to a new variable, like:
new_df = my_awesome_df.query( < pandas query string >, engine="python")
But it's going to be passed in the metadata to the frontend to let DEX handle displaying the query syntax dynamically when the filters change, instead of just on (backend) resample requests. I'd like to have this handled in the dataframe caching branch for easier access.
s = geometry.handle_geometry_series(s) | ||
|
||
s = datatypes.handle_dict_series(s) | ||
s = datatypes.handle_sequence_series(s) | ||
s = datatypes.handle_unk_type_series(s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the type unknown for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly as a catch-all for if we run into objects that aren't JSON serializable (custom classes/objects) that we haven't handled in an explicit way earlier, to avoid breaking the pandas build_json_schema()
call. Definitely open to suggestions here!
None
,np.nan
,pd.NA
)known issue:
pd.Period
/pd.Interval
) since pandas can't parse a stringified list, so we might need to keep track of some intermediate normalized set of dtypes instead of the "original" dtypes