Check output consistency of `.validate` and similar #428

byersiiasa · 2020-09-23T08:13:54Z

Came across an issue with the newest version where the output of .validate(), which I think was previously a pd.DataFrame, is now a pd.Series.
Not a problem in itself, but scripts that take this output and perform operations that only work on a DataFrame will throw an error, so backward compatibility is reduced.

In our case it was .drop(..., axis=1) (i.e. drop columns), which is not possible on pd.Series

Error given is:
ValueError: No axis named 1 for object type <class 'pandas.core.series.Series'>
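
For readers who want to reproduce the error above without pyam, here is a minimal pandas-only sketch. The index levels and values are made up for illustration and are not actual `.validate()` output; the point is only that `drop(..., axis=1)` fails on a Series and works once the Series is converted with `to_frame()`.

```python
import pandas as pd

# Hypothetical stand-in for a validation result (NOT real pyam output).
index = pd.MultiIndex.from_tuples(
    [("model_a", "scen_a", 2010), ("model_a", "scen_b", 2010)],
    names=["model", "scenario", "year"],
)
result = pd.Series([1.5, 2.5], index=index, name="value")

# A Series has only axis 0, so asking for axis=1 raises a ValueError.
try:
    result.drop("value", axis=1)
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__, error)

# Converting to a one-column DataFrame makes column operations valid again.
frame = result.to_frame()
dropped = frame.drop("value", axis=1)
```

The exact wording of the error message may differ between pandas versions.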


danielhuppmann · 2020-09-25T07:20:07Z

Thanks @byersiiasa for identifying this issue!

Let me flip this back to you (also pinging @kvanderwijst @gidden @jkikstra): what would be the "natural" return type of this function?

For the aggregation features, we recently switched all returned objects to an IamDataFrame - would that make sense here as well? Also considering that there is now an info() function (see #427) which is shown by default in Jupyter notebooks...

gidden · 2020-09-25T10:41:41Z

If the return type is always guaranteed to be 1-dimensional, then I would say pd.Series in order to stay in the pandas-verse. Assumptions here based on the docstring:

Returns all scenarios which do not match the criteria and prints a log message, or returns None if all scenarios match the criteria.
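
Since the docstring says the result can be None, and the thread shows it can arrive as either a Series or a DataFrame depending on the version, downstream scripts could normalize it before use. The following is only a sketch: `as_frame` is a hypothetical helper, not part of pyam, and the fallback column name "value" is an assumption.

```python
from typing import Optional, Union

import pandas as pd


def as_frame(result: Optional[Union[pd.Series, pd.DataFrame]]) -> pd.DataFrame:
    """Hypothetical helper (not a pyam function): coerce a result to a DataFrame."""
    if result is None:
        # All scenarios matched the criteria, so there is nothing to report.
        return pd.DataFrame()
    if isinstance(result, pd.Series):
        # Keep the Series name as the column name; "value" is an assumed fallback.
        name = result.name if result.name is not None else "value"
        return result.to_frame(name=name)
    return result
```

With such a wrapper, a script can call DataFrame-only methods on the result regardless of which type came back.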

danielhuppmann · 2020-09-28T15:28:20Z

If staying in the pandas-verse is preferred, I'd prefer reverting the behavior to a DataFrame rather than a Series. From my own experience, I usually use this in a notebook context where the DataFrame is quicker to parse (visually) than a Series with a complex MultiIndex...

byersiiasa · 2020-09-28T20:00:05Z

I would probably favour either type of DataFrame over a Series - Series almost always give me headaches

jkikstra · 2020-09-29T08:06:58Z

I also prefer pandas.DataFrame.
I find myself using .reset_index() in my workflows basically every time I "happen to find" a MultiIndex coming from a pyam function.
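
As a rough illustration of that `.reset_index()` habit, here is a pandas-only sketch. The index level names and the single data row are assumptions chosen for the example, not output taken from pyam.

```python
import pandas as pd

# Hypothetical MultiIndex Series; level names and values are illustrative only.
index = pd.MultiIndex.from_tuples(
    [("model_a", "scen_a", "World", "Primary Energy", "EJ/yr", 2010)],
    names=["model", "scenario", "region", "variable", "unit", "year"],
)
series = pd.Series([500.0], index=index, name="value")

# reset_index() moves every index level into a regular column and
# returns a flat DataFrame with a default integer index.
flat = series.reset_index()
print(flat)
```

The Series name becomes the name of the data column, so giving the Series a name before calling `reset_index()` keeps the resulting column readable.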

danielhuppmann · 2020-09-29T08:09:39Z

Thanks @jkikstra - good that I implemented it that way yesterday evening... 😜 Added you as a reviewer.

Please create a new issue for any other functions or methods where a non-intuitive or impractical type is returned.

byersiiasa added the next release label Sep 23, 2020

byersiiasa assigned danielhuppmann and macflo8 Sep 23, 2020

danielhuppmann added the data back-end label and removed the next release label Sep 25, 2020

danielhuppmann unassigned macflo8 Sep 25, 2020

danielhuppmann mentioned this issue Sep 28, 2020

Fix return type of validation after data refactoring #429 (merged)

danielhuppmann closed this as completed in #429 Oct 12, 2020
