Enable casting to IamDataFrame multiple times #396

Merged

danielhuppmann
Member

@danielhuppmann danielhuppmann commented Jun 14, 2020

Please confirm that this PR has done the following:

  • Tests Added
  • Documentation Added
  • Description in RELEASE_NOTES.md Added

Description of PR

I recently noticed that casting an IamDataFrame again, i.e., IamDataFrame(IamDataFrame(<file>)), raises a non-intuitive error message.

This PR implements a simple approach to circumvent this problem: all attributes of the first and second casting of the IamDataFrame instance point to the same objects. This follows the logic for pandas.DataFrames (as far as I understand it), where performing an operation on one instance also affects the second (see the new test to show how this works in pyam).

An alternative approach may be to do a full copy during the double-casting.
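
The difference between the two approaches can be sketched with a toy class. This is a minimal illustration only: `Frame`, its `rows` attribute, and the `deep` flag are hypothetical names made up for this example and are not pyam's actual implementation or API.

```python
import copy


class Frame:
    """Toy stand-in for a data container; NOT pyam's IamDataFrame."""

    def __init__(self, data, deep=False):
        if isinstance(data, Frame):
            # Re-casting an existing instance: reuse its attributes
            # (or deep-copy them) instead of parsing it as raw input.
            source = copy.deepcopy(data.__dict__) if deep else data.__dict__
            for name, value in source.items():
                setattr(self, name, value)
            return
        # First casting: build the attributes from the raw rows.
        self.rows = list(data)


first = Frame([("model_a", "scen_a", 1.0)])
shared = Frame(first)                   # attributes point to the same objects
independent = Frame(first, deep=True)   # alternative: full copy

# An in-place change on the first instance is visible through `shared`
# but not through `independent`.
first.rows.append(("model_a", "scen_b", 2.0))
```

With shared attributes, `shared.rows` and `first.rows` are the same list, so both see the appended row; the deep copy keeps its original single row.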

Use case

I wrote a script recently that cast an object to an IamDataFrame (see here) before performing a few checks and operations - this failed when the object was already an IamDataFrame. That function has been extended to first check before casting, but this may be a problem that other users run into.
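
The "check before casting" workaround mentioned above might look roughly like the following sketch. `Frame` and `ensure_frame` are hypothetical names used for illustration; the toy constructor simply fails on its own type, mimicking the situation before this PR.

```python
class Frame:
    """Toy stand-in for a data container; NOT pyam's IamDataFrame."""

    def __init__(self, data):
        # Raises TypeError if `data` is itself a Frame,
        # because a Frame is not iterable.
        self.rows = list(data)


def ensure_frame(obj):
    """Return `obj` unchanged if it is already a Frame, else cast it."""
    return obj if isinstance(obj, Frame) else Frame(obj)


frame = ensure_frame([1, 2, 3])   # cast from raw data
same = ensure_frame(frame)        # already a Frame: returned as-is
```

Every caller has to remember such a guard, which is why handling the case inside the constructor is more convenient for users.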

@danielhuppmann danielhuppmann changed the title from "Enable casting to IamDataFrame mutliple times" to "Enable casting to IamDataFrame multiple times" Jun 14, 2020
@coveralls

Coverage Status

Coverage decreased (-0.2%) to 90.888% when pulling 98df565 on danielhuppmann:feature/init_from_iamdf into 808294b on IAMconsortium:master.

@danielhuppmann danielhuppmann force-pushed the feature/init_from_iamdf branch from 98df565 to 3482a93 Compare June 14, 2020 18:48
Collaborator

@znicholls znicholls left a comment

Looks good to me. It might be worth clarifying this in the docs too. It’s the right behaviour I think but this ability to change two things at once is also one of the most surprising features so might be worth discussing a bit more.

@Rlamboll
Collaborator

I think this does what it initially set out to do, and am happy to approve it. However the one use-case that I would understand for this is 'refreshing' databases. I don't know if you're aware (@znicholls has described this as a bug before but I assumed it was intentional) but changing the data doesn't update the values extracted by calls like "df.scenarios()", so I often call df = DataFrame(df.data) to ensure everything is consistent after modifying data directly. To me it would be most intuitive if these had the same effect regarding consistency.

@danielhuppmann
Member Author

> However the one use-case that I would understand for this is 'refreshing' databases. I don't know if you're aware (@znicholls has described this as a bug before but I assumed it was intentional) but changing the data doesn't update the values extracted by calls like "df.scenarios()", so I often call df = DataFrame(df.data) to ensure everything is consistent after modifying data directly. To me it would be most intuitive if these had the same effect regarding consistency.

I'm not quite sure what you mean by "refreshing" databases - do you mean an IamDataFrame? Also, "everything is consistent" - this should be the case anyway if you use the proper functions like rename(), convert_unit(), append().

df.scenarios() returns a list of scenarios at the time when you call the function - it is not intended to be updated when df is changed...

Maybe bring this into a new issue if it should be discussed more thoroughly?

About the use case, I added a section to the PR description.

@Rlamboll
Collaborator

> this should be the case anyway if you use the proper functions like rename(), convert_unit(), append().

I don't; I operate directly on the .data and then make it consistent afterwards. I don't think it's necessarily a problem, just something to be aware of. I will start a "bug report" for clarity anyway.

@danielhuppmann
Member Author

@znicholls, I added a test and an additional paragraph in the notes on the expected behavior. Please say if that is sufficiently clear!

Collaborator

@znicholls znicholls left a comment

Looks good, couple of minor things I'd tweak

@danielhuppmann
Member Author

@gidden, do you mind taking a look?

@danielhuppmann danielhuppmann force-pushed the feature/init_from_iamdf branch from 88288dd to 9b4c904 Compare June 16, 2020 20:21
@codecov

codecov bot commented Jun 29, 2020

Codecov Report

Merging #396 into master will increase coverage by 0.92%.
The diff coverage is 95.23%.

@@            Coverage Diff             @@
##           master     #396      +/-   ##
==========================================
+ Coverage   92.49%   93.42%   +0.92%     
==========================================
  Files          35       35              
  Lines        4051     4046       -5     
==========================================
+ Hits         3747     3780      +33     
+ Misses        304      266      -38     

| Impacted Files | Coverage Δ |
|---|---|
| setup.py | 0.00% <ø> (ø) |
| tests/test_tutorials.py | 96.42% <0.00%> (+45.81%) ⬆️ |
| pyam/core.py | 92.37% <100.00%> (+0.08%) ⬆️ |
| tests/test_core.py | 99.65% <100.00%> (+<0.01%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0a25b29...e135a8c. Read the comment docs.

@danielhuppmann danielhuppmann merged commit 7ff834f into IAMconsortium:master Jun 29, 2020
@danielhuppmann danielhuppmann deleted the feature/init_from_iamdf branch June 29, 2020 17:21