-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable casting to IamDataFrame multiple times #396
Enable casting to IamDataFrame multiple times #396
Conversation
98df565
to
3482a93
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. It might be worth clarifying this in the docs too. It’s the right behaviour I think but this ability to change two things at once is also one of the most surprising features so might be worth discussing a bit more.
I think this does what it initially set out to do, and am happy to approve it. However the one use-case that I would understand for this is 'refreshing' databases. I don't know if you're aware (@znicholls has described this as a bug before but I assumed it was intentional) but changing the data doesn't update the values extracted by calls like "df.scenarios()", so I often call df = DataFrame(df.data) to ensure everything is consistent after modifying data directly. To me it would be most intuitive if these had the same effect regarding consistency. |
I'm not quite sure what you mean with "refreshing" databases - do you mean an
Maybe bring this into a new issue if it should be discussed more thoroughly? About the use case, I added a section to the PR description. |
|
@znicholls added a test and an additional paragraph in the notes on the to-be-expected behavior. Please say if that is sufficiently clear! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, couple of minor things I'd tweak
@gidden, do you mind taking a look? |
88288dd
to
9b4c904
Compare
Codecov Report
@@ Coverage Diff @@
## master #396 +/- ##
==========================================
+ Coverage 92.49% 93.42% +0.92%
==========================================
Files 35 35
Lines 4051 4046 -5
==========================================
+ Hits 3747 3780 +33
+ Misses 304 266 -38
Continue to review full report at Codecov.
|
Please confirm that this PR has done the following:
Description of PR
I recently noticed that casting an IamDataFrame again (i.e.,
IamDataFrame(IamDataFrame(<file>))
raises a non-intuitive error message.This PR implements a simple approach to circumvent this problem, with all attributes of the first and second casting of the IamDataFrame instance pointing to the same objects. This follows the logic for pandas.DataFrames (as far as I understand it) where performing an operation on one instance has an impact also on the second (see the new test to show how this works in pyam).
An alternative approach may be to do a full copy during the double-casting.
Use case
I wrote a script recently that cast an object to an
IamDataFrame
(see here) before performing a few checks and operations - this failed when the object was already anIamDataFrame
. That function has been extended to first check before casting, but this may be a problem that other users run into.