Add an option to skip existing intermediate variables when aggregating recursively #532

Merged
merged 35 commits into from
Jun 10, 2021
Changes from 7 commits
Commits
17b132d
implemented skip_intermediate option
pjuergens May 12, 2021
bc064d9
Updated release notes
pjuergens May 12, 2021
6a30ec6
Test added
pjuergens May 12, 2021
0ec0804
Bugfix in tests
pjuergens May 12, 2021
2b0ddaa
styleguide checked
pjuergens May 12, 2021
a7d1d4e
fix Style guide
pjuergens May 12, 2021
b4509e5
finally code style
pjuergens May 12, 2021
7b4e266
fix style guide
pjuergens May 12, 2021
f939a8e
changed implementation to deal with different scenarios
pjuergens May 14, 2021
a452f2e
added test for different scenarios
pjuergens May 14, 2021
39ea9bf
fixed test
pjuergens May 14, 2021
5a2ced2
skip intermediate variable also at highest level
pjuergens May 14, 2021
144b39d
Bugfix
pjuergens May 14, 2021
d6db93a
Bugfix test
pjuergens May 14, 2021
cbf4eeb
again bugfix test
pjuergens May 14, 2021
0a5b53d
formatting black
pjuergens May 14, 2021
0d7c11c
fix black style
pjuergens May 14, 2021
b904073
added aggregation check
pjuergens May 14, 2021
9ab914e
black style guide
pjuergens May 14, 2021
0b9c26f
changed interface of recursive aggregation
pjuergens May 19, 2021
7edbdda
fixed black style
pjuergens May 19, 2021
7a18b43
automated black style with spyder
pjuergens May 19, 2021
cc581c8
change back to stickler black style
pjuergens May 19, 2021
70b1d70
Update pyam/_aggregate.py
pjuergens May 25, 2021
d0ad181
Merge branch 'main' into intermediate-aggregate
pjuergens May 25, 2021
364fe01
Switch order to ['left', 'right'] in returned object from `compare()`
danielhuppmann May 22, 2021
66cabcb
Move internal implementation of compare to own module
danielhuppmann May 28, 2021
c7d7168
Save `_data` as pd.Series in `swap_time_for_year()`
danielhuppmann May 29, 2021
d188e00
Implement a once-through aggregate-and-validate method
danielhuppmann May 29, 2021
d43351d
Move recursive-aggregation data to conftest.py
danielhuppmann May 29, 2021
e3775bc
Add validation that recursive aggregation fails if data is inconsistent
danielhuppmann May 29, 2021
b4f65f4
Fix the test of the compare function (changed order of cols)
danielhuppmann May 23, 2021
1d386fa
Fix calling the internal `compare` function
danielhuppmann May 29, 2021
f6464a8
Merge pull request #2 from danielhuppmann/intermediate-aggregate-alt
pjuergens Jun 10, 2021
25edae0
Merge branch 'main' into intermediate-aggregate
pjuergens Jun 10, 2021
1 change: 1 addition & 0 deletions RELEASE_NOTES.md
@@ -1,5 +1,6 @@
 # Next Release

+- [#532](https://github.com/IAMconsortium/pyam/pull/532) Add an option to skip existing intermediate variables when aggregating recursively
 - [#527](https://github.com/IAMconsortium/pyam/pull/527) Add an in-dataframe basic mathematical operations `subtract`, `add`, `multiply`, `divide`
 - [#519](https://github.com/IAMconsortium/pyam/pull/519) Enable explicit `label` and fix for non-string items in plot legend

5 changes: 4 additions & 1 deletion pyam/_aggregate.py
@@ -57,7 +57,7 @@ def _aggregate(df, variable, components=None, method=np.sum):
     return _group_and_agg(_df, [], method)


-def _aggregate_recursive(df, variable):
+def _aggregate_recursive(df, variable, skip_intermediate=False):
     """Recursive aggregation along the variable tree"""

     # downselect to components of `variable`, initialize list for aggregated (new) data
@@ -69,6 +69,9 @@ def _aggregate_recursive(df, variable):
         components = compress(_df.variable, find_depth(_df.variable, level=d + 1))
         var_list = set([reduce_hierarchy(v, -1) for v in components])

+        if skip_intermediate:
+            # skip aggregating variables that already exist in dataframe _df
+            var_list = var_list - set(_df.variable)
         # a temporary dataframe allows to distinguish between full data and new data
         temp_df = _df.aggregate(variable=var_list)
         _df.append(temp_df, inplace=True)
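Note on the `skip_intermediate` branch in `_aggregate_recursive`: a minimal sketch of the set-difference step, written against plain Python strings rather than pyam objects. The helpers `depth`, `parent` and `variables_to_aggregate` are hypothetical stand-ins (they are not pyam's `find_depth` or `reduce_hierarchy`), and the variable names are made up for illustration.

```python
def depth(variable):
    """Number of `|` separators in a variable name (hypothetical helper)."""
    return variable.count("|")


def parent(variable):
    """Drop the last `|`-separated level of a variable name (hypothetical helper)."""
    return variable.rsplit("|", 1)[0]


def variables_to_aggregate(existing, level, skip_intermediate=False):
    """Return the parents at `level` that would be computed from their components."""
    components = [v for v in existing if depth(v) == level + 1]
    var_list = {parent(v) for v in components}
    if skip_intermediate:
        # same idea as the diff: do not recompute variables that already exist
        var_list = var_list - set(existing)
    return var_list


existing = [
    "Secondary Energy|Electricity|Wind",  # intermediate variable already reported
    "Secondary Energy|Electricity|Wind|Onshore",
    "Secondary Energy|Electricity|Wind|Offshore",
    "Secondary Energy|Electricity|Solar|PV",
    "Secondary Energy|Electricity|Solar|CSP",
]

without_skip = variables_to_aggregate(existing, level=2)
with_skip = variables_to_aggregate(existing, level=2, skip_intermediate=True)
```

With the flag set, only `Secondary Energy|Electricity|Solar` is left to compute at this level, because the `Wind` subtotal is already present in the data.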
15 changes: 13 additions & 2 deletions pyam/core.py
@@ -1163,7 +1163,13 @@ def normalize(self, inplace=False, **kwargs):
         return ret

     def aggregate(
-        self, variable, components=None, method="sum", recursive=False, append=False
+        self,
+        variable,
+        components=None,
+        method="sum",
+        recursive=False,
+        append=False,
+        skip_intermediate=False
     ):
         """Aggregate timeseries by components or subcategories within each region

@@ -1179,6 +1185,9 @@ def aggregate(
             Iterate recursively (bottom-up) over all subcategories of `variable`.
         append : bool, optional
             Whether to append aggregated timeseries data to this instance.
+        skip_intermediate : bool, optional
+            Skip aggregating already existing intermediate variables. Only has an
+            effect if recursive=True

         Returns
         -------
@@ -1199,7 +1208,9 @@ def aggregate(
                     "Recursive aggregation only supported with `method='sum'`!"
                 )

-            _df = IamDataFrame(_aggregate_recursive(self, variable), meta=self.meta)
+            _df = IamDataFrame(
+                _aggregate_recursive(self, variable, skip_intermediate), meta=self.meta
+            )
         else:
             _df = _aggregate(self, variable, components=components, method=method)

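Note on the new `skip_intermediate` argument of `aggregate`: the toy below sketches what the flag changes end to end, using a plain dict in place of an `IamDataFrame`. The function `aggregate_recursive`, the data, and the choice to raise on a duplicate variable are all assumptions made for illustration; the diff does not show how pyam itself reacts when an aggregated variable already exists.

```python
def aggregate_recursive(data, variable, skip_intermediate=False):
    """Sum components bottom-up towards `variable` and return only the new values.

    `data` maps `|`-separated variable names to numbers (toy stand-in for a dataframe).
    """
    prefix = variable + "|"
    full = {k: v for k, v in data.items() if k == variable or k.startswith(prefix)}
    new = {}
    top = variable.count("|")
    deepest = max(k.count("|") for k in full)

    # walk from the deepest level up to the level of `variable`
    for d in reversed(range(top, deepest)):
        components = [k for k in full if k.count("|") == d + 1]
        var_list = {k.rsplit("|", 1)[0] for k in components}
        if skip_intermediate:
            # keep reported subtotals instead of recomputing them
            var_list -= set(full)
        for name in sorted(var_list):
            if name in full:
                # assumption of this sketch: adding an existing variable is an error
                raise ValueError(f"Duplicate variable: {name}")
            total = sum(full[k] for k in components if k.rsplit("|", 1)[0] == name)
            full[name] = new[name] = total
    return new


data = {
    "Secondary Energy|Electricity|Wind": 5.0,  # reported subtotal
    "Secondary Energy|Electricity|Wind|Onshore": 3.0,
    "Secondary Energy|Electricity|Wind|Offshore": 2.0,
    "Secondary Energy|Electricity|Solar|PV": 1.0,
    "Secondary Energy|Electricity|Solar|CSP": 0.5,
}

result = aggregate_recursive(data, "Secondary Energy|Electricity", skip_intermediate=True)
```

In this sketch the reported `Wind` subtotal is reused as-is, `Solar` is computed from its components, and the top-level total is built from both. Calling the same function with `skip_intermediate=False` hits the duplicate check for `Wind`.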
29 changes: 29 additions & 0 deletions tests/test_feature_aggregate.py
@@ -160,6 +160,35 @@ def test_aggregate_recursive(time_col):
     assert_iamframe_equal(df_minimal, df)


+@pytest.mark.parametrize("time_col", (("year"), ("time")))
+def test_aggregate_skip_intermediate(time_col):
+    # use the feature `recursive=True` and `skip_intermediate=True`
+    data = (
+        RECURSIVE_DF
+        if time_col == "year"
+        else RECURSIVE_DF.rename(DTS_MAPPING, axis="columns")
+    )
+    df = IamDataFrame(data, model="model_a", scenario="scen_a", region="World")
+    df2 = df.rename(scenario={"scen_a": "scen_b"})
+    df2.data.value *= 2
+    df.append(df2, inplace=True)
+
+    # create object without variables to be aggregated, but with intermediate variables
+    v = "Secondary Energy|Electricity"
+    agg_vars = [f"{v}{i}" for i in [""]]
+    df_minimal = df.filter(variable=agg_vars, keep=False)
+
+    # return recursively aggregated data as new object
+    obs = df_minimal.aggregate(variable=v, recursive=True, skip_intermediate=True)
+    assert_iamframe_equal(obs, df.filter(variable=agg_vars))
+
+    # append to `self`
+    df_minimal.aggregate(
+        variable=v, recursive=True, append=True, skip_intermediate=True
+    )
+    assert_iamframe_equal(df_minimal, df)
+
+
 @pytest.mark.parametrize(
     "variable, append", (("Primary Energy|Coal", "foo"), (False, True))
 )