
Concatenation of scalar cubes #4720

Closed

valeriupredoi opened this issue Apr 28, 2022 · 19 comments

Comments

@valeriupredoi

Hey guys, I have the following workflow in mind, originally pointed out to me by @ledm:

Problem: concatenate N scalar cubes by simply stacking their data

  • I have 2 (or N) cubes
  • I am getting the means of the data for each cube, along axis=0 (time means)
  • the result is 2 (or N) scalar cubes (after using iris.analysis.MEAN)
  • I want my resulting two (or N) scalar cubes concatenated into a single cube, storing the two (or N) data points (their means) in an array that becomes the new cube's data object

Currently, iris lets me compute the means, but offers no option to have the results concatenated and stored in a single cube. POC code below:

POC code

import iris.analysis
import iris.coords
import iris.cube
import iris.util
import numpy as np
from cf_units import Unit

c1 = iris.cube.Cube(np.arange(0, 5), var_name='co2', units='J')
c1.add_dim_coord(
    iris.coords.DimCoord(
        np.arange(0., 5., 1.),
        standard_name='time',
        units=Unit('days since 1950-01-01 00:00:00', calendar='360_day'),
    ),
    0,
)

c2 = iris.cube.Cube(np.arange(0, 5), var_name='co2', units='J')
c2.add_dim_coord(
    iris.coords.DimCoord(
        np.arange(0., 5., 1.),
        standard_name='time',
        units=Unit('days since 1950-01-01 00:00:00', calendar='360_day'),
    ),
    0,
)

# collapse each cube over time -> two scalar cubes holding the time means
c1 = c1.collapsed('time', iris.analysis.MEAN)
c2 = c2.collapsed('time', iris.analysis.MEAN)

cube_list = iris.cube.CubeList([c1, c2])

# promote each scalar time coord to a length-1 leading dimension so that
# concatenation has an axis to join along
new_cube_list = []
for cube in cube_list:
    cube.attributes = {}
    cube = iris.util.new_axis(cube, scalar_coord="time")
    new_cube_list.append(cube)

cube_mean = iris.cube.CubeList(new_cube_list).concatenate_cube()
print(cube_mean)

This doesn't work, failing with the usual (and rather nondescript) error:

Traceback (most recent call last):
  File "/home/users/valeriu/lee_mean.py", line 39, in <module>
    cube_mean = iris.cube.CubeList(new_cube_list).concatenate_cube()
  File "/home/users/valeriu/miniconda3-June2021/envs/esmvaltool-latest/lib/python3.10/site-packages/iris/cube.py", line 565, in concatenate_cube
    raise iris.exceptions.ConcatenateError(msgs)
iris.exceptions.ConcatenateError: failed to concatenate into a single cube.
  An unexpected problem prevented concatenation.
  Expected only a single cube, found 2.

The issue here is that, even if I promote a scalar coordinate to DimCoord in

cube = iris.util.new_axis(cube, scalar_coord="time")

the (single-point) coords are identical, so iris is incapable of concatenating; it's equally incapable even if I don't promote that time coord. Unless I fully shift one cube's time axis so the coords don't overlap, I can't concatenate. We have fixed concatenation of cubes that overlap in time in ESMValCore, so it'd be worth implementing that straight into iris. But I digress...
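
For reference, here's a minimal sketch of that shifting workaround (which is exactly what I'd rather not have to do; it continues from the POC above, and assumes the collapsed time coord's points and bounds can simply be offset):

# shift c2's collapsed time coord so the two scalar times no longer coincide
tcoord = c2.coord('time')
tcoord.points = tcoord.points + 5.0  # move the single point clear of c1's
tcoord.bounds = tcoord.bounds + 5.0  # keep the cell bounds consistent

# now the new_axis + concatenate_cube route from the POC succeeds
shifted = [iris.util.new_axis(c, scalar_coord='time') for c in (c1, c2)]
print(iris.cube.CubeList(shifted).concatenate_cube())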

What I want

I want concatenate_cube() to work and give me a single cube in this case. For that, I'd like the following:

  • a descriptive error message explaining why the cubes can't be concatenated into a single one
  • said error message should offer clues as to what I can do to avoid the issue, e.g. an option such as a kwarg scalar_cubes=True - at the end of the day, one doesn't care about an axis if a cube is scalar, it's just a stacking of floats in an array that becomes the cube's data (see the sketch just below this list)
  • the above should work without me having to promote any scalar coords to DimCoords and then shift my time axis so the coords don't overlap
  • the above should also work even if the cubes' metadata differs, since a multi-model cube like this is inherently assembled from cubes with different metadata attributes and values
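
To be concrete, this is roughly the behaviour I'm after, done by hand with numpy (a minimal sketch; coordinates and attributes are deliberately ignored, and stack_scalar_cubes is just an illustrative name, not an existing iris function):

import iris.cube
import numpy as np


def stack_scalar_cubes(cubes):
    # stack the scalar cubes' 0-d data arrays into one 1-d array and wrap it
    # in a fresh cube; the leading dimension is left anonymous on purpose
    data = np.stack([cube.data for cube in cubes])
    template = cubes[0]
    return iris.cube.Cube(data, var_name=template.var_name, units=template.units)


# e.g. with the collapsed c1 and c2 from the POC above:
# stack_scalar_cubes([c1, c2]).data -> array([2., 2.])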

How's this sound to you? Cheers muchly 🍺

@zklaus

zklaus commented Apr 28, 2022

Have you tried merging?

@valeriupredoi
Author

alternatively, you could generalize cube.collapsed(axis, operation) to CubeList.collapsed(axis, operation) that performs two things: collapsing and concatenation into a single cube ❓
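
Something like this, just to sketch the idea (collapsed_cubelist and the cube_index coordinate are made-up names, not existing iris API; it tags each collapsed cube so that merge can tell them apart):

import iris.analysis
import iris.coords
import iris.cube


def collapsed_cubelist(cubes, coord, aggregator):
    collapsed = iris.cube.CubeList()
    for i, cube in enumerate(cubes):
        scalar_cube = cube.collapsed(coord, aggregator)
        # distinguishing scalar coordinate so merge_cube() can stack the results
        scalar_cube.add_aux_coord(iris.coords.AuxCoord(i, long_name="cube_index"))
        collapsed.append(scalar_cube)
    return collapsed.merge_cube()


# e.g. collapsed_cubelist([c1, c2], 'time', iris.analysis.MEAN)
# with the original (uncollapsed) c1 and c2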

@zklaus

zklaus commented Apr 28, 2022

See also the general explanation of merge vs. concatenate in the user guide.

@valeriupredoi
Author

I have tried that too (good point, I should have mentioned it) - it still doesn't work - it complains about cube duplication:

iris.exceptions.DuplicateDataError: failed to merge into a single cube.
  Duplicate 'co2' cube, with scalar coordinates

@ledm

ledm commented Apr 28, 2022

(I apologise profusely for this. I naively thought I was being silly and that there would be a simple answer to my question - not a 3 page bug report.)

@valeriupredoi
Author

See also the general explanation of merge vs. concatenate in the user guide.

cheers Klaus, I am aware of that, but I want a working solution for something that is currently not permitted due to obvious rules for regular cubes, but should be allowed for scalar cubes. Me and my corner cases 😁

@zklaus

zklaus commented Apr 28, 2022

No worries, @ledm. @valeriupredoi, great, we agree then that this is definitely about merge and not concatenate, but that there might be a bug/shortcoming in the merge routine, right?

@zklaus

zklaus commented Apr 28, 2022

Ah, I see the problem now. It is that both cubes have the same time. Hence they have no distinguishing scalar coordinate.

If the goal is to combine two cubes that represent consecutive timespans, use a different time for the second cube, for example np.arange(6., 11.) (the trailing , 1. step argument isn't really doing anything here, since 1 is the default).

If the goal is to combine two cubes that represent the same timespan, they should be differentiated by something different, perhaps a model name?

Either way, in my tests both concatenate and merge then work: concatenate after using the new_axis call, merge without it.
Full MWE:

import iris.analysis
import iris.coords
import iris.cube
import iris.util
import numpy as np
from cf_units import Unit

c1 = iris.cube.Cube(np.arange(0, 5), var_name='co2', units='J')
c1.add_dim_coord(
    iris.coords.DimCoord(
        np.arange(0., 5., 1.),
        standard_name='time',
        units=Unit('days since 1950-01-01 00:00:00', calendar='360_day'),
    ),
    0,
)

c2 = iris.cube.Cube(np.arange(0, 5), var_name='co2', units='J')
c2.add_dim_coord(
    iris.coords.DimCoord(
        np.arange(6., 11., 1.),
        standard_name='time',
        units=Unit('days since 1950-01-01 00:00:00', calendar='360_day'),
    ),
    0,
)

c1 = c1.collapsed('time', iris.analysis.MEAN)
c2 = c2.collapsed('time', iris.analysis.MEAN)

cube_list = iris.cube.CubeList([c1, c2])

print(cube_list.merge_cube())

new_cube_list = []
for cube in cube_list:
    cube.attributes = {}
    cube = iris.util.new_axis(cube, scalar_coord="time")
    new_cube_list.append(cube)

cube_mean = iris.cube.CubeList(new_cube_list).concatenate_cube()
print(cube_mean)

@ledm

ledm commented Apr 28, 2022

In the original code, this was an attempt at taking a mean of two different ensemble members for the same model and experiment covering the same time period. So while your solution works, it's not the solution to my problem.

@rcomer
Member

rcomer commented Apr 28, 2022

I was going to say the same as @zklaus 🙂. My version adds string coords to distinguish two different models. You could equally add a "realization" coordinate to distinguish members from the same ensemble.

import iris.analysis
import iris.coords
import iris.cube
import numpy as np
from cf_units import Unit

c1 = iris.cube.Cube(np.arange(0, 5), var_name='co2', units='J')
c1.add_dim_coord(
    iris.coords.DimCoord(
        np.arange(0., 5., 1.),
        standard_name='time',
        units=Unit('days since 1950-01-01 00:00:00', calendar='360_day'),
    ),
    0,
)

c2 = iris.cube.Cube(np.arange(0, 5), var_name='co2', units='J')
c2.add_dim_coord(
    iris.coords.DimCoord(
        np.arange(0., 5., 1.),
        standard_name='time',
        units=Unit('days since 1950-01-01 00:00:00', calendar='360_day'),
    ),
    0,
)

c1 = c1.collapsed('time', iris.analysis.MEAN)
c2 = c2.collapsed('time', iris.analysis.MEAN)

cube_list = iris.cube.CubeList([c1, c2])

new_cube_list = []
for cube, model in zip(cube_list, ["model1", "model2"]):
    cube.attributes = {}
    cube.add_aux_coord(iris.coords.AuxCoord(model, long_name="model_id"))
    new_cube_list.append(cube)

cube_mean = iris.cube.CubeList(new_cube_list).merge_cube()
print(cube_mean)
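
For ensemble members of the same model, a minimal variant of the loop above (used instead of it) could tag the cubes with a "realization" coordinate; realization is a CF standard name, so it merges in exactly the same way:

member_list = iris.cube.CubeList()
for i, cube in enumerate(cube_list):
    cube.attributes = {}
    # integer scalar coordinate distinguishing the ensemble members
    cube.add_aux_coord(iris.coords.AuxCoord(i, standard_name="realization"))
    member_list.append(cube)

print(member_list.merge_cube())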

@zklaus

zklaus commented Apr 28, 2022

@rcomer beat me to it.

@ledm

ledm commented Apr 28, 2022

"""
c1 = c1.collapsed('time', iris.analysis.MEAN)
c2 = c2.collapsed('time', iris.analysis.MEAN)
"""

But the arrays are no longer scalar if you take the mean of the only dimension, right?

@zklaus

zklaus commented Apr 28, 2022

They are even more scalar, aren't they? I mean, before they were vectors, now they are scalars. But that doesn't really matter at all. The important point is that there is a scalar coordinate, in this case model_id, that allows Iris to aggregate everything. The other dimensions of the cubes play no role.

@valeriupredoi
Author

aha! Good call! But this will still not work if the metadata dictionaries differ - and they usually do differ, as in Lee's case: different ensemble names, parent experiment IDs, etc.

@valeriupredoi
Author

at any rate, it'd be cool if iris did this under the hood for the user, rather than the user having to MacGyver the whole thing 😁

@zklaus

zklaus commented Apr 28, 2022

Other metadata getting in the way is a bit off-topic here, imho. It's debatable whether handling that automatically is a good idea as default behaviour, but to keep things simple, there are already unify_time_units and equalise_attributes.
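
For example (a minimal sketch, assuming iris >= 3.0, where both helpers live in iris.util; both modify the cubes in place, before the merge/concatenate step):

from iris.util import equalise_attributes, unify_time_units

cubes = iris.cube.CubeList([c1, c2])  # any of the cube pairs above
equalise_attributes(cubes)  # drop attributes that are not identical across the cubes
unify_time_units(cubes)     # convert all time coordinates to a common time unit
# ...then merge_cube() / concatenate_cube() as above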

@valeriupredoi
Author

ah, I had completely forgotten about equalise_attributes - good call, Klaus!

@rcomer
Member

rcomer commented Apr 28, 2022

There is also #4446 for the question of mismatched metadata.

@valeriupredoi
Author

cheers @rcomer - I guess closing this is OK - it'd be nice to have this concat/merge stuff be more lenient, as per the issue you link - threw a bunch of ❤️'s all over there just to show my support 😁
