Introduces `SumOfFields` #3676

jagoosw · 2024-08-02T15:46:27Z

This PR adds a method to mask_immersed_field! for SumOfArrays. I was trying to make a model which has an auxiliary field which is a SumOfArrays but ran into the error that all of the prognostic and auxiliary fields in the HydrostaticFreeSurfaceModel get masked during the update_state! step.

(Ran into the issue here: OceanBioME/OceanBioME.jl#195)

src/ImmersedBoundaries/mask_immersed_field.jl

glwagner · 2024-08-02T16:35:49Z

How do we know that SumOfArrays always has fields vs arrays? Are they guaranteed to be on the same grid and at the same location? I don't think we built that functionality into SumOfArrays...

glwagner · 2024-08-02T16:37:28Z

It might make sense to take a step back and ask what you're trying to accomplish because I'm not quite sure this makes sense. It might work for you, but seems to rely on hidden assumptions about SumOfArrays which wasn't designed for this

jagoosw · 2024-08-02T18:16:45Z

Hmm yeah, that is true, we never make any guarantees about what is in a SumOfArrays, but I think even at this point we're trusting developers that use it to make sure that the sum is valid (i.e. on the same grid and location), and to only put fields in it where the rest of the code would expect fields right?

In the same way there is no check on what e.g. biogeochemical_auxiliary_fields` is returning so it could return just a normal array which would also break at this step.

I just rearranged it a bit so it just runs mask_immersed_field!(field, value) on each array element rather than mask_immersed_field!(field, loc, grid, value) which is maybe better because this function won't then break (and if a user puts arrays rather than fields it also won't break because it will just do the fallback for each element).

glwagner · 2024-08-02T18:49:02Z

but I think even at this point we're trusting developers that use it to make sure that the sum is valid (i.e. on the same grid and location), and to only put fields in it where the rest of the code would expect fields right?

Well no --- we don't trust developers to do that. We only use it inside a kernel and in a case we know it is used correctly. This PR does something different, it introduces "support" for SumOfArrays that implies to users that it is suitable for representing a sum of fields (not just a low-level utility for sums of arrays), but doesn't implement any other facilities that correspond to such support. We want users to be able to trust the code we write so I feel this isn't the most helpful thing to do.

It is interesting to consider putting together an abstraction that represents a sum of fields (though note we already have this through MultiaryOperation, but maybe there is a reason you would like something different?). It wouldn't be very much effort and we can indeed check that the fields all share a location and grid. Such a utility could even be used within abstract operations, as a simpler alternative to MultiaryOperation (which additionally can sum fields that have different locations, but perhaps is too complicated for your use case?)

glwagner · 2024-08-02T19:17:26Z

Also with SumOfFields we can put in proper support for other things. It can be an AbstractField and be used in operations (perhaps the biggest advantages because users will want to do diagnostics on the sum probably), we can give it a location and expose its grid, and we can support fill_halo_regions and mask_immersed_field. It's a decent idea provided that the normal way of summing fields that produces a BinaryOperation or MultiaryOperation won't work.

jagoosw · 2024-08-05T13:28:09Z

Okay, I understand your point. Since it's not part of the public API I wasn't thinking that this suggests to a user that it should be used, but more made it possible for developers to do things like what my other PR does.

What I meant about trusting users for auxiliary fields is that you can do e.g. this:

model = NonhydrostaticModel(; grid, auxiliary_fields=(; A = randn(1, 2))

and it doesn't error, and the only reason the same for a HydrostaticFreeSurfaceModel errors is from mask_immersed_field! because the mask_immersed_field!(field::Field, value=zero(eltype(field.grid))) method doesn't have a fallback defined.

So this change wouldn't affect how the model failed because putting arrays etc. into the sum would do the same thing.

But yeah, having a SumOfFields makes sense and I can try and do that. The main reason I wanted to use this vs and operation is the speed because when I checked indexing into a sum of arrays is >5x faster than into the equivalent operation (for a 3 field sum).

jagoosw · 2024-08-05T14:10:38Z

I've tried adding it now. I put it in the fields module since it is an AbstractField (and because utils are defined before fields), and it seems to work in abstract operations etc, and is as fast as SumOfArrays.

src/Fields/sum_of_fields.jl

glwagner · 2024-08-05T15:18:05Z

src/Fields/sum_of_fields.jl

+
+adapt_structure(to, sum::SumOfFields{N}) where N = SumOfFields{N}((adapt_structure(to, field) for field in sum.fields)...)
+
+summary(::SumOfFields{N, LX, LY, LZ}) where {N, LX, LY, LZ}  = "Sum of $N fields at ($LX, $LY, $LZ)"


This summary does not state the name of the type (SumOfFields). The purpose of summary and show is to give information about the object so I think it's important to state the name.

Also the summary should be more uniform with other field summaries

I suggest this

function Base.summary(field::SumOfFields{N}) where N LX, LY, LZ = location(field) prefix = string(size_summary(size(field)), " SumOfFields{$LX, $LY, $LZ} with $N fields") reduced_dims = reduced_dimensions(field) suffix = reduced_dims === () ? string(" on ", grid_name(field), " on ", summary(architecture(field))) : string(" reduced over dims = ", reduced_dims, " on ", grid_name(field), " on ", summary(architecture(field))) return string(prefix, suffix) end

This is plagiarized from summary for Fields.

There's also a nice show method for NamedTuple of Fields that could inspire a show method for SumOfFields

glwagner · 2024-08-05T15:18:53Z

src/Fields/sum_of_fields.jl

+@propagate_inbounds function getindex(s::SumOfFields{N, LX, LY, LZ, G, T, F}, i...) where {N, LX, LY, LZ, G, T, F}
+    first = getindex(SumOfFields{3, F, LX, LY, LZ, G, T}((s.fields[1], s.fields[2], s.fields[3]), s.grid), i...)
+    last = getindex(SumOfFields{N - 3, F, LX, LY, LZ, G, T}(s.fields[4:N], s.grid), i...)
+    return first + last


write this on more lines please.

as a rule, try not to nest function calls within one another. Sometimes one-argument functions can be ok. Here though, the inner call has 7 type parameters and two arguments while the outer call has two arguments.

glwagner · 2024-08-05T15:19:38Z

src/Fields/sum_of_fields.jl

+    end
+
+    return nothing
+end


All mask_immersed_field belong in the immersed boundaries module

glwagner · 2024-08-05T15:20:46Z

It might have been better to have a new PR for this because the conversation will be hard to follow, but we can continue.

We need a mission statement for SumOfFields. Why doesn't MultiaryOperation work?

glwagner · 2024-08-05T15:21:44Z

when I checked indexing into a sum of arrays is >5x faster than into the equivalent operation (for a 3 field sum).

Doesn't this suggest there is a problem with MultiaryOperation? It might be nice to document this.

glwagner · 2024-08-05T15:25:23Z

What I meant about trusting users for auxiliary fields is that you can do e.g. this: model = NonhydrostaticModel(; grid, auxiliary_fields=(; A = randn(1, 2)) and it doesn't error

Why do we want to support arrays in auxiliary_fields?

The main usage for auxiliary_fields is that they can be invoked in continuous-form forcing functions. In that usage they have to be AbstractField, otherwise we can't figure out how to interpolate them into a forcing function. auxiliary_fields isn't meant to be a place to store just any information needed for a model (what would the purpose of that be?) We can adapt auxiliary_fields but it would be good to understand why. Separately, perhaps we should check that all auxiliary_fields are AbstractField.

Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>

glwagner · 2024-08-05T16:54:03Z

src/Fields/sum_of_fields.jl

+        grid = first(fields).grid
+        loc = location(first(fields))
+
+        all(f.grid == grid for f in fields) || 


Suggested change

all(f.grid == grid for f in fields) ||

all(f.grid === grid for f in fields) ||

I think object equality is appropriate here. And below. (This will be faster because we do not have to perform numeric comparison)

glwagner · 2024-08-05T16:54:45Z

src/Fields/sum_of_fields.jl

+    end
+end
+
+adapt_structure(to, sum::SumOfFields{N}) where N = SumOfFields{N}((adapt_structure(to, field) for field in sum.fields)...)


please write this on more lines

jagoosw · 2024-08-05T17:11:27Z

So I started making the suggested changes but then decided to do more benchmarking and it appears that when I interpolate the sums into the benchmark properly there isn't a difference (i.e. https://juliaci.github.io/BenchmarkTools.jl/stable/manual/#Interpolating-values-into-benchmark-expressions):

julia> @btime sof[1, 2, 3]
  45.412 ns (1 allocation: 16 bytes)
-1.0065959327459124

julia> @btime bo[1, 2, 3]
  44.571 ns (1 allocation: 16 bytes)
-1.0065959327459124

julia> @btime fbo[1, 2, 3]
  238.209 ns (1 allocation: 16 bytes)
-1.0065959327459124

julia> @btime $(sof)[1, 2, 3]
  3.958 ns (0 allocations: 0 bytes)
-1.0065959327459124

julia> @btime $(bo)[1, 2, 3]
  2.708 ns (0 allocations: 0 bytes)
-1.0065959327459124

julia> @btime $(fbo)[1, 2, 3]
  2.666 ns (0 allocations: 0 bytes)
-1.0065959327459124

where

julia> sof = SumOfFields{3}(c[1:3]...)
200×200×200 SumOfFields{3, Center, Center, Center} on RectilinearGrid on CPU
└── grid: 200×200×200 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on CPU with 3×3×3 halo

julia> bo = sum(c[1:3])
BinaryOperation at (Center, Center, Center)
├── grid: 200×200×200 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on CPU with 3×3×3 halo
└── tree: 
    + at (Center, Center, Center)
    ├── + at (Center, Center, Center)
    │   ├── 200×200×200 Field{Center, Center, Center} on RectilinearGrid on CPU
    │   └── 200×200×200 Field{Center, Center, Center} on RectilinearGrid on CPU
    └── 200×200×200 Field{Center, Center, Center} on RectilinearGrid on CPU

julia> fbo = c[1] + c[2] + c[3]
MultiaryOperation at (Center, Center, Center)
├── grid: 200×200×200 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on CPU with 3×3×3 halo
└── tree: 
    + at (Center, Center, Center)
    ├── 200×200×200 Field{Center, Center, Center} on RectilinearGrid on CPU
    ├── 200×200×200 Field{Center, Center, Center} on RectilinearGrid on CPU
    └── 200×200×200 Field{Center, Center, Center} on RectilinearGrid on CPU

To check that the interpolation into the benchmark wasn't important I also tried benchmarking a kernel that fills an array with the sum and got these results:

julia> @btime launch!(grid.architecture, grid, :xyz, _fill_with_sum!, $(A), $(sof))
  20.066 ms (21 allocations: 15.66 KiB)

julia> @btime launch!(grid.architecture, grid, :xyz, _fill_with_sum!, $(A), $(bo))
  20.174 ms (21 allocations: 19.27 KiB)

julia> @btime launch!(grid.architecture, grid, :xyz, _fill_with_sum!, $(A), $(fbo))
  1.286 s (21 allocations: 15.66 KiB)

julia> @btime launch!(grid.architecture, grid, :xyz, _fill_with_sum!, $(A), $(fbo))
  1.269 s (21 allocations: 15.66 KiB)

julia> @btime launch!(grid.architecture, grid, :xyz, _fill_with_sum!, A, sof)
  20.074 ms (20 allocations: 13.67 KiB)

julia> @btime launch!(grid.architecture, grid, :xyz, _fill_with_sum!, A, bo)
  20.167 ms (20 allocations: 16.77 KiB)

julia> @btime launch!(grid.architecture, grid, :xyz, _fill_with_sum!, A, fbo)
  1.289 s (20 allocations: 13.67 KiB)

So in conclusion SumOfFields isn't needed (but that is a weird result with fbo).

glwagner · 2024-08-05T17:31:21Z

Ok. The case where I think SumOfFields could be needed is if one runs into parameter space issues if its embedded into a tendency kernel function on the GPU. This is a compilation issue that is very hard to fix (performance problems, on the other hand, should be fixable).

jagoosw · 2024-08-05T20:21:41Z

I'm now having the problem that if I return a BinaryOperation in biogeochemical_auxiliary_fields then it gets stuck on mask_immersed_field! again, would it be reasonable to define a fallback like mask_immersed_field!(field, args...) = nothing?

This fallback:

Oceananigans.jl/src/ImmersedBoundaries/mask_immersed_field.jl

Line 8 in bf767af

mask_immersed_field!(field, grid, loc, value) = nothing

is already there but since we get there via:

Oceananigans.jl/src/ImmersedBoundaries/mask_immersed_field.jl

Lines 9 to 10 in bf767af

    
           mask_immersed_field!(field::Field, value=zero(eltype(field.grid))) = 
        
               mask_immersed_field!(field, field.grid, location(field), value)

it fails

glwagner · 2024-08-05T20:25:19Z

Should we extend mask_immersed_field for BinaryOperation? And for MultiaryOperation? This might be useful also for post-processing

glwagner · 2024-08-05T20:26:27Z

We don't want a fallback I don't think --- the failure is intentional. We want to know when we need to define a new method

simone-silvestri · 2024-08-05T20:28:30Z

Masking an operation might be a bit tough because we do not know "a priori" what is the neutral value and most likely there are many ways to "mask" an operation by acting on the memory in the operands. How would this look like practically?

glwagner · 2024-08-05T20:32:41Z

There are a few ways it could be implemented. One simple way is

function mask_immersed_field(bop::BinaryOperation, value=zero(b.grid))
    mask_immersed_field(bop.a, value)
    mask_immersed_field(bop.b, value)
    return nothing
end

One could also be more specific, eg

const FieldBinaryOperation = BinaryOperation{<:Any, <:Any, <:Any, <:Any, <:Field, <:Field}

Then we are sure its a binary operation between two fields. And the operation can be restricted to addition as well.

Just some ideas...

jagoosw · 2024-08-05T20:57:58Z

Okay, that all makes sense.

For BinaryOperation I guess we might want to allow the an entry in the operation to be a number so we could define:

mask_immersed_field(::Number, value) = nothing

but otherwise, I guess would work.

Perhaps if we only do this for BinaryOperations we could do:

#fallback for `+` and `-`, I guess for `*` too since `a` could just be a number so if we only masked `a` it wouldn't be correct
function mask_immersed_field(bop::BinaryOperation, value=zero(b.grid))
    mask_immersed_field(bop.a, value)
    mask_immersed_field(bop.b, value)
    return nothing
end


function mask_immersed_field(bop::BinaryOperation{<:Any, <:Any, <:Any, typeof(/)}, value=zero(b.grid))
    mask_immersed_field(bop.a, value)
    return nothing
end

function mask_immersed_field(bop::BinaryOperation{<:Any, <:Any, <:Any, typeof(^)}, value=zero(b.grid))
    mask_immersed_field(bop.a, value)
    mask_immersed_field(bop.a, one(b.grid))
    return nothing
end

Which would make the values correct for all binary operations I think?

glwagner · 2024-08-05T21:04:11Z

I'm ok to try this out, hopefully we won't run into issues

simone-silvestri · 2024-08-05T21:14:39Z

For * you can just mask one of the so you save computation time (you can use the same function as for /)
Sorry forget this comment, I forgot about the numbers...

jagoosw added 4 commits August 2, 2024 11:36

add masking for sum of arrays

4ac2240

added the correct one now

84bf810

first step

221f5bc

bump patch number

eb586a6

jagoosw requested a review from glwagner August 2, 2024 15:49

glwagner reviewed Aug 2, 2024

View reviewed changes

src/ImmersedBoundaries/mask_immersed_field.jl Outdated Show resolved Hide resolved

rearranged

484281d

added SumOfFields

efcaa18

jagoosw changed the title ~~Adds method to mask_immersed_field! for SumOfArrays~~ Introduces SumOfFields Aug 5, 2024

glwagner reviewed Aug 5, 2024

View reviewed changes

src/Fields/sum_of_fields.jl Outdated Show resolved Hide resolved

glwagner reviewed Aug 5, 2024

View reviewed changes

src/Fields/sum_of_fields.jl Outdated Show resolved Hide resolved

glwagner reviewed Aug 5, 2024

View reviewed changes

src/Fields/sum_of_fields.jl

end

return nothing

end

Copy link

Member

glwagner Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All mask_immersed_field belong in the immersed boundaries module

Apply suggestions from code review

343561c

Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>

glwagner reviewed Aug 5, 2024

View reviewed changes

jagoosw closed this Aug 5, 2024

jagoosw mentioned this pull request Aug 6, 2024

Adds methods to mask_immersed_field! for BinaryOperations #3683

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduces `SumOfFields` #3676

Introduces `SumOfFields` #3676

jagoosw commented Aug 2, 2024 •

edited

Loading

glwagner commented Aug 2, 2024

glwagner commented Aug 2, 2024

jagoosw commented Aug 2, 2024 •

edited

Loading

glwagner commented Aug 2, 2024 •

edited

Loading

glwagner commented Aug 2, 2024

jagoosw commented Aug 5, 2024

jagoosw commented Aug 5, 2024

glwagner Aug 5, 2024

glwagner Aug 5, 2024

glwagner Aug 5, 2024

glwagner Aug 5, 2024

glwagner Aug 5, 2024

glwagner commented Aug 5, 2024

glwagner commented Aug 5, 2024

glwagner commented Aug 5, 2024

glwagner Aug 5, 2024

glwagner Aug 5, 2024

jagoosw commented Aug 5, 2024 •

edited

Loading

glwagner commented Aug 5, 2024 •

edited

Loading

jagoosw commented Aug 5, 2024 •

edited

Loading

glwagner commented Aug 5, 2024

glwagner commented Aug 5, 2024

simone-silvestri commented Aug 5, 2024

glwagner commented Aug 5, 2024

jagoosw commented Aug 5, 2024

glwagner commented Aug 5, 2024

simone-silvestri commented Aug 5, 2024 •

edited

Loading


		adapt_structure(to, sum::SumOfFields{N}) where N = SumOfFields{N}((adapt_structure(to, field) for field in sum.fields)...)

		summary(::SumOfFields{N, LX, LY, LZ}) where {N, LX, LY, LZ} = "Sum of $N fields at ($LX, $LY, $LZ)"

	all(f.grid == grid for f in fields) \|\|
	all(f.grid === grid for f in fields) \|\|

Introduces SumOfFields #3676

Introduces SumOfFields #3676

Conversation

jagoosw commented Aug 2, 2024 • edited Loading

glwagner commented Aug 2, 2024

glwagner commented Aug 2, 2024

jagoosw commented Aug 2, 2024 • edited Loading

glwagner commented Aug 2, 2024 • edited Loading

glwagner commented Aug 2, 2024

jagoosw commented Aug 5, 2024

jagoosw commented Aug 5, 2024

glwagner Aug 5, 2024

Choose a reason for hiding this comment

glwagner Aug 5, 2024

Choose a reason for hiding this comment

glwagner Aug 5, 2024

Choose a reason for hiding this comment

glwagner Aug 5, 2024

Choose a reason for hiding this comment

glwagner Aug 5, 2024

Choose a reason for hiding this comment

glwagner commented Aug 5, 2024

glwagner commented Aug 5, 2024

glwagner commented Aug 5, 2024

glwagner Aug 5, 2024

Choose a reason for hiding this comment

glwagner Aug 5, 2024

Choose a reason for hiding this comment

jagoosw commented Aug 5, 2024 • edited Loading

glwagner commented Aug 5, 2024 • edited Loading

jagoosw commented Aug 5, 2024 • edited Loading

glwagner commented Aug 5, 2024

glwagner commented Aug 5, 2024

simone-silvestri commented Aug 5, 2024

glwagner commented Aug 5, 2024

jagoosw commented Aug 5, 2024

glwagner commented Aug 5, 2024

simone-silvestri commented Aug 5, 2024 • edited Loading

Introduces `SumOfFields` #3676

Introduces `SumOfFields` #3676

jagoosw commented Aug 2, 2024 •

edited

Loading

jagoosw commented Aug 2, 2024 •

edited

Loading

glwagner commented Aug 2, 2024 •

edited

Loading

jagoosw commented Aug 5, 2024 •

edited

Loading

glwagner commented Aug 5, 2024 •

edited

Loading

jagoosw commented Aug 5, 2024 •

edited

Loading

simone-silvestri commented Aug 5, 2024 •

edited

Loading