raster: Read raster for mask from env variable #2392

Open
wants to merge 52 commits into
base: main
52 commits
7e5ba56
raster: Read raster for mask from env variable
wenzeslaus May 21, 2022
4f7b08a
Merge remote-tracking branch 'upstream/main' into mask-env-var
neteler Nov 7, 2023
1f9a63c
Merge remote-tracking branch 'upstream/main' into mask-env-var
wenzeslaus Sep 27, 2024
36f0e67
Use separate function for name of the mask
wenzeslaus Oct 11, 2024
a9fe1d4
Merge remote-tracking branch 'upstream/main' into mask-env-var
wenzeslaus Oct 11, 2024
2d987b9
Improve doc
wenzeslaus Oct 11, 2024
a8b23f6
Merge remote-tracking branch 'upstream/main' into mask-env-var
wenzeslaus Oct 11, 2024
7065da2
Use env var in r.mapcalc test to disable masking instead of managing …
wenzeslaus Oct 14, 2024
b14ce17
Add Python context manager for mask env variable and start tests
wenzeslaus Oct 15, 2024
39bf7e9
Merge with main
wenzeslaus Oct 15, 2024
586158b
Update auto_mask.c
echoix Oct 16, 2024
a33358d
Merge branch 'main' into mask-env-var
echoix Oct 16, 2024
f79c613
Merge remote-tracking branch 'upstream/main' into mask-env-var
wenzeslaus Oct 31, 2024
cfa78be
New mask handling for raster md5 sh test and sync the const test
wenzeslaus Oct 31, 2024
900b67d
Use unique names for test
wenzeslaus Oct 31, 2024
58d157a
Merge remote-tracking branch 'upstream/main' into mask-env-var
wenzeslaus Oct 31, 2024
0920569
Merge remote-tracking branch 'upstream/main' into mask-env-var
wenzeslaus Oct 31, 2024
d6a41d7
Add tests for different situations (implementation is still incomplete)
wenzeslaus Oct 31, 2024
4465c56
Support arbitrary mask name in the library (updates r.mask.status beh…
wenzeslaus Oct 31, 2024
967bd73
Merge remote merge
wenzeslaus Oct 31, 2024
d5ef245
Fix reclass retrival
wenzeslaus Nov 1, 2024
84f669b
Create a new internal function to separate testing of the presence an…
wenzeslaus Nov 1, 2024
a44a5f3
Support user-provided mask name in r.mask
wenzeslaus Nov 1, 2024
91da751
Use raster mask, not MASK in r.mask documentation in code
wenzeslaus Jan 27, 2025
3d3b117
Merge remote-tracking branch 'upstream/main' into mask-env-var
wenzeslaus Jan 27, 2025
40bdc2b
Add MaskManager to init
wenzeslaus Jan 27, 2025
3a3901e
Merge remote-tracking branch 'upstream/main' into mask-env-var
wenzeslaus Jan 29, 2025
7ae34f6
Use global variable instead of a hardcoded name just to make it more …
wenzeslaus Jan 29, 2025
84ce225
r.fillnulls: Use a custom mask (but don't set it) instead of moving a…
wenzeslaus Jan 29, 2025
7dfc51b
Merge remote-tracking branch 'upstream/main' into mask-env-var
wenzeslaus Jan 29, 2025
26ffe8a
r.in.wms: Refactor into a function and use with-statement
wenzeslaus Jan 29, 2025
c026601
MaskManager: No need to talk about removing a mask, that's obvious
wenzeslaus Jan 29, 2025
3bd24bb
Update best pratices for managing mask
wenzeslaus Jan 29, 2025
0a7bf59
Use a global variable for the default mask name in tests
wenzeslaus Jan 29, 2025
aef5f41
A simple change updating raster into to the new mask wording and flex…
wenzeslaus Jan 29, 2025
035b80f
r.mask: Document the new behavior in a basic way and center documenta…
wenzeslaus Jan 29, 2025
2ae76b7
Replace MASK by mask in source code comments
wenzeslaus Jan 29, 2025
449f44d
Basic update for Doxygen doc
wenzeslaus Jan 29, 2025
58686fb
Extent parallelization notebook
wenzeslaus Jan 30, 2025
37bd6d4
Add typing to the init function of MaskManager
wenzeslaus Jan 30, 2025
8903969
Add GRASS_MASK to env vars doc
wenzeslaus Jan 30, 2025
2706bec
Document best practices more specifically for Python.
wenzeslaus Jan 30, 2025
18d9e56
Add doc and examples to Python docstring for the MaskManager
wenzeslaus Jan 30, 2025
ca89681
Add more context to the notebook example
wenzeslaus Jan 30, 2025
43c3476
More complex expression, but creating strictly 0-or-1 mask.
wenzeslaus Jan 30, 2025
bbd6ae0
Unqualifying map name in Python needs a function
wenzeslaus Jan 30, 2025
e6440b6
Fix typo
wenzeslaus Jan 30, 2025
cd9ebee
Move general comments to general section, add examples, improve wording
wenzeslaus Jan 31, 2025
99b7edb
Merge remote-tracking branch 'upstream/main' into mask-env-var
wenzeslaus Jan 31, 2025
d62b1e5
Apply clear spelling/wording fixes
wenzeslaus Feb 3, 2025
6fc315d
Merge branch 'main' into mask-env-var
wenzeslaus Feb 3, 2025
5697586
Rewrite whole section about different masks
wenzeslaus Feb 6, 2025
88 changes: 82 additions & 6 deletions doc/development/style_guide.md
Expand Up @@ -463,15 +463,21 @@ The `--overwrite` flag can be globally enabled by setting the environment variab
#### Mask

GRASS GIS has a global mask managed by the _r.mask_ tool and represented by a
raster called MASK. Raster tools called as a subprocess will automatically
raster called MASK by default. Raster tools called as a subprocess will automatically
respect the globally set mask when reading the data. For outputs, respecting of
the mask is optional.

Tools **should not set or remove the global mask**. If the tool cannot avoid
setting the mask internally, it should check for presence of the mask and fail
if the mask is present. The tools should not remove and later restore the
original mask because that creates confusing behavior for interactive use and
breaks parallel processing.
Tools should generally respect the global mask set by a user. If a tool does not
respect the mask set by the user, the exact behavior should be described in the
documentation. On the other hand, ignoring the mask is usually the desired behavior
for import tools, which corresponds to the mask being applied only when reading
existing raster data in a project.

Tools **should not set or remove the global mask** to prevent unintended
behavior during interactive sessions and to maintain parallel processing
integrity. If a tool requires a mask for its operation, it should implement
a temporary mask using _MaskManager_ in Python or by setting the `GRASS_MASK`
environment variable.
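
The environment-variable route named in the paragraph above can be sketched without touching the global mask. This is a minimal sketch, not code from this change: the helper name `env_with_mask` and the raster name `tool_mask` are hypothetical, and only the `GRASS_MASK` variable name comes from the change itself.

```python
import os


def env_with_mask(mask_name, base_env=None):
    """Return a copy of the environment with GRASS_MASK set.

    Hypothetical helper; only the variable name GRASS_MASK is taken
    from the change described here.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GRASS_MASK"] = mask_name
    return env


# The copy can be passed as env= to subprocess calls. The caller's
# environment, and therefore the user's global mask, stays untouched.
base = {"PATH": "/usr/bin"}
child_env = env_with_mask("tool_mask", base_env=base)
```

Working on a copy rather than on `os.environ` is what keeps the setting local to one subprocess, which matters when several subprocesses run in parallel.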

Generally, any mask behavior should be documented unless it is the standard case
where masked cells do not participate in the computation and are represented as
Expand Down Expand Up @@ -577,6 +583,76 @@ gs.run_command("r.slope.aspect", elevation=input_raster, slope=slope, env=env)
This approach makes the computational region completely safe for parallel
processes as no region-related files are modified.

#### Changing raster mask

The _MaskManager_ in the Python API provides a way for tools to change, or possibly
to ignore, a raster mask for part of the computation.

In the following example, _MaskManager_ modifies the global system environment
for the tool (aka _os.environ_) so that a custom mask can be applied:

```python
# Previously user-set mask applies here (if any).
gs.run_command("r.slope.aspect", elevation=input_raster, aspect=aspect)

with gs.MaskManager():
    # Only the mask we set here will apply.
    gs.run_command("r.mask", raster=mask_raster)
    gs.run_command("r.slope.aspect", elevation=input_raster, slope=slope)
# Mask is disabled and the mask raster is removed at the end of the with block.

# Previously user-set mask applies here again.
```
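
The general pattern of changing an environment variable for the duration of a with block and restoring it afterwards can be sketched in plain Python. This is an illustration of the pattern only and is not the _MaskManager_ implementation: the name `temporary_env_var` is hypothetical, and the real manager also handles the mask raster itself.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env_var(name, value, env=None):
    """Set env[name] to value inside the block and restore it afterwards.

    Illustrative only; this is not how MaskManager is implemented.
    """
    env = os.environ if env is None else env
    missing = object()  # Sentinel to tell "unset" apart from any real value.
    previous = env.get(name, missing)
    env[name] = value
    try:
        yield env
    finally:
        if previous is missing:
            env.pop(name, None)
        else:
            env[name] = previous


# Demonstration on a plain dict instead of the real process environment.
demo_env = {"GRASS_MASK": "user_mask"}
with temporary_env_var("GRASS_MASK", "tool_mask", env=demo_env):
    inside = demo_env["GRASS_MASK"]
after = demo_env["GRASS_MASK"]
```

Restoring in a `finally` clause means the previous value comes back even when a tool call inside the block raises an exception.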

Because tools should generally respect the provided mask, the mask in a tool
should act as an additional mask. This can be achieved when preparing the new
mask raster using a tool which reads an existing raster:

```python
# Here we create an initial mask by creating a raster from vector,
# but that does not use mask.
gs.run_command(
    "v.to.rast", input=input_vector, where="name == 'Town'", output=town_boundary
)
# So, we use a raster algebra expression. Mask will be applied if set
# because in the expression, we are reading an existing raster.
gs.mapcalc(f"{mask_raster} = {town_boundary}")

with gs.MaskManager():
    gs.run_command("r.mask", raster=mask_raster)
    # Both user mask and the town_boundary are used here.
    gs.run_command("r.slope.aspect", elevation=input_raster, slope=slope)
```
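
Why reading the raster through the active user mask yields an additional mask can be shown with a toy model. This sketch uses plain lists instead of rasters and assumes the usual convention that NULL (here `None`) and zero cells in a mask are masked out; the function names are hypothetical.

```python
def is_active(cell):
    """A mask cell lets data through when it is neither NULL nor zero."""
    return cell is not None and cell != 0


def combine_masks(user_mask, tool_mask):
    """Keep a cell only where both masks let data through."""
    return [
        1 if is_active(u) and is_active(t) else None
        for u, t in zip(user_mask, tool_mask)
    ]


user = [1, 1, None, 0]
tool = [1, None, 1, 1]
combined = combine_masks(user, tool)
```

Only the first cell survives in `combined`, because it is the only one allowed by both the user mask and the tool mask.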

To disable the mask, which may be needed in processing steps of import tool,
Contributor:

What do you mean here by "import tool"? All the examples are with r.slope.aspect.

Member Author:

Maybe "an import tool". I could refer to r.in.wms as an example, but that relies on that code not changing. Making up an example seems like a lot of theoretical code, but maybe the r.in.wms alpha is a good example. Applying a cloud mask would be another example.

Contributor:

Yes, the "an" sounds good. Something like: "... in the processing steps of, e.g., an import tool, ..." to make it clear that it's just an example.

we can do:

```python
# Mask applies here if set.
gs.run_command("r.slope.aspect", elevation=input_raster, aspect=aspect)

with gs.MaskManager():
    # No mask was set in this context, so the tool runs without a mask.
    gs.run_command("r.slope.aspect", elevation=input_raster, slope=slope)

# Mask applies again.
```

If needed, tools can implement optional support for a user-set raster mask by
passing or not passing the current name of a mask obtained from _r.mask.status_
and by preparing the internal mask raster beforehand with the user mask active.
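
The "passing or not passing" decision can be sketched as a small function. This is a hedged sketch: the dictionary keys `present` and `name` are assumptions about what parsed _r.mask.status_ output looks like, and `mask_for_subprocess` is a hypothetical name.

```python
def mask_for_subprocess(status, respect_user_mask=True):
    """Return the mask name to pass on, or None to skip the user mask.

    `status` is assumed to be a dict with "present" and "name" keys,
    as might be parsed from r.mask.status output (an assumption).
    """
    if respect_user_mask and status.get("present"):
        return status.get("name")
    return None


# Hand-written stand-in for parsed tool output.
status = {"present": True, "name": "MASK@user1"}
chosen = mask_for_subprocess(status)
ignored = mask_for_subprocess(status, respect_user_mask=False)
```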

If different subprocesses, running in parallel, use different masks,
it is best to create mask rasters beforehand (to avoid limitations of _r.mask_ and
the underlying _r.reclass_ tool). The name of the mask raster can then be passed to
the manager:

```python
env = os.environ.copy()
with gs.MaskManager(mask_name=mask_raster, env=env):
    gs.run_command("r.slope.aspect", elevation=input_raster, slope=slope, env=env)
```

#### Temporary Maps

Using temporary maps is preferred over using temporary mapsets. This follows the
Expand Down
67 changes: 67 additions & 0 deletions doc/examples/notebooks/parallelization_tutorial.ipynb
Expand Up @@ -459,6 +459,73 @@
"\n",
"Alternatively, a PostgreSQL or another database backend can be used for attributes to offload the parallel writing to the database system."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "df444278",
"metadata": {},
"source": [
"#### Safely modifying raster mask in a single mapset"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a8d52c4c",
"metadata": {},
"source": [
"Mask is by default specified per-mapset and shared by all the processes. Additionally, *r.mask* is using *r.reclass* in the background which may cause issues if the mask is derived from the same base map in parallel. The use of *MaskManager* in the following example allows each process to use a different raster. The raster is used directly as a mask to avoid the need to use *r.mask*.\n",
"\n",
"The following code derives basins (watersheds) based on a threshold from the digital elevation model. Then, for each basin, it computes the topographic index with *r.topidx*."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f72dcca",
"metadata": {},
"outputs": [],
"source": [
"%%writefile example.py\n",
"import os\n",
"from multiprocessing import Pool\n",
"\n",
"import grass.script as gs\n",
"\n",
"elevation = \"elev_state_500m\"\n",
"gs.run_command(\"g.region\", raster=elevation)\n",
"gs.run_command(\"r.watershed\", elevation=elevation, basin=\"basins\", threshold=10000)\n",
"\n",
"cats = gs.parse_command(\"r.describe\", map=\"basins\", flags=\"1n\", format=\"json\")[\"values\"]\n",
"\n",
"def topidx(cat):\n",
" # Define output name and mask name.\n",
" output = f\"topidx_{cat}\"\n",
" basin = f\"basin_{cat}\"\n",
" # Extract subwatershed by category into separate raster, creating\n",
" # a 0-or-1 mask which has no NULLs (although NULLs in mask are allowed).\n",
" gs.mapcalc(f\"{basin} = if(isnull(basins), 0, if(basins == {cat}, 1, 0))\")\n",
" # Create a copy of the environment for this process before modifying it.\n",
" env = os.environ.copy()\n",
" # Set the computational region to match non-null area in the new raster.\n",
" env[\"GRASS_REGION\"] = gs.region_env(raster=\"basins\", zoom=basin, env=env)\n",
" # Use mask context manager to specify which raster to use as a mask\n",
" # and pass the environment we are using.\n",
" with gs.MaskManager(mask_name=basin, env=env):\n",
" # Run actual computation with active mask.\n",
" gs.run_command(\"r.topidx\", input=elevation, output=output, env=env)\n",
" return output\n",
"\n",
"with Pool(processes=4) as pool:\n",
" outputs = pool.map(topidx, cats)\n",
" print(outputs)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8c6cfb3b",
"metadata": {},
"outputs": [],
"source": [
"%run example.py"
]
}
],
"metadata": {
Expand Down
5 changes: 5 additions & 0 deletions lib/init/variables.html
Expand Up @@ -512,6 +512,11 @@ <h3>List of selected internal GRASS environment variables</h3>
This allows programs such as the GUI to run external commands on an
alternate region without having to modify the WIND file then change it
back afterwards.</dd>

<dt>GRASS_MASK</dt>
<dd>[libgis]<br>
use the raster map specified by name as the mask, instead of a raster called
MASK in the current mapset.</dd>
</dl>

<h2>List of selected GRASS gisenv variables</h2>
Expand Down
26 changes: 15 additions & 11 deletions lib/raster/auto_mask.c
Expand Up @@ -3,14 +3,13 @@
*
* \brief Raster Library - Auto masking routines.
*
* (C) 2001-2008 by the GRASS Development Team
* (C) 2001-2024 by Vaclav Petras and the GRASS Development Team
*
* This program is free software under the GNU General Public License
* (>=v2). Read the file COPYING that comes with GRASS for details.
*
* \author GRASS GIS Development Team
*
* \date 1999-2008
* \author Vaclav Petras (environmental variable and refactoring)
*/

#include <stdlib.h>
Expand All @@ -31,46 +30,51 @@
* \return 0 if mask unset or unavailable
* \return 1 if mask set and available and ready to use
*/

int Rast__check_for_auto_masking(void)
{
    struct Cell_head cellhd;

    Rast__init();

    /* if mask is switched off (-2) return -2
       if R__.auto_mask is not set (-1) or set (>=0) recheck the MASK */
       if R__.auto_mask is not set (-1) or set (>=0) recheck the mask */

    // TODO: This needs to be documented or modified accordingly.
    if (R__.auto_mask < -1)
        return R__.auto_mask;

    /* if(R__.mask_fd > 0) G_free (R__.mask_buf); */

    /* look for the existence of the MASK file */
    R__.auto_mask = (G_find_raster("MASK", G_mapset()) != 0);
    /* Decide between default mask name and env var specified one. */
    char *mask_name = Rast_mask_name();
    char *mask_mapset = "";

    /* Check for the existence of the mask raster. */
    R__.auto_mask = (G_find_raster2(mask_name, mask_mapset) != 0);

    if (R__.auto_mask <= 0)
        return 0;

    /* check MASK projection/zone against current region */
    Rast_get_cellhd("MASK", G_mapset(), &cellhd);
    /* Check mask raster projection/zone against current region */
    Rast_get_cellhd(mask_name, mask_mapset, &cellhd);
    if (cellhd.zone != G_zone() || cellhd.proj != G_projection()) {
        R__.auto_mask = 0;
        return 0;
    }

    if (R__.mask_fd >= 0)
        Rast_unopen(R__.mask_fd);
    R__.mask_fd = Rast__open_old("MASK", G_mapset());
    R__.mask_fd = Rast__open_old(mask_name, mask_mapset);
    if (R__.mask_fd < 0) {
        R__.auto_mask = 0;
        G_warning(_("Unable to open automatic MASK file"));
        G_warning(_("Unable to open automatic mask <%s>"), mask_name);
        return 0;
    }

    /* R__.mask_buf = Rast_allocate_c_buf(); */

    R__.auto_mask = 1;
    G_free(mask_name);

    return 1;
}
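
The choice between the default mask name and the one given by the environment variable, as described in the code comment above, can be sketched in Python. This is a minimal sketch under stated assumptions: it does not reproduce `Rast_mask_name()`, which is not shown in this diff, and treating an empty value as unset is an assumption. The names `DEFAULT_MASK_NAME` and `resolve_mask_name` are hypothetical.

```python
DEFAULT_MASK_NAME = "MASK"


def resolve_mask_name(env):
    """Pick the mask raster name from GRASS_MASK, else use the default.

    Assumption: an empty GRASS_MASK value counts as not set.
    """
    name = env.get("GRASS_MASK")
    return name if name else DEFAULT_MASK_NAME


from_default = resolve_mask_name({})
from_env = resolve_mask_name({"GRASS_MASK": "basin_12"})
```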
Expand Down
4 changes: 2 additions & 2 deletions lib/raster/get_row.c
Expand Up @@ -764,7 +764,7 @@ void Rast_get_d_row_nomask(int fd, DCELL *buf, int row)
* two particular types check the functions).
* - Step 4: read or simmulate null value row and zero out cells
* corresponding to null value cells. The masked out cells are set to null when
* the mask exists. (the MASK is taken care of by null values (if the null file
* the mask exists. (the mask is taken care of by null values (if the null file
* doesn't exist for this map, then the null row is simulated by assuming that
* all zero are nulls *** in case of Rast_get_row() and assuming that all data
* is valid in case of G_get_f/d_raster_row(). In case of deprecated function
Expand Down Expand Up @@ -1089,7 +1089,7 @@ static void embed_nulls(int fd, void *buf, int row, RASTER_MAP_TYPE map_type,

Read or simulate null value row and set the cells corresponding
to null value to 1. The masked out cells are set to null when the
mask exists. (the MASK is taken care of by null values
mask exists. (the mask is taken care of by null values
(if the null file doesn't exist for this map, then the null row
is simulated by assuming that all zeros in raster map are nulls.
Also all masked out cells become nulls.
Expand Down