Merge branch 'master' into contribution_overhaul
* master:
  Rework whatsnew into new scheme. (SciTools#3834)
  Lazy regridding with Linear, Nearest, and AreaWeighted (SciTools#3701)
  Iris readme minimal (SciTools#3833)
tkknight committed Sep 11, 2020
2 parents a606c6a + b04a3ca commit e959fdf
Showing 15 changed files with 318 additions and 111 deletions.
90 changes: 5 additions & 85 deletions README.md
@@ -5,7 +5,7 @@


<h4 align="center">
Iris is a powerful, format-agnostic, community-driven Python library for
Iris is a powerful, format-agnostic, community-driven Python package for
analysing and visualising Earth science data
</h4>

@@ -15,7 +15,7 @@
<img src="https://api.travis-ci.org/repositories/SciTools/iris.svg?branch=master"
alt="Travis-CI" /></a>
<a href='https://scitools-iris.readthedocs.io/en/latest/?badge=latest'>
<img src='https://readthedocs.org/projects/scitools-iris/badge/?version=latest'
alt='Documentation Status' /></a>
<a href="https://anaconda.org/conda-forge/iris">
<img src="https://img.shields.io/conda/dn/conda-forge/iris.svg"
@@ -39,87 +39,7 @@
<img src="https://img.shields.io/badge/code%20style-black-000000.svg"
alt="black" /></a>
</p>
<br>

<!-- NOTE: toc auto-generated with https://github.com/frnmst/md-toc:
$ md_toc github README.md -i
-->

<h1>Table of contents</h1>

[](TOC)

+ [Overview](#overview)
+ [Documentation](#documentation)
+ [Installation](#installation)
+ [Copyright and licence](#copyright-and-licence)
+ [Get in touch](#get-in-touch)
+ [Contributing](#contributing)

[](TOC)

# Overview

Iris implements a data model based on the [CF conventions](http://cfconventions.org/)
giving you a powerful, format-agnostic interface for working with your data.
It excels when working with multi-dimensional Earth Science data, where tabular
representations become unwieldy and inefficient.

[CF Standard names](http://cfconventions.org/standard-names.html),
[units](https://github.com/SciTools/cf_units), and coordinate metadata
are built into Iris, giving you a rich and expressive interface for maintaining
an accurate representation of your data. Its treatment of data and
associated metadata as first-class objects includes:

* a visualisation interface based on [matplotlib](https://matplotlib.org/) and
[cartopy](https://scitools.org.uk/cartopy/docs/latest/),
* unit conversion,
* subsetting and extraction,
* merge and concatenate,
* aggregations and reductions (including min, max, mean and weighted averages),
* interpolation and regridding (including nearest-neighbour, linear and area-weighted), and
* operator overloads (``+``, ``-``, ``*``, ``/``, etc.)

A number of file formats are recognised by Iris, including CF-compliant NetCDF, GRIB,
and PP, and it has a plugin architecture to allow other formats to be added seamlessly.

Building upon [NumPy](http://www.numpy.org/) and [dask](https://dask.pydata.org/en/latest/),
Iris scales from efficient single-machine workflows right through to multi-core clusters and HPC.
Interoperability with packages from the wider scientific Python ecosystem comes from Iris'
use of standard NumPy/dask arrays as its underlying data storage.
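
As a quick, illustrative sketch of such a workflow (not taken from the Iris
docs; it assumes the optional `iris-sample-data` package is installed so that
`iris.sample_data_path` can locate the example file):

    import iris
    import iris.analysis

    # Load a single cube, complete with CF standard name, units and coordinates.
    cube = iris.load_cube(iris.sample_data_path('A1B_north_america.nc'))

    # Collapse the time dimension with an unweighted mean, one of the
    # aggregations listed above.
    mean_cube = cube.collapsed('time', iris.analysis.MEAN)
    print(mean_cube)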


# Documentation

<a href="https://scitools.org.uk/iris/docs/latest/index.html"> <img src="https://img.shields.io/badge/docs-stable-green.svg" alt="Stable docs" /></a> The documentation for *stable released versions* of Iris, including a user guide, example code, and gallery.

<a href="https://scitools-docs.github.io/iris/master/index.html"> <img src="https://img.shields.io/badge/docs-latest-blue.svg" alt="Latest docs" /></a> The documentation for the *latest development version* of Iris.


# Installation

The easiest way to install Iris is with [conda](https://conda.io/miniconda.html):

conda install -c conda-forge iris

Detailed instructions, including information on installing from source,
are available in the
[documentation](https://scitools-iris.readthedocs.io/en/latest/installing.html).

# Get in touch

* Report bugs or suggest new features using an Issue or Pull Request on [GitHub](https://github.com/SciTools/iris). You can also comment on existing Issues and Pull Requests.
* For discussions from a user perspective you could join our [SciTools Users Google Group](https://groups.google.com/forum/#!forum/scitools-iris).
* For those involved in developing Iris we also have an [Iris Developers Google Group](https://groups.google.com/forum/#!forum/scitools-iris-dev).
* For "How do I?" questions, try [StackOverflow](https://stackoverflow.com/questions/tagged/python-iris).

# Copyright and licence

Iris may be freely distributed, modified and used commercially under the terms
of its [GNU LGPLv3 license](COPYING.LESSER).

# Contributing
Information on how to contribute can be found in the
[Iris developer guide](https://scitools.org.uk/iris/docs/latest/index.html#development-index).

(C) British Crown Copyright 2010 - 2020, Met Office
<p align="center">
See the <a href="https://scitools-iris.readthedocs.io/en/latest/">documentation</a> for the <b>latest development version</b> of Iris.
</p>
2 changes: 1 addition & 1 deletion docs/iris/src/index.rst
@@ -9,7 +9,7 @@ Iris Documentation

.. todolist::

**A powerful, format-agnostic, community-driven Python library for analysing
**A powerful, format-agnostic, community-driven Python package for analysing
and visualising Earth science data.**

Iris implements a data model based on the `CF conventions <http://cfconventions.org>`_
11 changes: 6 additions & 5 deletions docs/iris/src/userguide/citation.rst
@@ -4,22 +4,23 @@
Citing Iris
===========

If Iris played an important part in your research then please add us to your
reference list by using one of the recommendations below.

************
BibTeX entry
************

For example::

@manual{Iris,
author = {{Met Office}},
title = {Iris: A Python library for analysing and visualising meteorological and oceanographic data sets},
title = {Iris: A Python package for analysing and visualising meteorological and oceanographic data sets},
edition = {v1.2},
year = {2010 - 2013},
address = {Exeter, Devon},
url = {http://scitools.org.uk/}
}


*******************
@@ -45,7 +46,7 @@ Suggested format::

For example::

Iris. Met Office. git@github.com:SciTools/iris.git 06-03-2013

.. _How to cite and describe software: http://software.ac.uk/so-exactly-what-software-did-you-use

47 changes: 47 additions & 0 deletions docs/iris/src/userguide/interpolation_and_regridding.rst
@@ -28,6 +28,11 @@ The following are the regridding schemes that are currently available in Iris:
* nearest-neighbour regridding (:class:`iris.analysis.Nearest`), and
* area-weighted regridding (:class:`iris.analysis.AreaWeighted`, first-order conservative).

The linear, nearest-neighbour, and area-weighted regridding schemes support
lazy regridding, i.e. if the source cube has lazy data, the resulting cube
will also have lazy data.
See :doc:`real_and_lazy_data` for an introduction to lazy data.
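
For instance, as an illustrative sketch (here ``target_grid_cube`` stands for
any cube defining the grid you want to regrid onto; it is not part of the
sample data):

>>> air_temp = iris.load_cube(iris.sample_data_path('A1B_north_america.nc'))
>>> air_temp.has_lazy_data()
True
>>> result = air_temp.regrid(target_grid_cube, iris.analysis.Linear())
>>> result.has_lazy_data()
True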


.. _interpolation:

@@ -409,3 +414,45 @@ regridded to the target grid. For example::
In each case ``result`` will be the input cube regridded to the grid defined by
the target grid cube (in this case ``rotated_psl``) that we used to define the
cached regridder.
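
A condensed, hypothetical sketch of that pattern (``global_cube`` and
``other_cube`` are stand-in names for cubes defined on the source grid):

>>> regridder = iris.analysis.Nearest().regridder(global_cube, rotated_psl)
>>> result = regridder(other_cube)  # other_cube must lie on the same source grid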

Regridding lazy data
^^^^^^^^^^^^^^^^^^^^

If you are working with large cubes, especially when you are regridding to a
high resolution target grid, you may run out of memory when trying to
regrid a cube. When this happens, make sure the input cube has lazy data

>>> air_temp = iris.load_cube(iris.sample_data_path('A1B_north_america.nc'))
>>> air_temp
<iris 'Cube' of air_temperature / (K) (time: 240; latitude: 37; longitude: 49)>
>>> air_temp.has_lazy_data()
True

and the regridding scheme supports lazy data. All regridding schemes described
here support lazy data. If you still run out of memory even while using lazy
data, inspect the
`chunks <https://docs.dask.org/en/latest/array-chunks.html>`__:

>>> air_temp.lazy_data().chunks
((240,), (37,), (49,))

The cube above consists of a single chunk, because it is fairly small. For
larger cubes, Iris will automatically create chunks of an optimal size when
loading the data. However, because regridding to a high resolution grid
may dramatically increase the size of the data, the automatically chosen
chunks might be too large.

As an example of how to solve this, we could manually re-chunk the time
dimension so that the cube is regridded in 8 chunks of 30 timesteps each:

>>> air_temp.data = air_temp.lazy_data().rechunk([30, None, None])
>>> air_temp.lazy_data().chunks
((30, 30, 30, 30, 30, 30, 30, 30), (37,), (49,))

Assuming that Dask is configured such that it processes only a few chunks of
the data array at a time, this will further reduce memory use.
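
With suitable chunks in place, the regridding itself can then be performed
lazily; for example, with a hypothetical ``target_grid_cube`` defining the
high resolution target grid, the result is only computed chunk by chunk when
it is realised or saved:

>>> result = air_temp.regrid(target_grid_cube, iris.analysis.AreaWeighted())
>>> result.has_lazy_data()
True
>>> iris.save(result, 'air_temp_regridded.nc')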

Note that chunking in the horizontal dimensions is not supported by the
regridding schemes. Chunks in these dimensions will automatically be combined
before regridding.
4 changes: 3 additions & 1 deletion docs/iris/src/userguide/real_and_lazy_data.rst
@@ -37,7 +37,9 @@ In Iris, lazy data is provided as a
`dask array <https://docs.dask.org/en/latest/array.html>`_.
A dask array also has a shape and data type
but the dask array's data points remain on disk and are only loaded into memory in
small
`chunks <https://docs.dask.org/en/latest/array-chunks.html>`__
when absolutely necessary. This has key performance benefits for
handling large amounts of data, where both calculation time and storage
requirements can be significantly reduced.
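
For example, a freshly loaded cube typically has lazy data, and its dask
chunks can be inspected directly (a short sketch reusing the sample file from
the regridding user guide):

>>> cube = iris.load_cube(iris.sample_data_path('A1B_north_america.nc'))
>>> cube.has_lazy_data()
True
>>> cube.lazy_data().chunks
((240,), (37,), (49,))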

5 changes: 5 additions & 0 deletions docs/iris/src/whatsnew/latest.rst
@@ -31,6 +31,11 @@ Features
saved to NetCDF-CF files. Support for `Quality Flags`_ is also provided to
ensure they load and save with appropriate units. See :pull:`3800`.

* Lazy regridding with the :class:`~iris.analysis.Linear`,
:class:`~iris.analysis.Nearest`, and
:class:`~iris.analysis.AreaWeighted` regridding schemes.
See :pull:`3701`.


Dependency Updates
==================
36 changes: 36 additions & 0 deletions lib/iris/_lazy_data.py
Expand Up @@ -349,3 +349,39 @@ def lazy_elementwise(lazy_array, elementwise_op):
    dtype = elementwise_op(np.zeros(1, lazy_array.dtype)).dtype

    return da.map_blocks(elementwise_op, lazy_array, dtype=dtype)


def map_complete_blocks(src, func, dims, out_sizes):
    """Apply a function to complete blocks.

    Complete means that the data is not chunked along the chosen dimensions.

    Args:

    * src (:class:`~iris.cube.Cube`):
        Source cube that function is applied to.
    * func:
        Function to apply.
    * dims (tuple of int):
        Dimensions that cannot be chunked.
    * out_sizes (tuple of int):
        Output size of dimensions that cannot be chunked.

    """
    if not src.has_lazy_data():
        return func(src.data)

    data = src.lazy_data()

    # Ensure dims are not chunked
    in_chunks = list(data.chunks)
    for dim in dims:
        in_chunks[dim] = src.shape[dim]
    data = data.rechunk(in_chunks)

    # Determine output chunks
    out_chunks = list(data.chunks)
    for dim, size in zip(dims, out_sizes):
        out_chunks[dim] = size

    return data.map_blocks(func, chunks=out_chunks, dtype=src.dtype)
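
A small usage sketch of this (internal) helper, under the simplifying
assumption that the per-block function just doubles the data; in real use
``func`` would be something like a per-block regrid that changes the
horizontal sizes:

import iris
from iris._lazy_data import map_complete_blocks

# Load a lazy cube of shape (time: 240, latitude: 37, longitude: 49).
cube = iris.load_cube(iris.sample_data_path('A1B_north_america.nc'))

def double(block):
    # Placeholder per-block operation; it receives and returns NumPy arrays.
    return block * 2.0

# Dimensions 1 and 2 (latitude, longitude) must stay unchunked; since the
# doubled blocks keep their shape, out_sizes matches the input sizes.
result = map_complete_blocks(cube, double, dims=(1, 2), out_sizes=(37, 49))
print(result)  # a lazy dask array when the cube's data is lazy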
16 changes: 16 additions & 0 deletions lib/iris/analysis/__init__.py
@@ -2440,6 +2440,10 @@ def regridder(self, src_grid, target_grid):
        constructing your own regridder is preferable. These are detailed in
        the :ref:`user guide <caching_a_regridder>`.

        Supports lazy regridding. Any
        `chunks <https://docs.dask.org/en/latest/array-chunks.html>`__
        in horizontal dimensions will be combined before regridding.

        Args:

        * src_grid:
@@ -2514,6 +2518,10 @@ def regridder(self, src_grid_cube, target_grid_cube):
        constructing your own regridder is preferable. These are detailed in
        the :ref:`user guide <caching_a_regridder>`.

        Supports lazy regridding. Any
        `chunks <https://docs.dask.org/en/latest/array-chunks.html>`__
        in horizontal dimensions will be combined before regridding.

        Args:

        * src_grid_cube:
@@ -2630,6 +2638,10 @@ def regridder(self, src_grid, target_grid):
        constructing your own regridder is preferable. These are detailed in
        the :ref:`user guide <caching_a_regridder>`.

        Supports lazy regridding. Any
        `chunks <https://docs.dask.org/en/latest/array-chunks.html>`__
        in horizontal dimensions will be combined before regridding.

        Args:

        * src_grid:
@@ -2716,6 +2728,8 @@ def regridder(self, src_cube, target_grid):
        constructing your own regridder is preferable. These are detailed in
        the :ref:`user guide <caching_a_regridder>`.

        Does not support lazy regridding.

        Args:

        * src_cube:
@@ -2791,6 +2805,8 @@ def regridder(self, src_grid, target_grid):
        constructing your own regridder is preferable. These are detailed in
        the :ref:`user guide <caching_a_regridder>`.

        Does not support lazy regridding.

        Args:

        * src_grid:
9 changes: 9 additions & 0 deletions lib/iris/analysis/_area_weighted.py
@@ -78,6 +78,9 @@ def __call__(self, cube):
        The given cube must be defined with the same grid as the source
        grid used to create this :class:`AreaWeightedRegridder`.

        If the source cube has lazy data, the returned cube will also
        have lazy data.

        Args:

        * cube:
@@ -89,6 +92,12 @@ this cube will be converted to values on the new grid using
            this cube will be converted to values on the new grid using
            area-weighted regridding.

        .. note::

            If the source cube has lazy data,
            `chunks <https://docs.dask.org/en/latest/array-chunks.html>`__
            in the horizontal dimensions will be combined before regridding.

        """
        src_x, src_y = get_xy_dim_coords(cube)
        if (src_x, src_y) != self._src_grid:
