[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
pre-commit-ci[bot] authored and dcherian committed Mar 31, 2022
1 parent 96c6b49 commit d56a22f
Showing 1 changed file with 32 additions and 21 deletions.
53 changes: 32 additions & 21 deletions docs/source/user-stories/custom-aggregations.ipynb
@@ -7,13 +7,16 @@
"source": [
"# Custom Aggregations\n",
"\n",
"This notebook is motivated by a [post](https://discourse.pangeo.io/t/using-xhistogram-to-bin-measurements-at-particular-stations/2365/4) on the Pangeo discourse forum.\n",
"This notebook is motivated by a\n",
"[post](https://discourse.pangeo.io/t/using-xhistogram-to-bin-measurements-at-particular-stations/2365/4)\n",
"on the Pangeo discourse forum.\n",
"\n",
"> Even better would be a command that lets me simply do the following.\n",
">\n",
"> A = da.groupby(['lon_bins', 'lat_bins']).mode()\n",
"\n",
"This notebook will describe how to accomplish this using a custom `Aggregation` since `mode` and `median` aren't supported by flox yet."
"This notebook will describe how to accomplish this using a custom `Aggregation`\n",
"since `mode` and `median` aren't supported by flox yet.\n"
]
},
{
@@ -439,7 +442,7 @@
"source": [
"## A built-in reduction\n",
"\n",
"First a simple example of lat-lon binning using a built-in reduction: mean"
"First a simple example of lat-lon binning using a built-in reduction: mean\n"
]
},
{
@@ -494,9 +497,12 @@
"source": [
"## Aggregations\n",
"\n",
"flox knows how to interpret `func=\"mean\"` because it's been implemented in `aggregations.py` as an [Aggregation](https://flox.readthedocs.io/en/latest/generated/flox.aggregations.Aggregation.html)\n",
"flox knows how to interpret `func=\"mean\"` because it's been implemented in\n",
"`aggregations.py` as an\n",
"[Aggregation](https://flox.readthedocs.io/en/latest/generated/flox.aggregations.Aggregation.html)\n",
"\n",
"An `Aggregation` is a blueprint for computing an aggregation, with both numpy and dask data."
"An `Aggregation` is a blueprint for computing an aggregation, with both numpy\n",
"and dask data.\n"
]
},
{
@@ -545,19 +551,19 @@
"```python\n",
"mean = Aggregation(\n",
" name=\"mean\",\n",
" \n",
" # strings in the following are built-in grouped reductions \n",
"\n",
" # strings in the following are built-in grouped reductions\n",
" # implemented by the underlying \"engine\": flox or numpy_groupies or numbagg\n",
" \n",
"\n",
" # for pure numpy inputs\n",
" numpy=\"mean\", \n",
" \n",
" numpy=\"mean\",\n",
"\n",
" # The next are for dask inputs and describe how to reduce\n",
" # the data in parallel\n",
" chunk=(\"sum\", \"nanlen\"), # first compute these blockwise : (grouped_sum, grouped_count)\n",
" combine=(\"sum\", \"sum\"), # reduce intermediate results (sum the sums, sum the counts)\n",
" finalize=lambda sum_, count: sum_ / count, # final mean value (divide sum by count)\n",
" \n",
"\n",
" fill_value=(0, 0), # fill value for intermediate sums and counts when groups have no members\n",
" dtypes=(None, np.intp), # optional dtypes for intermediates\n",
" final_dtype=np.floating, # final dtype for output\n",
@@ -572,18 +578,21 @@
"source": [
"## Defining a custom aggregation\n",
"\n",
"First we'll need a function that executes the grouped reduction given numpy inputs. \n",
"First we'll need a function that executes the grouped reduction given numpy\n",
"inputs.\n",
"\n",
"Custom functions are required to have this signature (copied from\n",
"numpy_groupies):\n",
"\n",
"Custom functions are required to have this signature (copied from numpy_groupies):\n",
"``` python\n",
"```python\n",
"\n",
"def custom_grouped_reduction(\n",
" group_idx, array, *, axis=-1, size=None, fill_value=None, dtype=None\n",
"):\n",
" \"\"\"\n",
" Parameters\n",
" ----------\n",
" \n",
"\n",
" group_idx : np.ndarray, 1D\n",
" integer codes for group labels (1D)\n",
" array : np.ndarray, nD\n",
@@ -596,17 +605,19 @@
" fill value used when the number of groups in group_idx is less than size\n",
" dtype : optional\n",
" dtype of output\n",
" \n",
"\n",
" Returns\n",
" -------\n",
" \n",
"\n",
" np.ndarray with array.shape[-1] == size, containing a single value per group\n",
" \"\"\"\n",
" pass\n",
"```\n",
"\n",
"\n",
"Since numpy_groupies does not implement a median, we'll do it ourselves by passing `np.median` to `numpy_groupies.aggregate_numpy.aggregate`. This will loop over all groups, and then execute `np.median` on the group members in serial. It is not fast, but quite convenient.\n"
"Since numpy_groupies does not implement a median, we'll do it ourselves by\n",
"passing `np.median` to `numpy_groupies.aggregate_numpy.aggregate`. This will\n",
"loop over all groups, and then execute `np.median` on the group members in\n",
"serial. It is not fast, but quite convenient.\n"
]
},
{
@@ -639,7 +650,7 @@
"id": "b356f4f2-ae22-4f56-89ec-50646136e2eb",
"metadata": {},
"source": [
"Now we create the `Aggregation`"
"Now we create the `Aggregation`\n"
]
},
{
@@ -682,7 +693,7 @@
"id": "899ece52-ebd4-47b4-8090-cbbb63f504a4",
"metadata": {},
"source": [
"And apply it!"
"And apply it!\n"
]
},
{

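Note on the notebook text touched by this diff: the `mean` Aggregation describes a blockwise `chunk` step, a `combine` step, and a `finalize` step, while the median is computed by looping over groups in serial. The following is a minimal pure-Python sketch of both ideas, not flox's or numpy_groupies' implementation. All function names here (`chunk_sum_count`, `combine`, `finalize`, `grouped_median`) and the sample data are hypothetical, chosen only for illustration.

```python
import math
from statistics import median


def chunk_sum_count(group_idx, values, size):
    """Blockwise step: per-group sum and count for one chunk of data."""
    sums = [0.0] * size
    counts = [0] * size
    for g, v in zip(group_idx, values):
        sums[g] += v
        counts[g] += 1
    return sums, counts


def combine(intermediates):
    """Combine step: sum the per-chunk sums and sum the per-chunk counts."""
    size = len(intermediates[0][0])
    total_sums = [0.0] * size
    total_counts = [0] * size
    for sums, counts in intermediates:
        for g in range(size):
            total_sums[g] += sums[g]
            total_counts[g] += counts[g]
    return total_sums, total_counts


def finalize(sums, counts, fill_value=math.nan):
    """Final step: divide sum by count; empty groups get fill_value."""
    return [s / c if c else fill_value for s, c in zip(sums, counts)]


def grouped_median(group_idx, values, size, fill_value=math.nan):
    """Serial loop over groups, applying median to each group's members."""
    members = [[] for _ in range(size)]
    for g, v in zip(group_idx, values):
        members[g].append(v)
    return [median(m) if m else fill_value for m in members]


# Two chunks of (integer group codes, values); group 2 has no members.
chunk_a = ([0, 0, 1], [1.0, 2.0, 10.0])
chunk_b = ([0, 1, 1], [9.0, 20.0, 60.0])
size = 3

means = finalize(*combine([chunk_sum_count(*c, size) for c in (chunk_a, chunk_b)]))
medians = grouped_median(chunk_a[0] + chunk_b[0], chunk_a[1] + chunk_b[1], size)

print(means)    # [4.0, 30.0, nan]
print(medians)  # [2.0, 20.0, nan]
```

The mean can be assembled from per-chunk sums and counts, which is why it fits the chunk/combine/finalize pattern. A median cannot be recovered by combining per-chunk medians, so this sketch computes it only once all members of each group are available together.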