Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Add cummin() and cummax() functions for cumulative min/max calculation #3288

Merged
merged 27 commits into from
Jun 11, 2022

Conversation

samukweku
Copy link
Contributor

@samukweku samukweku commented May 30, 2022

WIP for #3279

@samukweku samukweku self-assigned this May 30, 2022
@oleksiyskononenko
Copy link
Contributor

Thanks @samukweku, I guess this could be easily templated to support both cummax() and cummin() at the same time. Look at how this is implemented for rowmax()/rowmin(), for example.

@oleksiyskononenko
Copy link
Contributor

oleksiyskononenko commented Jun 2, 2022

Look at this example. In the current implementation cummax() becomes zero once a missing value is encountered:

>>> from datatable import dt
>>> DT = dt.Frame([-1, None, 1])
>>> DT[:, dt.f.C0.cummax()]
   |    C0
   | int64
-- + -----
 0 |    -1
 1 |     0
 2 |     1
[3 rows x 1 column]

Once we fix the issues I've mentioned in the comments above the output will be

   |    C0
   | int64
-- + -----
 0 |    -1
 1 |    -1
 2 |     1
[3 rows x 1 column]

and that will be more consistent with the regular max() function.

@oleksiyskononenko
Copy link
Contributor

Btw, I've just pushed a fix for the sphinx build configuration. You should merge main onto your branch to make your build green.

samukweku and others added 8 commits June 2, 2022 16:05
Co-authored-by: oleksiyskononenko <35204136+oleksiyskononenko@users.noreply.github.com>
In sphinx config set language to `en` to silence the warning:
```
WARNING: Invalid configuration value found: 'language = None'. Update your configuration to a valid langauge code. Falling back to 'en' (English).
```
@samukweku
Copy link
Contributor Author

Thanks @samukweku, I guess this could be easily templated to support both cummax() and cummin() at the same time. Look at how this is implemented for rowmax()/rowmin(), for example.

thanks for the hint ... I'll give it a shot

@oleksiyskononenko
Copy link
Contributor

@samukweku the building error is not related to the changes on this PR, just some glitch with the jenkins filesystem:

[2022-06-04T00:58:55.741Z]     def test_fread_from_url2():
[2022-06-04T00:58:55.741Z]         import pathlib
[2022-06-04T00:58:55.741Z]         root = os.path.join(os.path.dirname(__file__), "..", "..")
[2022-06-04T00:58:55.741Z]         path = os.path.abspath(os.path.join(root, "LICENSE"))
[2022-06-04T00:58:55.741Z]         url = pathlib.Path(path).as_uri()
[2022-06-04T00:58:55.741Z]         d0 = dt.fread(url, sep="\n")
[2022-06-04T00:58:55.741Z]         frame_integrity_check(d0)
[2022-06-04T00:58:55.741Z] >       assert d0.shape == (372, 1)
[2022-06-04T00:58:55.741Z] E       assert (0, 0) == (372, 1)
[2022-06-04T00:58:55.741Z] E         At index 0 diff: 0 != 372
[2022-06-04T00:58:55.741Z] E         Full diff:
[2022-06-04T00:58:55.741Z] E         - (372, 1)
[2022-06-04T00:58:55.741Z] E         + (0, 0)

So just continue what you're doing, if the error persists, I will look into it.

@oleksiyskononenko oleksiyskononenko self-assigned this Jun 10, 2022
@oleksiyskononenko oleksiyskononenko added the new feature Feature requests for new functionality label Jun 10, 2022
@oleksiyskononenko oleksiyskononenko added this to the Release 1.1.0 milestone Jun 10, 2022
@oleksiyskononenko oleksiyskononenko added documentation test Add new tests, or fix existing tests labels Jun 10, 2022
@oleksiyskononenko
Copy link
Contributor

@samukweku Let me know if you have any questions on the templated implementation of cummin()/cummax() functions.

@oleksiyskononenko oleksiyskononenko changed the title [ENH] cumulative maximum implementation [ENH] Add cummin() and cummax() functions for cumulative min/max calculation Jun 10, 2022
@samukweku
Copy link
Contributor Author

thanks @oleksiyskononenko , I understand the template implementation. I'm sure I'll come back to it several times to study it more.

@samukweku
Copy link
Contributor Author

@oleksiyskononenko kindly have a look, when you can; I have updated the docs for cumin and added tests for f.cummin

@oleksiyskononenko oleksiyskononenko merged commit 4c1c031 into main Jun 11, 2022
@oleksiyskononenko oleksiyskononenko deleted the samukweku/cumulative/max branch June 11, 2022 04:34
oleksiyskononenko added a commit that referenced this pull request Jul 19, 2022
In #3288 we seem to miss datetime types. In this PR we add support for `date32` and `time64` types in `cummin()` and `cummax()` functions.

WIP for #3279
samukweku pushed a commit that referenced this pull request Aug 4, 2022
In #3288 we seem to miss datetime types. In this PR we add support for `date32` and `time64` types in `cummin()` and `cummax()` functions.

WIP for #3279
samukweku pushed a commit that referenced this pull request Aug 10, 2022
In #3288 we seem to miss datetime types. In this PR we add support for `date32` and `time64` types in `cummin()` and `cummax()` functions.

WIP for #3279
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation new feature Feature requests for new functionality test Add new tests, or fix existing tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants