-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement roll_monthday, simplify SemiMonthOffset #18762
Conversation
pandas/_libs/tslibs/offsets.pyx
Outdated
@@ -844,6 +841,42 @@ cpdef int get_day_of_month(datetime other, day_opt) except? -1: | |||
raise ValueError(day_opt) | |||
|
|||
|
|||
def _roll_monthday(n, other, compare): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be de-privatized
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you cpdef this? and type things, add a doc-string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you have 2 possibilities for types (they must be both either integers or dateimes), then have 2 functions and name it that way, much more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. One consideration that might push in the other direction if we want to get rid of duplicate code: roll_yearday
and roll_monthday
are basically special cases of roll_qtrday
with modby
of 12
and 1
, respectively. Merging them would mean doing some unnecessary mod
calls, no idea what the perf hit would be.
pandas/_libs/tslibs/offsets.pyx
Outdated
return n | ||
|
||
|
||
cpdef inline int roll_qtrday(other, n, month, day_opt='start', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
once have cleaned code, show perf benchmarks for this (and add if we don't have appropriate ones). |
Looks like SemiMonthOffset gets the slight bump we'd expect and everything else is noise.
|
Codecov Report
@@ Coverage Diff @@
## master #18762 +/- ##
==========================================
- Coverage 91.58% 91.57% -0.01%
==========================================
Files 150 150
Lines 48972 48933 -39
==========================================
- Hits 44851 44812 -39
Misses 4121 4121
Continue to review full report at Codecov.
|
pandas/_libs/tslibs/offsets.pyx
Outdated
---------- | ||
other : datetime or Timestamp | ||
n : number of periods to increment, before adjusting for rolling | ||
day_opt : 'start', 'end', 'business_start', 'business_end' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this does not match the signature
|
||
Parameters | ||
---------- | ||
other : datetime or Timestamp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs typing
Parameters | ||
---------- | ||
other : datetime or Timestamp | ||
n : number of periods to increment, before adjusting for rolling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does not match the signature
Travis error is in test_geopandas; will wait to re-push until given the go-ahead. |
I pushed a fix for geopandas you can rebase |
pandas/_libs/tslibs/offsets.pyx
Outdated
@@ -777,6 +773,7 @@ cpdef datetime shift_month(datetime stamp, int months, object day_opt=None): | |||
return stamp.replace(year=year, month=month, day=day) | |||
|
|||
|
|||
# TODO: Can we declare this so it will take datetime _or_ pandas_datetimestruct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no please don't functions should not take different types like this. have 2 functions. we specifically generate templates when we have to cover lots of dtypes (not suggesting this at all here though).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will remove this comment and address similar things you've mentioned.
pandas/_libs/tslibs/offsets.pyx
Outdated
---------- | ||
other : datetime or int | ||
n : number of periods to increment, before adjusting for rolling | ||
compare : datetime or int (must match `other`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would really prefer not to do this, have separate functions. (compare)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, am separating them now. With some luck (and possibly help to confirm that the complexity in the getOffsetOfMonth
methods is unneeded) we'll be able to get rid of the need to handle both cases in liboffsets.
@@ -2127,6 +2081,7 @@ def apply(self, other): | |||
n -= 1 | |||
elif n < 0 and other > current_easter: | |||
n += 1 | |||
# TODO: Why does this handle the 0 case the opposite of others? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea if this is intentional?
Timeout |
Will this be easier if split into smaller pieces? |
pandas/_libs/tslibs/offsets.pyx
Outdated
@@ -827,7 +823,55 @@ cpdef int get_day_of_month(datetime other, day_opt) except? -1: | |||
raise ValueError(day_opt) | |||
|
|||
|
|||
cpdef int roll_yearday(other, n, month, day_opt='start') except? -1: | |||
cpdef int _roll_convention(int other, int n, int compare): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not very intutive, nor should you generally be calling this as a public function of a cython routine. name this something else or create separate functions so compare is more readable here. I don't even mind repeating some code, as trying to shove multiple functions into the same one is simply a bad idea here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this comment. This is in reference to _roll_convention? I can rename it I guess, but I don't see how it could be separated into more specific functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see below
@@ -1122,21 +1097,21 @@ def rule_code(self): | |||
|
|||
@apply_wraps | |||
def apply(self, other): | |||
n = self.n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other : int, generally the day component of a datetime | ||
n : number of periods to increment, before adjusting for rolling | ||
compare : int, generally the day component of a datetime, in the same | ||
month as the datetime form which `other` was taken. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is very very confusing. you are using this in 2 separate ways. I would prefer 2 functions one for day one for month
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So is the issue with the docstring in roll_convention? or the name? I dont think the function itself can be broken down any further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this concern be ameliorated if roll_monthday
were removed? I expect all of the places where it is used are eventually going to be simplified/fixed to use roll_convention
(or something equivalent)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine, but you are using this two differen types so make
roll_monthday_interger
, roll_monthday_datetime
or whatever. have 1 function do 1 thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible the diff is obfuscating the fact that roll_monthday was split several commits back into roll_monthday
(which takes datetimes) and roll_convention
(which takes ints)?
pandas/_libs/tslibs/offsets.pyx
Outdated
@@ -827,7 +823,55 @@ cpdef int get_day_of_month(datetime other, day_opt) except? -1: | |||
raise ValueError(day_opt) | |||
|
|||
|
|||
cpdef int roll_yearday(other, n, month, day_opt='start') except? -1: | |||
cpdef int _roll_convention(int other, int n, int compare): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see below
This is becoming a blocker for fixing bugs in SemiMonthOffsets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see some testing roll_monthday
, roll_qtrday
, and roll_convention
like you have for roll_yearday in test_liboffsets.py
pandas/_libs/tslibs/offsets.pyx
Outdated
|
||
|
||
cpdef int roll_yearday(datetime other, int n, int month, | ||
object day_opt='start') except? -1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make not-optional
pandas/_libs/tslibs/offsets.pyx
Outdated
|
||
|
||
cpdef int roll_qtrday(datetime other, int n, int month, day_opt='start', | ||
int modby=3) except? -1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make day_opt non-optional
pandas/_libs/tslibs/offsets.pyx
Outdated
@@ -827,7 +823,55 @@ cpdef int get_day_of_month(datetime other, day_opt) except? -1: | |||
raise ValueError(day_opt) | |||
|
|||
|
|||
cpdef int roll_yearday(other, n, month, day_opt='start') except? -1: | |||
cpdef int _roll_convention(int other, int n, int compare): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good. nice tests!
only minor nitpick and can be fixed later (or here if you want). is that the signature for shift_months and the roll_* use slightly different terms, e.g.
shift_months(datetime stamp, int months)
while
roll_monthday(datetime other, int n)
so maybe make this consistent by making the args
datetime value, int n
(roll_convention
has an int first arg so can't really call this stamp
)
Definitely later. Some of this has crept into the scope of #18959, and |
ok, make a note of that, and rebase. ping on green. |
ping |
thanks! |
git diff upstream/master -u -- "*.py" | flake8 --diff