-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement business_start/end cases for shift_months #18489
Conversation
(will run asv shortly) |
pandas/_libs/tslibs/ccalendar.pxd
Outdated
|
||
from numpy cimport int64_t, int32_t | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we call this calendar.pxd
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid overlap with stdlib names. The thought behind ccalendar
is by analogy with chardet
--> cchardet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think that's actually a problem, I guess ccalendar is ok.
pandas/_libs/tslibs/ccalendar.pyx
Outdated
@cython.boundscheck(False) | ||
@cython.cdivision | ||
cdef int dayofweek(int y, int m, int d) nogil: | ||
"""Sakamoto's method, from wikipedia""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc-string
pandas/_libs/tslibs/ccalendar.pyx
Outdated
@cython.wraparound(False) | ||
@cython.boundscheck(False) | ||
cpdef monthrange(int64_t year, Py_ssize_t month): | ||
cdef: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc-string
pandas/_libs/tslibs/ccalendar.pyx
Outdated
|
||
|
||
cdef int is_leapyear(int64_t year) nogil: | ||
"""Returns 1 if the given year is a leap year, 0 otherwise.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc-string
pandas/_libs/tslibs/ccalendar.pxd
Outdated
from numpy cimport int64_t, int32_t | ||
|
||
|
||
cpdef monthrange(int64_t year, Py_ssize_t month) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so monthrange is actually used in a few places. changing to use this one? (similar with other methods you defined), e.g. its defined in tslib.pyx
now (as well as normalize_date
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is. I figured I'd way to update the other usages until after you pointed it out, keep the diff smaller for the first round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no I'd rather see the far reaching changes (if any) now. Ideally in a PR you change a small number of things, but clean them up everywhere (e.g. replace usages of monthrange). sure its not always possible, but then its not 'half-done'
pandas/_libs/tslibs/ccalendar.pyx
Outdated
# ---------------------------------------------------------------------- | ||
# Constants | ||
|
||
# Slightly more performant cython lookups than a 2D table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then add that the first 12 are non-leap years and second are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
OK, I've addressed all the comments locally, am running benchmarks (several times) and fixed the travis flake problem. Doing some extra cleanup and parametrization of the tests to make sure the affected offset methods are covered, will update later. |
@@ -929,8 +929,9 @@ def name(self): | |||
if self.isAnchored: | |||
return self.rule_code | |||
else: | |||
month = liboffsets._int_to_month[self.n] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prob should de-privatize these in offsets (_int_to_month
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
OK, just confirmed: the existing tests do not hit the new |
what is the perf issue here? |
I'm still running asvs to try to pin it down. AFAICT something in ccalendar must be less performant than the version in src/datetime
b244bab9 is after updating all the cimports as suggested, except for one in |
465c7e2
to
d3ff628
Compare
Codecov Report
@@ Coverage Diff @@
## master #18489 +/- ##
==========================================
- Coverage 91.33% 91.32% -0.02%
==========================================
Files 163 163
Lines 49801 49773 -28
==========================================
- Hits 45487 45454 -33
- Misses 4314 4319 +5
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #18489 +/- ##
===========================================
- Coverage 91.33% 40.77% -50.57%
===========================================
Files 163 163
Lines 49801 49796 -5
===========================================
- Hits 45487 20304 -25183
- Misses 4314 29492 +25178
Continue to review full report at Codecov.
|
Separating the ccalendar stuff from the new apply_index and onOffset methods. The newly-pushed commit should be an unambiguous win.
Running again. The 100x improvements are real. The slowdowns I expect to be noise. |
what path was BusinessMonthEnd for example taking before that made it a lot slower? all in python? |
The base class applyindex raises, at which point it falls back to point wise addition |
this needs a note for performance in whats for 0.22.0. ping when pushed as this is green already. |
add an additional line (even though you have one for #18218). in general any PR that is perf related should get an entry (or you can include that PR number on an existing entry). cleaning/reorg PR's that don't touch the user don't need ones. |
ping |
thanks! |
Unifies
apply_index
implementations for MonthEnd/MonthBegin, plus extends them to BMonthEnd and BMonthBegin.Unifies
onOffset
implementations for QuarterEnd/BQuarterEnd, plus extends them to QuarterBegin/BQuarterBegin.Implements a
cdef
version ofmonthrange
.