-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make DateOffset immutable #21341
make DateOffset immutable #21341
Conversation
Codecov Report
@@ Coverage Diff @@
## master #21341 +/- ##
=========================================
Coverage ? 91.91%
=========================================
Files ? 153
Lines ? 49550
Branches ? 0
=========================================
Hits ? 45546
Misses ? 4004
Partials ? 0
Continue to review full report at Codecov.
|
object.__setattr__(self, "_cache", {}) | ||
|
||
def __setattr__(self, name, value): | ||
raise AttributeError("DateOffset objects are immutable.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The standard approach is usually to use private attribute like _n
and provide access via a properties. Is there a reason why that wouldn't make sense here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, seems reasonable to me.
OK, its working again. |
@jbrockmendel ok pls rebase, I think you have some residual comments about normalize (your PR was merged so maybe need changing) |
@@ -304,6 +304,15 @@ class _BaseOffset(object): | |||
_day_opt = None | |||
_attributes = frozenset(['n', 'normalize']) | |||
|
|||
def __init__(self, n=1, normalize=False): | |||
n = self._validate_n(n) | |||
object.__setattr__(self, "n", n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just set these as regular attributes, I doubt this actually makes a signfiicant perf difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand. These are regular attributes, but the PR overrides __setattr__
so we have to use object.__setattr__
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh sorry you are right.
can you set these as cdef readonly? (and maybe type them) or not yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel what pickle issue are you referring to in that change? That just looks like a master tracker but not sure the exact issue contained within
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
n = self._validate_n(n) | ||
object.__setattr__(self, "n", n) | ||
object.__setattr__(self, "normalize", normalize) | ||
object.__setattr__(self, "_cache", {}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is _cache for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't create this explicitly in __init__
then @cache_readonly
lookups will try to create it, which will raise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you put this on the class? (or is that really weird)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pd.DateOffset._cache = {}
off = pd.DateOffset(n=1)
hour = pd.offsets.BusinessHour(n=2)
>>> off._cache is hour._cache
True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right so we can't use the @cache_readonly
decorator here? (or is that what you are doing by pre-emptively setting the cache)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cache_readonly.__get__
checks for a _cache
attribute and creates one if it does not exist. Creating it in __init__
ensures that one exists, so a new one does not have to be created (since attempting to do so would raise)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i.e. this is necessary in order to retain the existing usages of cache_readonly
on DateOffset
subclasses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok makes sense
@jbrockmendel ok this is ok. merge before your set_state/get_state change? does it matter? |
There will be a merge conflict I'll need to address. Since I'm holding out hope WillAyd can help make #18224 viable (and this unnecessary), I'd rather the the getstate/setstate one get merged first. |
ok dokie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs a whatsnew as now technically DateOffsets are immutable yes? do we have some direct tests for this?
added |
thanks @jbrockmendel |
Returning to the long-standing goal of making
DateOffset
s immutable (recall:DateOffset.__eq__
callsDateOffset._params
which is very slow._params
can't be cached ATM becauseDateOffset
is mutable`)Earlier attempts in this direction tried to make the base class a
cdef class
, but that has run intopickle
problems that I haven't been able to sort out so far. This PR goes the patch-__setattr__
route instead.Note: this PR does not implement the caching that is the underlying goal.
I'm likely to make some other PRs in this area, will try to keep them orthogonal.