Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement shift_quarters --> apply_index for quarters and years #18522

Merged
merged 7 commits into from
Nov 27, 2017

Conversation

jbrockmendel
Copy link
Member

@jbrockmendel jbrockmendel commented Nov 27, 2017

tslibs.offsets.shift_quarters should look like the the lovechild of offsets.shift_months and QuarterOffset.apply. It may be possible to de-duplicate some of that* at some point.

We get apply_index methods for all four YearOffset subclasses and all four QuarterOffset subclasses. (Previously there were implementations for the non-business versions, but they went through a code path that I found fragile). BeginMixin is no longer used, removed.

All 12 of the (non-custom) Month+Quarter+Year offset subclasses are now parametrized stubs with the actual implementations unified in parent classes.

Small cleanups and notes in offsets.

* It would be convenient to be able to specify a function to accept either a datetime or a pandas_datetimestruct in cases where they quack alike. I tried to do a ctypedef fused datetimelike but cython objected. Any thoughts?

asv continuous -f 1.1 -E virtualenv master HEAD -b offset
[...]
       before           after         ratio
     [262e8ff3]       [21504bfd]
-      18.4±0.9μs      16.7±0.06μs     0.91  offset.CBDay.time_custom_bday_apply_dt64
-        519±20ms        458±0.3ms     0.88  offset.SemiMonthOffset.time_end_apply_index
-        77.7±6ms      67.7±0.09ms     0.87  timeseries.ToDatetime.time_cache_false_with_dup_string_tzoffset_dates
-        542±20ms          459±2ms     0.85  offset.SemiMonthOffset.time_begin_incr_rng
-         209±3μs        132±0.2μs     0.63  offset.CBMonthBegin.time_custom_bmonthbegin_incr_n
-      21.5±0.1ms         6.82±1ms     0.32  offset.ApplyIndex.time_apply_series(<YearBegin: month=1>)
-      23.4±0.6ms      6.82±0.09ms     0.29  offset.ApplyIndex.time_apply_series(<QuarterBegin: startingMonth=3>)
-      22.0±0.5ms      5.17±0.09ms     0.23  offset.ApplyIndex.time_apply_index(<QuarterBegin: startingMonth=3>)
-      19.8±0.4ms       4.64±0.1ms     0.23  offset.ApplyIndex.time_apply_index(<YearBegin: month=1>)
-         487±1ms      8.13±0.06ms     0.02  offset.ApplyIndex.time_apply_series(<YearEnd: month=12>)
-        502±20ms      7.14±0.07ms     0.01  offset.ApplyIndex.time_apply_series(<QuarterEnd: startingMonth=3>)
-       494±0.4ms       6.02±0.2ms     0.01  offset.ApplyIndex.time_apply_index(<YearEnd: month=12>)
-        555±20ms       5.64±0.2ms     0.01  offset.ApplyIndex.time_apply_index(<QuarterEnd: startingMonth=3>)
-           1.52s         11.3±1ms     0.01  offset.ApplyIndex.time_apply_series(<BusinessYearEnd: month=12>)
-           1.41s       8.17±0.2ms     0.01  offset.ApplyIndex.time_apply_series(<BusinessYearBegin: month=1>)
-           1.82s       10.0±0.2ms     0.01  offset.ApplyIndex.time_apply_series(<BusinessQuarterEnd: startingMonth=3>)
-           1.47s       8.07±0.6ms     0.01  offset.ApplyIndex.time_apply_index(<BusinessQuarterEnd: startingMonth=3>)
-           1.57s       8.49±0.2ms     0.01  offset.ApplyIndex.time_apply_series(<BusinessQuarterBegin: startingMonth=3>)
-           1.66s      8.35±0.09ms     0.01  offset.ApplyIndex.time_apply_index(<BusinessYearEnd: month=12>)
-           1.53s       6.51±0.1ms     0.00  offset.ApplyIndex.time_apply_index(<BusinessQuarterBegin: startingMonth=3>)
-           1.63s      6.21±0.06ms     0.00  offset.ApplyIndex.time_apply_index(<BusinessYearBegin: month=1>)

Running a larger version now.

  • closes #xxxx
  • tests added / passed --> apply_index for all relevant classes, make sure to hit negative n
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@jbrockmendel
Copy link
Member Author

Travis error looks unrelated:

_______ TestDataFramePlots.test_parallel_coordinates_with_sorted_labels ________
[gw0] linux -- Python 3.6.3 /home/travis/miniconda3/envs/pandas/bin/python
self = <pandas.tests.plotting.test_misc.TestDataFramePlots object at 0x7ff86427ea20>
    def test_parallel_coordinates_with_sorted_labels(self):
        """ For #15908 """
        from pandas.plotting import parallel_coordinates
        df = DataFrame({"feat": [i for i in range(30)],
                        "class": [2 for _ in range(10)] +
                        [3 for _ in range(10)] +
                        [1 for _ in range(10)]})
        ax = parallel_coordinates(df, 'class', sort_labels=True)
        polylines, labels = ax.get_legend_handles_labels()
        color_label_tuples = \
            zip([polyline.get_color() for polyline in polylines], labels)
        ordered_color_label_tuples = sorted(color_label_tuples,
                                            key=lambda x: x[1])
        prev_next_tupels = zip([i for i in ordered_color_label_tuples[0:-1]],
                               [i for i in ordered_color_label_tuples[1:]])
        for prev, nxt in prev_next_tupels:
            # lables and colors are ordered strictly increasing
>           assert prev[1] < nxt[1] and prev[0] < nxt[0]
E           TypeError: '<' not supported between instances of 'list' and 'str'
pandas/tests/plotting/test_misc.py:221: TypeError

@gfyoung gfyoung added Enhancement Datetime Datetime data dtype labels Nov 27, 2017
@gfyoung
Copy link
Member

gfyoung commented Nov 27, 2017

Travis error looks unrelated:

Maybe, though as you know, we need all green to merge. I'll double-check that the error wasn't spurious.

@codecov
Copy link

codecov bot commented Nov 27, 2017

Codecov Report

Merging #18522 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18522      +/-   ##
==========================================
+ Coverage   91.32%   91.33%   +<.01%     
==========================================
  Files         163      163              
  Lines       49798    49780      -18     
==========================================
- Hits        45479    45465      -14     
+ Misses       4319     4315       -4
Flag Coverage Δ
#multiple 89.13% <100%> (+0.02%) ⬆️
#single 40.8% <33.33%> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/tseries/offsets.py 96.9% <100%> (-0.05%) ⬇️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.81% <0%> (-0.1%) ⬇️
pandas/core/indexes/datetimes.py 95.71% <0%> (+0.09%) ⬆️
pandas/plotting/_converter.py 65.25% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 262e8ff...a8c6a9c. Read the comment docs.

@codecov
Copy link

codecov bot commented Nov 27, 2017

Codecov Report

Merging #18522 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18522      +/-   ##
==========================================
+ Coverage   91.32%   91.33%   +<.01%     
==========================================
  Files         163      163              
  Lines       49798    49780      -18     
==========================================
- Hits        45479    45465      -14     
+ Misses       4319     4315       -4
Flag Coverage Δ
#multiple 89.13% <100%> (+0.02%) ⬆️
#single 40.8% <33.33%> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/tseries/offsets.py 96.9% <100%> (-0.05%) ⬇️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.81% <0%> (-0.1%) ⬇️
pandas/core/indexes/datetimes.py 95.71% <0%> (+0.09%) ⬆️
pandas/plotting/_converter.py 65.25% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 262e8ff...a8c6a9c. Read the comment docs.

@jbrockmendel
Copy link
Member Author

Not sure how it got to green given that I didn’t re-push...

@gfyoung
Copy link
Member

gfyoung commented Nov 27, 2017

Not sure how it got to green given that I didn’t re-push...

@jbrockmendel : I said "I'll double-check that the error wasn't spurious" i.e. I reran the build 😄

@jreback jreback added this to the 0.22.0 milestone Nov 27, 2017
@jreback
Copy link
Contributor

jreback commented Nov 27, 2017

lgtm. ready to go?

Copy link
Member

@gfyoung gfyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel
Copy link
Member Author

i.e. I reran the build

Neat, thanks.

@jbrockmendel
Copy link
Member Author

More thorough asv run:

taskset 2 asv continuous -f 1.1 -E virtualenv master HEAD -b offset -b timedelta -b period -b timeseries
[...]
       before           after         ratio
     [262e8ff3]       [21504bfd]
+     19.4±0.04μs       22.5±0.7μs     1.16  offset.SemiMonthOffset.time_end_incr_n
-           40.7s            32.7s     0.80  gil.nogil_datetime_fields.time_period_to_datetime
-        23.8±1ms       7.11±0.2ms     0.30  offset.ApplyIndex.time_apply_series(<QuarterBegin: startingMonth=3>)
-      22.5±0.7ms      6.39±0.07ms     0.28  offset.ApplyIndex.time_apply_series(<YearBegin: month=1>)
-      21.8±0.4ms       5.28±0.2ms     0.24  offset.ApplyIndex.time_apply_index(<QuarterBegin: startingMonth=3>)
-      21.1±0.5ms       4.67±0.1ms     0.22  offset.ApplyIndex.time_apply_index(<YearBegin: month=1>)
-         507±9ms       7.46±0.4ms     0.01  offset.ApplyIndex.time_apply_series(<QuarterEnd: startingMonth=3>)
-        549±40ms       7.82±0.1ms     0.01  offset.ApplyIndex.time_apply_series(<YearEnd: month=12>)
-         508±6ms      5.95±0.06ms     0.01  offset.ApplyIndex.time_apply_index(<YearEnd: month=12>)
-        513±10ms       5.49±0.1ms     0.01  offset.ApplyIndex.time_apply_index(<QuarterEnd: startingMonth=3>)
-           1.43s       10.0±0.3ms     0.01  offset.ApplyIndex.time_apply_series(<BusinessQuarterEnd: startingMonth=3>)
-           1.56s       10.3±0.2ms     0.01  offset.ApplyIndex.time_apply_series(<BusinessYearEnd: month=12>)
-           1.48s       8.57±0.2ms     0.01  offset.ApplyIndex.time_apply_index(<BusinessYearEnd: month=12>)
-           1.50s       8.51±0.2ms     0.01  offset.ApplyIndex.time_apply_series(<BusinessQuarterBegin: startingMonth=3>)
-           1.49s       7.91±0.3ms     0.01  offset.ApplyIndex.time_apply_index(<BusinessQuarterEnd: startingMonth=3>)
-           1.51s       7.87±0.2ms     0.01  offset.ApplyIndex.time_apply_series(<BusinessYearBegin: month=1>)
-           1.53s      6.76±0.08ms     0.00  offset.ApplyIndex.time_apply_index(<BusinessQuarterBegin: startingMonth=3>)
-           1.49s       6.13±0.2ms     0.00  offset.ApplyIndex.time_apply_index(<BusinessYearBegin: month=1>)

@jreback jreback merged commit 88ab693 into pandas-dev:master Nov 27, 2017
@jreback
Copy link
Contributor

jreback commented Nov 27, 2017

thanks!

@jbrockmendel jbrockmendel deleted the tslibs-offsets7 branch December 8, 2017 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants