Remove arg that is only ever used as NPY_UNSAFE_CASTING #18546

Merged
merged 1 commit into from
Nov 29, 2017

jbrockmendel
Member

Several functions from src/datetime are only ever called with the casting rule NPY_UNSAFE_CASTING. By getting rid of that dummy argument, the remaining code gets simplified quite a bit.

This PR removes that argument, then removes code that this renders unreachable or unused. It also removes several commented-out functions.

There are a couple of other never-used args; taking these one at a time.
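
To illustrate the kind of simplification described above, here is a minimal C sketch. It does not reproduce the actual code in src/datetime: the names `cast_rule`, `time_unit`, `can_cast_units`, `rescale`, `convert_before`, and `convert_after` are hypothetical stand-ins. It assumes a helper that consults a casting rule, and shows how the check and its error branch become unreachable once every caller passes the unsafe rule, so they can be dropped along with the argument.

```c
/* Hypothetical stand-ins; not the real NumPy/pandas definitions. */
typedef enum { CAST_SAFE, CAST_UNSAFE } cast_rule;
typedef enum { UNIT_S = 0, UNIT_MS = 1, UNIT_US = 2, UNIT_NS = 3 } time_unit;

/* Unsafe casting allows everything; safe casting only coarse -> fine. */
int can_cast_units(time_unit src, time_unit dst, cast_rule casting) {
    if (casting == CAST_UNSAFE) {
        return 1;
    }
    return src <= dst;
}

/* Rescale between units that differ by factors of 1000. */
long long rescale(long long value, time_unit src, time_unit dst) {
    while (src < dst) { value *= 1000; src = (time_unit)(src + 1); }
    while (src > dst) { value /= 1000; src = (time_unit)(src - 1); }
    return value;
}

/* Before: the casting argument forces a check and an error path. */
int convert_before(long long value, time_unit src, time_unit dst,
                   cast_rule casting, long long *out) {
    if (!can_cast_units(src, dst, casting)) {
        return -1;  /* never taken if every caller passes CAST_UNSAFE */
    }
    *out = rescale(value, src, dst);
    return 0;
}

/* After: with the argument gone, the check and error path go too. */
int convert_after(long long value, time_unit src, time_unit dst,
                  long long *out) {
    *out = rescale(value, src, dst);
    return 0;
}
```

When only the unsafe rule is ever passed, `convert_before(..., CAST_UNSAFE, ...)` and `convert_after(...)` produce the same result for every input, which is why the argument can be removed without changing behavior.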

@codecov

codecov bot commented Nov 28, 2017

Codecov Report

Merging #18546 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #18546      +/-   ##
==========================================
- Coverage   91.35%   91.33%   -0.02%     
==========================================
  Files         164      164              
  Lines       49802    49802              
==========================================
- Hits        45496    45487       -9     
- Misses       4306     4315       +9

Flag         Coverage Δ
#multiple    89.13% <ø> (ø) ⬆️
#single      40.81% <ø> (-0.07%) ⬇️

Impacted Files          Coverage Δ
pandas/io/gbq.py        25% <0%> (-58.34%) ⬇️
pandas/core/frame.py    97.81% <0%> (-0.1%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2a0e54b...02e4e8b.

@jreback
Contributor

jreback commented Nov 29, 2017

can you run an asv set on timeseries and see if anything changes.

@jreback jreback added Clean Datetime Datetime data dtype labels Nov 29, 2017
@@ -444,16 +272,6 @@ int parse_iso_8601_datetime(char *str, int len, PANDAS_DATETIMEUNIT unit,
*out_special = 1;
}
Contributor

do we need an assertion here that unit is FR_ns?

Member Author

To double-check, what line are you referring to? The answer is "no" regardless, but the reason why may vary.

Contributor

parse_iso_8601_datetime takes a unit arg; is it validated that this is only ever FR_ns?

Member Author

OK. The answer is "no", for two reasons: 1) the unit arg is no longer used once we've gotten rid of the casting arg (so it'll be removed in a follow-up), and 2) parse_iso_8601_datetime is only ever called from src/datetime.pxd, and in that one case it is called with PANDAS_FR_ns.
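
A small C sketch of the first point, under stated assumptions: `parse_year` and `time_unit` are hypothetical and far simpler than `parse_iso_8601_datetime`. It assumes, as described above, that the unit was only consulted by the casting check. Once that check is gone, the parameter has no effect on the result, so an assertion on it would guard nothing.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical unit enum; not the real PANDAS_DATETIMEUNIT. */
typedef enum { UNIT_Y, UNIT_NS } time_unit;

/* State after the casting argument is dropped: `unit` is still in the
 * signature but no longer influences parsing. */
int parse_year(const char *str, size_t len, time_unit unit, int *out_year) {
    (void)unit;  /* unused; kept only until a follow-up removes it */
    if (len < 4) {
        return -1;
    }
    int year = 0;
    for (size_t i = 0; i < 4; ++i) {
        if (!isdigit((unsigned char)str[i])) {
            return -1;
        }
        year = year * 10 + (str[i] - '0');
    }
    *out_year = year;
    return 0;
}
```

Calling it with different `unit` values yields identical results, which is the sense in which the argument is dead and can be removed later.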

Contributor

OK, that's what I figured. If you are removing it, then this is fine.

@jbrockmendel
Member Author

> can you run an asv set on timeseries and see if anything changes.

Starting now.

@jbrockmendel
Member Author

taskset 2 asv continuous -f 1.1 -E virtualenv master HEAD -b timeseries
[...]
       before           after         ratio
     [2a0e54bc]       [02e4e8bf]
+      19.0±0.1ms            135ms     7.10  timeseries.DatetimeIndex.time_dti_tz_factorize
+        54.9±5ms         117±20ms     2.12  timeseries.DatetimeIndex.time_dti_factorize
+           1.61s            1.93s     1.20  timeseries.AsOfDataFrame.time_asof
-     2.49±0.01ms      2.20±0.02ms     0.88  timeseries.ToDatetime.time_cache_true_with_dup_string_dates
-      7.85±0.4μs      6.75±0.04μs     0.86  timeseries.DatetimeIndex.time_timestamp_tzinfo_cons
-      25.3±0.6μs       20.2±0.2μs     0.80  offset.Day.time_timeseries_day_incr
-           1.83s        1.12±0.2s     0.61  timeseries.AsOfDataFrame.time_asof_nan

I've gotten used to completely ignoring asof and dti_factorize. Re-running anyway, but this looks like noise.

@jbrockmendel
Member Author

       before           after         ratio
     [2a0e54bc]       [02e4e8bf]
+     7.81±0.07ms         23.8±3ms     3.05  timeseries.DatetimeIndex.time_dti_tz_factorize
+         389±2μs         444±20μs     1.14  timeseries.DatetimeIndex.time_reset_index
-         134±8μs        113±0.1μs     0.85  timeseries.DatetimeIndex.time_unique
-         162±2ms          137±2ms     0.85  timeseries.ToDatetime.time_iso8601_tz_spaceformat
-        27.6±2ms       22.4±0.2ms     0.81  timeseries.DatetimeIndex.time_to_pydatetime

@jbrockmendel
Member Author

       before           after         ratio
     [2a0e54bc]       [02e4e8bf]
+        62.1±2ms         101±30ms     1.63  timeseries.AsOfDataFrame.time_asof_nan
+     1.47±0.01ms      1.77±0.03ms     1.20  timeseries.ToDatetime.time_cache_false_with_dup_string_dates_and_format
+      13.3±0.4μs       15.2±0.1μs     1.14  offset.YearBegin.time_timeseries_year_incr
-      19.8±0.1μs       17.6±0.2μs     0.89  timeseries.AsOf.time_asof_single_early
-      3.96±0.2ms      3.44±0.03ms     0.87  timeseries.ToDatetime.time_cache_true_with_unique_seconds_and_unit
-      2.00±0.1ms         1.69±0ms     0.84  timeseries.DatetimeIndex.time_add_timedelta

@jbrockmendel
Member Author

There are a few of these; hoping to get them out of the way before doing away with datetime.pxd

@jreback jreback added this to the 0.22.0 milestone Nov 29, 2017
@jreback jreback merged commit d3c3c2b into pandas-dev:master Nov 29, 2017
@jreback
Contributor

jreback commented Nov 29, 2017

thanks!

Labels
Clean Datetime Datetime data dtype