
Speed improvement for roll_median #44

Closed
roaldarbol opened this issue Sep 19, 2024 · 7 comments

roaldarbol commented Sep 19, 2024

Hi @jasonjfoster!
Thanks for this amazing package first of all! I'm doing a bunch of benchmarks (I'll update once I've published them as a blog post), and roll is right up there with data.table for rolling mean and sum, and the best for min and max.

So I was surprised to see that it actually is rather slow for the rolling median - slower even than zoo for larger window widths. I've also tested the RollingWindow package, which is the fastest (an order of magnitude faster than roll), and the only one that seems not to increase time with larger window widths, so maybe it's possible to get some inspiration from their implementation to improve the speed? The package has been abandoned seemingly, and can only be installed from Github. I'm aware that it relies on roll_quantile which doesn't have an online implementation, which explains why it's slower, but it still surprised me.

I really like your philosophy of not making rolling functions for a ton of different things, but focusing on ensuring the ones you make are fast, which is why I thought it might be worthwhile exploring a re-implementation of roll_median as it's such a common function (also used in a lot of smoothing, e.g. why it's in robfilter).

Here's the reprex of my test:

library(tibble)
library(dplyr)
library(ggplot2)
library(tinytable)
library(tidyr)
library(microbenchmark)

library(zoo)
library(RollingWindow)
library(RcppRoll)
library(roll)
library(robfilter)

df <- tibble(x = rnorm(1000000))
windows <- c(5, 11, 15, 21, 31, 41, 51, 61, 71, 81, 91, 101, 201, 301)

df_median <- tibble()
for (n in windows){
  n_half <- floor(n/2)
  res <- microbenchmark(
    "RollingWindow::RollingMedian" = RollingWindow::RollingMedian(df$x, n),
    "zoo::rollmedian" = zoo::rollmedian(df$x, k = n, fill = NA),
    "roll::roll_median" = roll::roll_median(df$x, width = n),
    "RcppRoll::roll_median" = RcppRoll::roll_median(df$x, n = n, fill = NA),
    "robfilter::med.filter" = robfilter::med.filter(df$x, n, online = TRUE),
    times = 5) |> 
    mutate(window = n)
  
  df_median <- dplyr::bind_rows(df_median, res)
}

# Plot it
df_median_summarised <- df_median |> 
  group_by(expr, window) |>
  summarise(time = median(time) / 1000000)
#> `summarise()` has grouped output by 'expr'. You can override using the
#> `.groups` argument.
df_median |> 
  mutate(time = time / 1000000) |> 
  ggplot(aes(window, time, colour =  expr)) +
  geom_jitter(alpha = 0.1) +
  geom_line(data = df_median_summarised)

df_median_summarised |> 
  pivot_wider(id_cols = window,
              names_from = expr,
              values_from = time) |> 
  tt()
Median times in milliseconds:

| window | RollingWindow::RollingMedian | zoo::rollmedian | roll::roll_median | RcppRoll::roll_median | robfilter::med.filter |
|-------:|-----------------------------:|----------------:|------------------:|----------------------:|----------------------:|
| 5 | 42.61039 | 371.7149 | 21.73139 | 90.1882 | 1219.056 |
| 11 | 61.25106 | 350.1357 | 52.76195 | 228.6518 | 1213.023 |
| 15 | 56.37555 | 382.3029 | 68.21748 | 308.9450 | 1444.169 |
| 21 | 87.25495 | 435.8322 | 155.37194 | 470.8706 | 1526.992 |
| 31 | 58.60071 | 369.7318 | 290.60045 | 768.0161 | 1574.961 |
| 41 | 72.95761 | 350.7423 | 391.01673 | 1237.7658 | 2014.765 |
| 51 | 72.24422 | 354.0627 | 509.26535 | 1567.5037 | 2518.410 |
| 61 | 83.62925 | 451.1759 | 537.93466 | 1892.1354 | 2167.003 |
| 71 | 74.74441 | 403.2428 | 710.35575 | 2200.2911 | 2344.463 |
| 81 | 93.68993 | 472.4890 | 997.73558 | 3088.5824 | 2430.890 |
| 91 | 65.75451 | 459.3611 | 959.18494 | 3328.7882 | 2810.957 |
| 101 | 81.16433 | 411.2182 | 1124.10512 | 4231.1901 | 3041.096 |
| 201 | 78.60651 | 448.1946 | 2212.82532 | 7476.5215 | 3885.436 |
| 301 | 83.75928 | 451.0300 | 3043.77916 | 12806.6676 | 5035.164 |

Created on 2024-09-19 with reprex v2.1.1

@jasonjfoster (Owner) commented:

Thanks for the feedback. I recently implemented efficient algorithms for the roll_median and roll_quantile functions in the development version 1.1.8 on GitHub:

# install.packages("devtools")
devtools::install_github("jasonjfoster/roll")

Also available on r-universe:

install.packages("roll", repos = "https://jasonjfoster.r-universe.dev")

Could you rerun the benchmarks with version 1.1.8?


roaldarbol commented Sep 19, 2024

Great to hear! I just re-ran the tests, and it's definitely a lot faster for the larger windows (though slower for the smaller windows). It's still about 8 times slower than RollingWindow::RollingMedian unfortunately, and generally a bit slower than zoo::rollmedian.

library(tibble)
library(dplyr)
library(ggplot2)
library(tinytable)
library(tidyr)
library(microbenchmark)

library(zoo)
library(RollingWindow)
library(RcppRoll)
library(roll)
library(robfilter)
packageVersion("roll")
#> [1] '1.1.8'

df <- tibble(x = rnorm(1000000))
windows <- c(5, 11, 15, 21, 31, 41, 51, 61, 71, 81, 91, 101, 201, 301)

df_median <- tibble()
for (n in windows){
  n_half <- floor(n/2)
  res <- microbenchmark(
    "RollingWindow::RollingMedian" = RollingWindow::RollingMedian(df$x, n),
    "zoo::rollmedian" = zoo::rollmedian(df$x, k = n, fill = NA),
    "roll::roll_median" = roll::roll_median(df$x, width = n),
    "RcppRoll::roll_median" = RcppRoll::roll_median(df$x, n = n, fill = NA),
    "robfilter::med.filter" = robfilter::med.filter(df$x, n, online = TRUE),
    times = 5) |> 
    mutate(window = n)
  
  df_median <- dplyr::bind_rows(df_median, res)
}

# Plot it
df_median_summarised <- df_median |> 
  group_by(expr, window) |>
  summarise(time = median(time) / 1000000)
#> `summarise()` has grouped output by 'expr'. You can override using the
#> `.groups` argument.
df_median |> 
  mutate(time = time / 1000000) |> 
  ggplot(aes(window, time, colour =  expr)) +
  geom_jitter(alpha = 0.1) +
  geom_line(data = df_median_summarised)

df_median_summarised |> 
  pivot_wider(id_cols = window,
              names_from = expr,
              values_from = time) |> 
  tt()
Median times in milliseconds:

| window | RollingWindow::RollingMedian | zoo::rollmedian | roll::roll_median | RcppRoll::roll_median | robfilter::med.filter |
|-------:|-----------------------------:|----------------:|------------------:|----------------------:|----------------------:|
| 5 | 46.69491 | 399.5131 | 299.6766 | 94.96266 | 1218.382 |
| 11 | 56.33853 | 380.3425 | 352.0274 | 231.49082 | 1283.153 |
| 15 | 62.86988 | 367.6014 | 375.6445 | 314.44852 | 1317.276 |
| 21 | 53.34275 | 344.6529 | 405.4176 | 457.00026 | 1501.322 |
| 31 | 62.57914 | 345.8289 | 453.3822 | 763.11941 | 1605.367 |
| 41 | 54.52699 | 402.9282 | 468.4996 | 1050.98745 | 1715.083 |
| 51 | 74.19872 | 371.9451 | 493.9300 | 1398.44475 | 1829.274 |
| 61 | 55.09602 | 383.8024 | 517.2597 | 1806.74099 | 1924.188 |
| 71 | 60.55978 | 324.0459 | 521.4181 | 2150.97881 | 2107.127 |
| 81 | 62.74747 | 443.4228 | 516.2781 | 2520.77028 | 2229.549 |
| 91 | 60.94327 | 389.7431 | 514.7131 | 2868.48722 | 2274.508 |
| 101 | 71.25430 | 366.4016 | 522.1390 | 3259.35205 | 2456.508 |
| 201 | 70.54210 | 408.8964 | 569.4769 | 7386.22923 | 3774.311 |
| 301 | 74.42444 | 431.5904 | 585.9865 | 11901.81444 | 4849.190 |
sessionInfo()
#> R version 4.4.0 (2024-04-24)
#> Platform: x86_64-apple-darwin20
#> Running under: macOS 15.0
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Europe/Copenhagen
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] robfilter_4.1.5      lattice_0.22-6       MASS_7.3-60.2       
#>  [4] robustbase_0.99-4    roll_1.1.8           RcppRoll_0.3.1      
#>  [7] RollingWindow_0.2    zoo_1.8-12           microbenchmark_1.5.0
#> [10] tidyr_1.3.1          tinytable_0.4.0      ggplot2_3.5.1       
#> [13] dplyr_1.1.4          tibble_3.2.1        
#> 
#> loaded via a namespace (and not attached):
#>  [1] gtable_0.3.5       compiler_4.4.0     tidyselect_1.2.1   reprex_2.1.1      
#>  [5] Rcpp_1.0.13        scales_1.3.0       yaml_2.3.10        fastmap_1.2.0     
#>  [9] R6_2.5.1           labeling_0.4.3     generics_0.1.3     knitr_1.48        
#> [13] munsell_0.5.1      pillar_1.9.0       rlang_1.1.4        utf8_1.2.4        
#> [17] xfun_0.47          fs_1.6.4           RcppParallel_5.1.9 cli_3.6.3         
#> [21] withr_3.0.1        magrittr_2.0.3     digest_0.6.37      grid_4.4.0        
#> [25] rstudioapi_0.16.0  lifecycle_1.0.4    DEoptimR_1.1-3     vctrs_0.6.5       
#> [29] evaluate_0.24.0    glue_1.7.0         farver_2.1.2       fansi_1.0.6       
#> [33] colorspace_2.1-1   rmarkdown_2.28     purrr_1.0.2        tools_4.4.0       
#> [37] pkgconfig_2.0.3    htmltools_0.5.8.1

Created on 2024-09-19 with reprex v2.1.1

@jasonjfoster (Owner) commented:

Thanks for sharing the results. The RollingWindow::RollingMedian function uses a max-median-min heap structure in C (see https://gist.github.com/ashelly/5665911), and the zoo::rollmedian version (note: the default is align = "center", but align = "right", or zoo::rollmedianr, should be used for comparison purposes) is an interface to stats::runmed, which uses the efficient median implementations by Turlach and Stuetzle. Both are high-performance algorithms, but they are specialized for median calculations and don't support weights, missing values, or other quantiles.

In contrast, the roll::roll_quantile function is more flexible and uses two balanced multisets split at the quantile's probability. It handles weights, missing values, and other quantiles within the window. There is also some overhead due to parallel support via RcppParallel. It's great to see that the recent optimizations have eliminated the scaling issues with window size and that the roll package now performs on par with stats in the benchmark. However, after speaking with the data.table team, I learned that it's good practice to create benchmarks that take at least one second to run (see https://cloud.r-project.org/web/packages/data.table/vignettes/datatable-benchmarking.html#avoid-microbenchmark-times-100; note: microbenchmark's time value is measured in nanoseconds, so divide by 1e+9 for seconds). It might be challenging to break the one-second barrier with these highly efficient algorithms!
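For readers unfamiliar with these structures, here is a minimal sketch of the two-heap idea behind running-median algorithms, in Python for illustration only (the implementations linked above are in C/C++, and this sketch handles only insertion, not the window-eviction step a true rolling median needs):

```python
import heapq

class StreamingMedian:
    """Maintain the median of a growing stream with two heaps.

    lo is a max-heap (stored as negated values) holding the lower half;
    hi is a min-heap holding the upper half. Invariant: len(lo) equals
    len(hi) or len(hi) + 1, and max(lo) <= min(hi).
    """
    def __init__(self):
        self.lo = []  # max-heap via negation
        self.hi = []  # min-heap

    def insert(self, x):
        if not self.lo or x <= -self.lo[0]:
            heapq.heappush(self.lo, -x)
        else:
            heapq.heappush(self.hi, x)
        # rebalance so the size invariant holds
        if len(self.lo) > len(self.hi) + 1:
            heapq.heappush(self.hi, -heapq.heappop(self.lo))
        elif len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def median(self):
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2
```

Each insertion is O(log n), which (once eviction is added) is what turns the naive O(w log w) per-window cost into something that barely grows with window width, matching the flat scaling curves in the benchmarks above.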


roaldarbol commented Sep 25, 2024

Sorry about the delay in getting back to you. 😊
I completely agree that your implementation is much more flexible. And I also agree that the most important thing, namely the scaling issue, is fixed, so I'm happy to close the issue. Just one more question: you mention the alignment; is it technically possible to implement center- and left-aligned rolls with your implementation? The reason I'm asking is that a common use of rolling medians is as a smoothing filter in time series, where the filter is usually expected to be centered to avoid a time shift. For left alignment, it could just be a matter of running the right-aligned version on the reversed vector and returning the re-reversed result. For center alignment, it could maybe be done simply by shifting the returned vector by floor(width/2)?
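The two tricks described above can be sketched generically, in Python for illustration only (this is not roll's API; `roll_median_right` stands in for any right-aligned rolling function, with `None` playing the role of NA padding):

```python
from statistics import median

def roll_median_right(x, width):
    """Naive right-aligned rolling median: out[i] is the median of
    x[i-width+1 : i+1]; the first width-1 positions are None (NA)."""
    out = [None] * len(x)
    for i in range(width - 1, len(x)):
        out[i] = median(x[i - width + 1 : i + 1])
    return out

def roll_median_center(x, width):
    """Centered alignment: shift the right-aligned result back by
    floor(width/2), so out[i] covers the window centered on i."""
    shift = width // 2
    return roll_median_right(x, width)[shift:] + [None] * shift

def roll_median_left(x, width):
    """Left alignment: run the right-aligned version on the reversed
    vector and reverse the result."""
    return roll_median_right(x[::-1], width)[::-1]
```

For odd widths the shifted result matches a truly centered window (up to the NA padding at the edges); for even widths the "center" is ambiguous and conventions differ between packages.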

Regarding benchmarking, I don't quite agree. The benchmarks I ran were on 1,000,000 observations, which is not a small number, and most users likely run on fewer observations, but across multiple groups and datasets. Sure, I could run the benchmark on 1,000,000,000 observations, but I think the smaller number is more realistic here (the page you refer to also notes that the "Main focus of benchmark should be on real use case scenarios"). I'd argue that both types of benchmarks offer different kinds of insight. And there's no need to run a ton of iterations if the difference is clear (i.e. the signal-to-noise ratio is good), but it certainly doesn't hurt either.

As an aside, it would be great to have an example in the README explaining the different ways roll handles NAs, as it's unclear from the README alone; I personally had to dig through issues and PRs to understand it.

By the way, I hope this doesn't come across wrong - it's only because I find roll to be an amazing project that I'm thrilled to have come across!

@jasonjfoster (Owner) commented:

Agreed on benchmarks and the need to consider the practicality of real use cases. This further illustrates the absolute speed improvements of the algorithms, rather than just relative gains.

For the align request, please refer to the existing discussion and suggestions in this thread: #38.

Closing this issue now as the benchmarks have demonstrated speed improvements. Feel free to share the blog post whenever it’s ready.

@roaldarbol (Author) commented:

I've written the first version of the blog post and will update it along the way. E.g. I currently don't test scaling as I did above (I'll add that next week), and I'll also try to add a table that covers the features of the different packages. Feel free to comment or open an issue if there's something you feel should be added or taken into consideration that currently isn't. :-)
