Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

load_forecasts_repo(models = NULL) returns a data.frame where row order depends on locale #291

Closed
Bisaloo opened this issue Aug 11, 2021 · 0 comments
Assignees

Comments

@Bisaloo
Copy link
Contributor

Bisaloo commented Aug 11, 2021

You might decide that this is not your problem and the user is in charge of ensuring locale consistency across platforms and this would be a fair answer but I figured I could still open an issue just in case.
As illustrated in the reprex below, load_forecasts() output is locale-dependent because it is ordered according to the model column (when the models argument is absent from the load_forecasts() call) and this order change depending on the locale.
This causes unnecessary diffs (e.g., european-modelling-hubs/covid19-forecast-hub-europe_archive@be9803c) and potential confusion ("why am not getting the exact same result?") when contributors from different regions of the world / using different locales run the same piece of code.

I think an easy fix would be to change:

all_valid_models <- list.dirs(file_path, full.names = FALSE)

to

  all_valid_models <- sort(list.dirs(file_path, full.names = FALSE), method = "radix")

which always sorts according to a C locale and is thus locale-independent.

Reprex:

library(covidHubUtils)
library(dplyr)

withr::with_locale(c(LC_COLLATE = "fr_FR.UTF-8"), {
  d <- load_forecasts(
    source = "local_hub_repo",
    hub_repo_path = here::here(),
    hub = "ECDC"
  ) 
})

d %>% digest::digest()
#> [1] "e6329c2844d0ca6aa3e57c020b9509f8"
d %>%
  filter(model %in% c("MUNI-ARIMA", "MUNI_DMS-SEIAR")) %>%
  pull(model) %>%
  unique()
#> [1] "MUNI_DMS-SEIAR" "MUNI-ARIMA"


withr::with_locale(c(LC_COLLATE = "C"), {
  d <- load_forecasts(
    source = "local_hub_repo",
    hub_repo_path = here::here(),
    hub = "ECDC"
  ) 
})

d %>% digest::digest()
#> [1] "1a811419d52bb2d632b287ab88ffc7fd"

d %>%
  filter(model %in% c("MUNI-ARIMA", "MUNI_DMS-SEIAR")) %>%
  pull(model) %>%
  unique()
#> [1] "MUNI-ARIMA"     "MUNI_DMS-SEIAR"
@Bisaloo Bisaloo changed the title load_forecasts_repo() returns a data.frame where row order depends on locale load_forecasts_repo(models = NULL) returns a data.frame where row order depends on locale Aug 11, 2021
@Serena-Wang Serena-Wang self-assigned this Aug 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants