Inconsistent error when downloading records #180

Closed
shandiya opened this issue Jan 23, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@shandiya
Contributor

Downloading records with atlas_occurrences() sometimes fails, but the failure is not consistently reproducible. Sometimes the same query downloads successfully when it is re-run a few minutes later, and sometimes it does not download at all. This makes it unclear whether the error is related to the size of the query.

galah version
galah_1.5.1

library(galah)  
library(purrr)  
library(arrow)  
library(dplyr)  

galah_config(email = Sys.getenv("email"), verbose = TRUE)

y <- tibble(name =
                c("id",
                  "lft",
                  "rgt", 
                  "speciesID",
                  "taxonConceptID",
                  "kingdom",
                  "phylum",
                  "class",
                  "order",
                  "family",
                  "genus",
                  "species",
                  "scientificName",
                  "vernacularName",
                  "infraspecificEpithet",
                  "taxonRank",
                  "decimalLatitude",
                  "decimalLongitude",
                  "basisOfRecord",
                  "year",
                  "eventID",
                  "dataResourceUid",
                  "dataResourceName",
                  "samplingProtocol",
                  "cl111032",
                  "cl11032",
                  "cl111033",
                  "cl11033",
                  "cl10902",
                  "cl10000",
                  "cl22",
                  "cl1048",
                  "cl966"), 
              type = "field")

attr(y, "call") <- "galah_select"

x <- galah_call() |>
  galah_apply_profile(ALA) |>
  galah_filter(year == 2019,
               decimalLatitude != "",
               decimalLongitude != "",
               speciesID != "")

x$select <- y

x |>
  atlas_occurrences() |>
  write_parquet(sink = "data/galah/occ_2019")

Sometimes the download is successful, and sometimes the output is:

This query will return 5,295,950 records

Checking queue
Current queue size: 1 inqueue ... running .......................................Error: need one of url or handle
@shandiya shandiya added the bug Something isn't working label Jan 23, 2023
@fontikar

I am getting this same error too!

lampromicra <- galah_call() %>%
  galah_identify("Lampromicra aerea", "Lampromicra senator") %>%
  galah_filter(year > 1950 & year <= 2022) %>%
  galah_select(species) %>%
  atlas_occurrences()
This query will return 820 records

Checking queue
Current queue size: 4 inqueue  running .........................................Error: need one of url or handle

@daxkellie
Collaborator

This error should be fixed by the new collapse(), compute() & collect() architecture in galah 2.0.0. Users can now send a query to the API with compute(), then download the result later with collect(). This should avoid the time-out error that was being hit sporadically. For example:

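library(galah)

# A registered email is needed for occurrence downloads,
# set with galah_config() as in the original report

# Create and send query to be calculated server-side
request <- request_data("occurrences") |>
  identify("perameles") |>
  filter(year > 1900) |>
  compute()

# Download data once the server has finished processing
request |>
  collect()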
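
If a download still fails now and then, re-running it after a short wait (as described in the original report) can be wrapped in a small helper. This is only a rough sketch in base R: retry() and its arguments max_tries and wait_secs are hypothetical names, not part of galah, and it simply calls a function again after an error.

```r
# Hypothetical helper, not part of galah: call `f()` up to `max_tries` times,
# sleeping `wait_secs` seconds between failed attempts.
retry <- function(f, max_tries = 3, wait_secs = 60) {
  stopifnot(is.function(f), max_tries >= 1, wait_secs >= 0)
  for (attempt in seq_len(max_tries)) {
    # Capture an error as a condition object instead of aborting
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) {
      Sys.sleep(wait_secs)
    }
  }
  # All attempts failed: re-signal the last error
  stop(result)
}

# Assumed usage with the `request` object created by compute() above:
# occurrences <- retry(function() collect(request), max_tries = 5, wait_secs = 120)
```

The wait and the number of attempts are guesses; large queries may need longer between tries.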
