A memory bug fix
MarekGierlinski committed Jan 8, 2025
1 parent cea441f commit c494606
Showing 3 changed files with 19 additions and 12 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: fenr
Title: Fast functional enrichment for interactive applications
-Version: 1.3.1
+Version: 1.5.1
Authors@R: person(
given = "Marek",
family = "Gierlinski",
@@ -53,6 +53,6 @@ Suggests:
knitr,
rmarkdown,
topGO
-RoxygenNote: 7.3.1
+RoxygenNote: 7.3.2
VignetteBuilder: knitr
LazyData: false
3 changes: 3 additions & 0 deletions NEWS.md
@@ -171,3 +171,6 @@

- GO term namespace added to the information extracted by `fetch_go`.

+## Version 1.4.1
+
+- Attempted to fix a bizarre error on Bioconductor's test machines running older versions of macOS. Windows and Linux are not affected, and my laptop running macOS Sequoia 15.2 does not show the error. I suspect a memory leak on older systems. The error `vector memory limit of 64.0 Gb reached, see mem.maxVSize()` occurred in `parse_kegg_genes()`, a flat-file parser for KEGG, around the call to `tidyr::separate()`, which I replaced with an alternative approach. It remains to be seen whether this fixes the error.
24 changes: 14 additions & 10 deletions R/kegg.R
@@ -96,16 +96,20 @@ parse_kegg_genes <- function(s) {
i <- i + 1
}

-# create final tibble, attempt to extract gene symbols when semicolon is found
-genes |>
-  tibble::as_tibble_col(column_name = "data") |>
-  tidyr::separate(data, c("gene_id", "gene_symbol"), sep = "\\s+", extra = "merge") |>
-  dplyr::mutate(gene_symbol = dplyr::if_else(
-    stringr::str_detect(gene_symbol, ";"),
-    stringr::str_remove(gene_symbol, ";.+$"),
-    gene_id
-  )) |>
-  tibble::add_column(term_id = pathway)
+purrr::map(genes, function(gene) {
+  # First element - gene ID, second - gene symbol, if contains semicolon
+  v <- stringr::str_split_1(gene, "\\s+")
+  tibble::tibble(
+    gene_id = v[1],
+    gene_symbol = ifelse(
+      stringr::str_detect(v[2], ";"),
+      stringr::str_remove(v[2], ";.*$"),
+      v[1]
+    ),
+    term_id = pathway
+  )
+}) |>
+  purrr::list_rbind()
}) |>
purrr::list_rbind()
}
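
The per-gene parsing introduced in this hunk can be tried in isolation. The following is a minimal sketch, not the package's actual code path: the wrapper name `parse_gene_lines()`, the sample input lines and the pathway identifier are all made up for illustration, and the sketch assumes stringr >= 1.5.0 (for `str_split_1()`), purrr >= 1.0.0 (for `list_rbind()`) and R >= 4.1 (for the native pipe).

```r
# Hypothetical sample lines in the style of a KEGG flat-file GENE section
# (illustrative only, not real KEGG output)
genes <- c(
  "10327  AKR1A1; aldo-keto reductase family 1 member A1",
  "124  ADH1A; alcohol dehydrogenase 1A",
  "9999  hypothetical protein"
)
pathway <- "hsa00010" # assumed pathway identifier

# Hypothetical wrapper around the logic added in this commit
parse_gene_lines <- function(genes, pathway) {
  purrr::map(genes, function(gene) {
    # Split on whitespace: first token is the gene ID, second is the
    # gene symbol if it ends with a semicolon
    v <- stringr::str_split_1(gene, "\\s+")
    tibble::tibble(
      gene_id = v[1],
      gene_symbol = ifelse(
        stringr::str_detect(v[2], ";"),
        stringr::str_remove(v[2], ";.*$"),
        v[1] # no semicolon: fall back to the gene ID
      ),
      term_id = pathway
    )
  }) |>
    purrr::list_rbind()
}

res <- parse_gene_lines(genes, pathway)
print(res)
```

In this sketch the third line has no semicolon in its second token, so `gene_symbol` falls back to the gene ID, which mirrors the behaviour of the removed `tidyr::separate()` pipeline for these inputs.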