Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another round of code deletion #38

Merged
merged 2 commits into from
Dec 29, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ Maintainer: Michael Chirico <MichaelChirico4@gmail.com>
Depends: R (>= 3.2.2)
Description: YACFP (Yet Another Convenience Function Package). get_age() is a fast & accurate tool for measuring fractional years between two dates. stale_package_check() tries to identify any library() calls to unused packages.
Imports: data.table
Suggests: testthat (>= 3.0.0), jsonlite
Suggests: testthat (>= 3.0.0)
Config/testthat/edition: 3
License: MIT + file LICENSE
URL: https://github.com/MichaelChirico/funchir
Expand Down
12 changes: 3 additions & 9 deletions NAMESPACE
Original file line number Diff line number Diff line change
@@ -1,14 +1,8 @@
export("%^%", "%u%", "%\\%")
export("%\\%")
export(tile.axes, xdev2in, ydev2in, xydev2in)
export(create_quantiles, to.pct,
nx.mlt, divide, dol.form,
ntostr, embed.mat)
export(embed.mat)
export(get_age, quick_year, quick_yday, quick_mday)
export(write.packages, stale_package_check)
export(stale_package_check)

importFrom(stats, quantile, setNames)

importFrom(utils, sessionInfo)
importFrom(graphics, par)

importFrom(data.table, fcase)
14 changes: 9 additions & 5 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,14 @@
### v0.2.99

* `get_age()` gives the right age in some cases, e.g. 7 1/366 years for someone born Dec. 22, 2024 as of Dec. 23, 2031 (#23). Accuracy is now confirmed for a full grid of >2 million possible birthday, age combinations.
* `get_age()` implementation is improved for about 2x speed-up. This was in service of making the implementation friendlier for static translation to other execution engines (in particular {arrow}, #18). Thanks @TPDeramus for the request and @jonkeane for consulting on acero particulars.
* `get_age()` doesn't require its input to be `Date` as long as `as.Date()` succeeds, for convenience in quick examples like `get_age('2003-02-04', '2008-12-30')`.
* `get_age()` supports recycling of one length-1 input and handles missing values in either argument.
* Delete some long-unused functions: `abbr_to_colClasses()`, `sanitize2()`. I haven't used these myself in a very long time, and I don't see any other users on GitHub either.
* I removed most of the functionality from the package. I think only `stale_package_check()` and `get_age()` have gotten any real downstream usage aside from my own; the few functions I kept are those that I actually continue to use myself with any regularity, or which I think it's worth having written down to save re-inventing the wheel for functions only needed rarely.

I tried looking around GitHub for existing users, but if this affects you, please don't hesitate to file a FR and I can restore anything deleted too hastily.

* `get_age()` got some substantial improvements.
+ Gives the right age in some cases, e.g. 7 1/366 years for someone born Dec. 22, 2024 as of Dec. 23, 2031 (#23). Accuracy is now confirmed for a full grid of >2 million possible birthday, age combinations.
+ Implementation is improved for about 2x speed-up. This was in service of making the implementation friendlier for static translation to other execution engines (in particular {arrow}, #18). Thanks @TPDeramus for the request and @jonkeane for consulting on acero particulars.
+ Input not required to be `Date` as long as `as.Date()` succeeds, for convenience in quick examples like `get_age('2003-02-04', '2008-12-30')`.
+ Supports recycling of one length-1 input and handles missing values in either argument.

### v0.2.2

Expand Down
98 changes: 0 additions & 98 deletions R/utils.R
Original file line number Diff line number Diff line change
@@ -1,95 +1,3 @@
# Specific wrapper of cut to create a factor of quantiles of a vector
create_quantiles <- function(x, num, right = FALSE, na.rm = FALSE,
include.lowest = TRUE, labels = 1:num) {
uniq_Qs = unique(quantile(x, probs = 0:num/num, na.rm = na.rm))
if (length(uniq_Qs) - 1L != length(labels) && !is.null(labels)) {
stop(
sprintf(
ngettext(
length(uniq_Qs) - 1L, domain="R-funchir",
'Overlapping quantiles. Please provide %d label.',
'Overlapping quantiles. Please provide %d labels.'
),
length(uniq_Qs) - 1L
),
domain=NULL
)
}
cut(x, breaks = uniq_Qs, labels = labels, right = right,
include.lowest = include.lowest)
}

# Inline conversion to percentage
to.pct <- function(x, dig = Inf) round(100.0 * x, digits = dig)

# Get the nearest multiple of n weakly larger than x
nx.mlt <- function(x, n) n * ceiling(x / n)

# Create a linear progression along the range of x with n points
divide = function(x, n, na.rm = FALSE) {
r = range(x, na.rm = na.rm)
seq(r[1L], r[2L], length.out = n)
}

# Convert numbers for printing to dollar format
dol.form <- function(x, dig = 0L, suff = "", tex = FALSE) {
neg <- rep("", length(x))
neg[x < 0.0] <- "-"
div <- c(1.0, k=1.0e3, m=1.0e6, b=1.0e9)
idx <- which(names(div) == suff)
paste0(neg, if (tex) "\\", "$",
prettyNum(round(abs(x)/div[idx], digits = dig),
big.mark = ",", scientific = FALSE), suff)
}

# Convert numbers to strings OF SPECIFIED LENGTH
# Convenient for getting c("99","00") from 99:100
ntostr <- function(n, dig = 2L) {
sprintf(sprintf("%%0%dd", dig), n %% 10L^dig)
}

# Write the output of sessionInfo() & the date to a file
# (for tracking package versions over time)
write.packages <- function(con = stdout()) {
# nocov start
if (!requireNamespace('jsonlite', quietly = TRUE)) {
stop('jsonlite is required for this functionality, please install first')
}
# nocov end
si = sessionInfo()
desc_fields = c('Version', 'Depends', 'Imports', 'Suggests',
'License', 'URL', 'Packaged', 'Built')
desc_pad = function(desc) `names<-`(desc[desc_fields], desc_fields)
locale_list = function() {
# query these individually given proviso in ?Sys.getlocale:
# > For portability, it is best to query categories individually...
# this list of categories taken from the output of my Linux Mint machine
lc_names = c('LC_CTYPE', 'LC_NUMERIC', 'LC_TIME', 'LC_COLLATE',
'LC_MONETARY', 'LC_MESSAGES', 'LC_PAPER',
'LC_NAME', 'LC_ADDRESS', 'LC_TELEPHONE',
'LC_MEASUREMENT', 'LC_IDENTIFICATION')
lapply(setNames(nm=lc_names), function(x) {
tryCatch(Sys.getlocale(x), error = function(e) {
if (e$message == "invalid 'category' argument") '' else stop(e) # nocov
})
})
}
out = list(
r_version = list(platform = si$platform,
version.string = si$R.version$version.string),
locale = locale_list(),
running = si$running,
linear_algebra = list(matrix_products = si$matprod,
blas = si$BLAS, lapack= si$LAPACK),
base_packages = si$basePkgs,
other_packages = lapply(si$otherPkgs, desc_pad),
loaded_via_namespace = lapply(si$loadedOnly, desc_pad),
write_package_time = format(Sys.time(), tz = 'UTC', usetz = TRUE)
)
writeLines(jsonlite::toJSON(out, pretty = TRUE, auto_unbox = TRUE), con = con)
return(invisible(out))
}

# Embed the matrix mat in a larger matrix by
# placing the top-left element of mat at the supplied
# position (m,n).
Expand Down Expand Up @@ -253,9 +161,3 @@ quick_mday = function(dates) .mday1461__[1L + unclass(dates) %% 1461L]
### Set difference
### *Note: only need one backslash for use
`%\\%` <- function(A, B) setdiff(A, B)

### Set union
`%u%` <- function(A, B) union(A, B)

### Set intersection
`%^%` <- function(A, B) intersect(A, B)
17 changes: 3 additions & 14 deletions man/funchir-infix.Rd
Original file line number Diff line number Diff line change
Expand Up @@ -2,32 +2,21 @@
\name{funchir-infix}
\alias{\%\\\%}
\alias{\%<unescaped bksl>\%}
\alias{\%u\%}
\alias{\%^\%}
\description{
Several infix operators which are convenient shorthand for common set operations, namely, \emph{modulation} (A\\B), \emph{union} (AUB) and \emph{intersection} (A & B).
An infix operator as convenient shorthand for set modulation (A\\B)
}
\usage{
A \%\\\% B
A \%u\% B
A \%^\% B
}
\arguments{
\item{A}{ A set \code{A}. }
\item{B}{ \emph{idem} \code{A}. }
\item{A,B}{ Objects which can be treated as sets. }
}
\value{
The above are simply wrappers for the base functions \code{setdiff}, \code{union}, and \code{intersect}, respectively, so output is exactly as for those functions.
}

\seealso{
\code{\link{setdiff}} , \code{\link{union}}, \code{\link{intersect}}
This is just a wrapper for \code{\link{setdiff}}
}
\examples{
set1 <- 1:5
set2 <- 4:6

set1 \%\\\% set2 # c(1,2,3)
set1 \%u\% set2 # c(1,2,3,4,5,6)
set1 \%^\% set2 # c(4,5)
}
64 changes: 1 addition & 63 deletions man/funchir-utils.Rd
Original file line number Diff line number Diff line change
@@ -1,13 +1,5 @@
\name{funchir-utils}
\alias{create_quantiles}
\alias{to.pct}
\alias{nx.mlt}
\alias{divide}
\alias{dol.form}
\alias{ntostr}
\alias{write.packages}
\alias{stale_package_check}
\alias{clean_slate}
\alias{embed.mat}
\alias{quick_year}
\alias{quick_mday}
Expand All @@ -17,84 +9,30 @@
Several odds-and-ends functions for data manipulation & representation, etc. See details and examples.
}
\usage{
create_quantiles(x, num, right = FALSE, na.rm = FALSE,
include.lowest = TRUE, labels = 1:num)
to.pct(x, dig = Inf)
nx.mlt(x, n)
divide(x, n, na.rm = FALSE)
dol.form(x, dig = 0L, suff = "", tex = FALSE)
ntostr(n, dig = 2L)
write.packages(con)
stale_package_check(con)
embed.mat(mat, M = nrow(mat), N = ncol(mat), m = 1L, n = 1L, fill = 0L)
quick_year(dates)
quick_mday(dates)
quick_yday(dates)
}
\arguments{
\item{x}{ A numeric vector. }
\item{num}{ A number, typically an integer, specifying how many equal-count intervals into which to divide the data.}
\item{right}{ logical, indicating if the intervals should be closed on the right (and open on the left) or vice versa. }
\item{na.rm}{ \code{logical} passed to \code{\link{quantile}} with the usual interpretation. }
\item{include.lowest}{ logical, indicating if an \code{x[i]} equal to the lowest (or highest, for \code{right = FALSE}) \code{breaks} value should be included.}
\item{labels}{ \code{character} vector of length \code{num}; the labels to be applied to the resulting \code{factor}. }
\item{dig}{ The number of digits to be included past the decimal in output; sent directly to \code{round}. }
\item{suff}{ The suffix to appended/unit in which to express \code{x}. Currently one of \code{c("", "k", "m", "b")}, corresponding to plain units, thousands, millions, and billions, respectively. }
\item{tex}{ Should \code{$} be printed as \code{\\$} for direct copy-pasting to TeX files? }
\item{n}{ For \code{nx.mlt}, \code{divide} and \code{ntostr}, a number; see details. For \code{embed.mat}, an integer specifying the column at which to insert \code{mat}. }
\item{con}{ A file/connection where output should be written. }
\item{mat}{ A matrix. }
\item{M}{ An integer specifying the number of rows in the enclosing matrix. }
\item{N}{ An integer specifying the number of columns in the enclosing matrix. }
\item{m}{ An integer specifying the row at which to insert \code{mat}. }
\item{n}{ An integer specifying the column at which to insert \code{mat}. }
\item{fill}{ An atomic vector specifying how to fill the enclosing matrix. }
\item{dates}{ A vector of \code{Date}s. }
}
\value{
\code{create_quantiles} is a parsimonious function for generating quantiles of a vector (e.g., quartiles for \code{num=4} or quintiles for \code{num=5}). Basically a wrapper for the \code{cut} function; the type of the output is \code{factor}. Fails for vectors with overlapping quantiles (e.g., with >50\% of values of \code{x} equal to zero) unless the correct number of labels (i.e., the number of unique quantile breaks) is given in the \code{labels} argument.

\code{to.pct} converts a number (probably a proportion, i.e., typically between 0 and 1) to a percentage; also has an argument (\code{dig}) which can be used to round the output inline.

\code{nx.mlt} returns the least multiple of \code{n} which (weakly) exceeds \code{x}. Convenient for making axes ticks land on pretty numbers.

\code{divide} divides the range (min through max) of \code{x} into \code{n} points (basically a shorthand for \code{seq}).

\code{dol.form} takes a financial input and converts it to a (American-formatted, American-currency) string for printing--appending a dollar sign (\code{"\$"}) and inserting commas after every third digit from the left of the decimal point.

\code{ntostr} converts \code{n} to a \code{character} vector with each element width \code{dig}. This is particularly nice for converting 99:100 to "99" and "100".

\code{write.packages} captures the current package environment (inspired by \code{sessionInfo()} and writes it as a JSON to \code{con} with \code{\link{writeLines}}; a \code{list} version of this object is returned. This may be essential for tracking across time which package versions were being used.

\code{stale_package_check} reads a file (with \code{\link{readLines}}) and checks which functions are actually used from each loaded package. Currently only checks for \code{library} (i.e., not \code{require}) calls.

\code{embed.mat} inserts a supplied matrix into a (weakly) larger enclosing matrix, typically filled with 0s, at a specified position.

\code{quick_year} converts a \code{Date} object into its year efficiently; also ignores concerns of leap centuries. \code{quick_mday} returns the day of the month. \code{quick_yday} returns the day of the year. Returns as an \code{integer}.
}

\seealso{
\code{\link{cut}}, \code{\link{prettyNum}}
}
\examples{
x <- runif(100)

# Return which multiple of 1/7 least
# exceeds each element of x
create_quantiles(x, 7)

to.pct(x)
to.pct(x, dig = 2) #output of the form xxx.xx

nx.mlt(x, 1/3)

dol.form(x, dig=2L)

ntostr(999:1000, dig = 3L) # c("999","000")
ntostr(999:1000, dig = 2L) # c("99","00")

library(stats)
write.packages()

inmat <- matrix(1:9, ncol = 3L)
embed.mat(inmat, M = 4L, N = 4L)
embed.mat(inmat, N = 6L, n = 4L, fill = NA)
Expand Down
41 changes: 0 additions & 41 deletions tests/testthat/test-utils.R
Original file line number Diff line number Diff line change
Expand Up @@ -33,24 +33,6 @@ test_that('stale_package_check works', {
)
})

test_that('one-line utilities work', {
expect_identical(to.pct(0.8, 2.0), 80.0)
expect_identical(to.pct(0.8030432, 3.0), 80.304)

expect_identical(nx.mlt(3.0, 5.0), 5.0)
expect_identical(nx.mlt(24.0, 17.0), 34.0)

expect_identical(divide(c(1.0, 4.0, 8.0, 9.0, 11.0, 2.0, 2.0), 3L), c(1.0, 6.0, 11.0))

expect_identical(dol.form(1.0e6), '$1,000,000')
expect_identical(dol.form(1.0e6, suff='m'), '$1m')
expect_identical(dol.form(-1.0e6), '-$1,000,000')
expect_identical(dol.form(123.456, dig = 0L), '$123')
expect_identical(dol.form(123.0, tex = TRUE), '\\$123')

expect_identical(ntostr(1999:2020, 2L), sprintf('%02d', c(99L, 0:20)))
})

test_that('embed.mat works', {
m = matrix(1:10, 5L, 2L)
expect_identical(embed.mat(m, 6L, 3L), rbind(cbind(m, 0L), 0L))
Expand All @@ -64,9 +46,7 @@ test_that('embed.mat works', {
test_that('set utilities work', {
A = 1:5
B = 3:8
expect_identical(A %u% B, 1:8)
expect_identical(A %\% B, 1:2)
expect_identical(A %^% B, 3:5)
})

# works as long as range(dates) doesn't include leap centuries:
Expand Down Expand Up @@ -160,24 +140,3 @@ test_that('get_age works', {
expect_identical(get_age(numeric(), numeric()), numeric())
expect_error(get_age(numeric(3L), numeric(4L)), "must have equal length")
})

test_that('create_quantiles works', {
expect_identical(create_quantiles(1:10, 4L), factor(rep(1:4, c(3L, 2L, 2L, 3L))))
expect_identical(create_quantiles(1:10, 3L), factor(rep(1:3, c(3L, 3L, 4L))))

expect_error(create_quantiles(rep(1:2, c(100L, 1L)), 2L), 'Overlapping quantiles. Please provide 1 label.')
expect_error(create_quantiles(rep(1:2, c(2L, 1L)), 3L), 'Overlapping quantiles. Please provide 2 labels.')
})

test_that('write_packages works', {
invisible(capture.output({
out <- write.packages()
}))

expect_named(out, c(
"r_version", "locale", "running", "linear_algebra", "base_packages",
"other_packages", "loaded_via_namespace", "write_package_time"
))

expect_identical(out$r_version$version.string, R.version.string)
})
Loading