Commit

tricky (edge) case in dcast.data.table fixed. closes #715.
arunsrinivasan committed Jul 3, 2014
1 parent aff5621 commit b680d3d
Showing 3 changed files with 13 additions and 4 deletions.
4 changes: 3 additions & 1 deletion R/fcast.R
@@ -66,6 +66,8 @@ dcast.data.table <- function(data, formula, fun.aggregate = NULL, ..., margins =
fill.default <- NULL
if (!is.null(fun.aggregate)) { # construct the 'call'
fill.default = fun.aggregate(data[[value.var]][0], ...)
+ if (!length(fill.default) && (is.null(fill) || !length(fill)))
+ stop("Aggregating function provided to argument 'fun.aggregate' should always return a length 1 vector, but returns 0-length value for fun.aggregate(", typeof(data[[value.var]]), "(0)).", " This value will have to be used to fill missing combinations, if any, and therefore can not be of length 0. Either override by setting the 'fill' argument explicitly or modify your function to handle this case appropriately.")
args <- c("data", "formula", "margins", "subset", "fill", "value.var", "verbose", "drop")
m <- m[setdiff(names(m), args)]
.CASTfun = fun.aggregate # issues/713
@@ -107,7 +109,7 @@ dcast.data.table <- function(data, formula, fun.aggregate = NULL, ..., margins =
attr(oo, 'maxgrpn') > 1L
}
if (!fun.null && fun_agg_chk(data))
- stop("Aggregating function provided to argument 'fun.aggregate' should return a length 1 vector for each group, but returns length != 1 for atleast one group. Please have a look at the DETAILS section of ?dcast.data.table ")
+ stop("Aggregating function provided to argument 'fun.aggregate' should always return a length 1 vector for each group, but returns length != 1 for atleast one group. Please have a look at the DETAILS section of ?dcast.data.table ")
} else {
if (is.null(subset))
data = data[, unique(c(ff_, value.var)), with=FALSE] # data is untouched so far. subset only required columns
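
The new guard in this hunk depends on what `fun.aggregate` returns for a zero-length input. A minimal base R sketch of that condition follows. It is not the package's code: `last_elem` is a hypothetical stand-in for `data.table::last`, and `value` is an illustrative column.

```r
# Hypothetical stand-in for data.table::last(); base R only.
last_elem <- function(x) x[length(x)]

value <- 1:5        # plays the role of data[[value.var]]
empty <- value[0]   # integer(0), same type as the value column

# What different aggregators return for a zero-length input
fill_from_sum    <- sum(empty)        # length 1, usable as a default fill
fill_from_length <- length(empty)     # length 1, usable as a default fill
fill_from_last   <- last_elem(empty)  # length 0, nothing to fill with

# Same condition as the guard added in this commit, with 'fill' left unset
fill <- NULL
needs_error <- !length(fill_from_last) && (is.null(fill) || !length(fill))

# Supplying 'fill' explicitly makes the condition FALSE
fill <- NA_integer_
needs_error_with_fill <- !length(fill_from_last) && (is.null(fill) || !length(fill))
```

Aggregators such as `sum` or `length` produce a length 1 default on empty input, so they never reach the new `stop()`. A `last`-style function does, unless the caller passes `fill`.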
2 changes: 2 additions & 0 deletions README.md
@@ -228,6 +228,8 @@ We moved from R-Forge to GitHub on 9 June 2014, including history.
34. `dcast.data.table` handles `fun.aggregate` argument properly when called from within a function that accepts `fun.aggregate` argument and passes to `dcast.data.table()`. Closes [#713](https://github.com/Rdatatable/data.table/issues/713). Thanks to mathematicalcoffee for reporting [here](http://stackoverflow.com/q/24542976/559784) on SO.
+ 34. `dcast.data.table` now returns a friendly error when fun.aggregate value for missing combinations is 0-length, and 'fill' argument is not provided. Closes [#715](https://github.com/Rdatatable/data.table/issues/715)
#### NOTES
1. Reminder: using `rolltolast` still works but since v1.9.2 now issues the following warning:
11 changes: 8 additions & 3 deletions inst/tests/tests.Rraw
@@ -4679,7 +4679,7 @@ test(1313.22, DT[, list(y=max(y, na.rm=TRUE)), by=x], DT[c(5,10)])

# bug git #693 - dcast.data.table error message improvement:
dt <- data.table(x=c(1,1), y=c(2,2), z = 3:4)
- test(1314, dcast.data.table(dt, x ~ y, value.var="z", fun.aggregate=identity), error="Aggregating function provided to argument 'fun.aggregate' should return a length 1")
+ test(1314, dcast.data.table(dt, x ~ y, value.var="z", fun.aggregate=identity), error="Aggregating function provided to argument 'fun.aggregate' should always return a length 1 vector")

# bug #688 - preserving attributes
DT = data.table(id = c(1,1,2,2), ty = c("a","b","a","b"), da = as.Date("2014-06-20"))
@@ -4855,9 +4855,9 @@ test(1344, fread("A,B\n1,T\n2,NA\n3,"), data.table(A=1:3, B=c(TRUE,NA,NA)))
# issues/713 - dcast.data.table and fun.aggregate
DT <- data.table(id=rep(1:2, c(3,4)), k=c(rep(letters[1:3], 2), 'c'), v=1:7)
foo <- function (tbl, fun.aggregate) {
- dcast.data.table(tbl, id ~ k, value.var='v', fun.aggregate=fun.aggregate)
+ dcast.data.table(tbl, id ~ k, value.var='v', fun.aggregate=fun.aggregate, fill=NA_integer_)
}
- test(1345, foo(DT, last), dcast.data.table(DT, id ~ k, value.var='v', fun.aggregate=last))
+ test(1345, foo(DT, last), dcast.data.table(DT, id ~ k, value.var='v', fun.aggregate=last, fill=NA_integer_))

# more minor changes to dcast.data.table (subset argument handling symbol - removing any surprises with data.table's typical scoping rules) - test for that.
DT <- data.table(id=rep(1:2, c(3,4)), k=c(rep(letters[1:3], 2), 'c'), v=1:7)
@@ -4867,6 +4867,11 @@ test(1346.1, dcast.data.table(DT, id ~ k, value.var="v", subset=.(c(TRUE, rep(FA
DT[, bla := !bla]
test(1346.2, dcast.data.table(DT, id ~ k, value.var="v", subset=.(bla), fun.aggregate=length), dcast.data.table(DT[(bla)], id ~ k, value.var="v", fun.aggregate=length))

+ # issues/715
+ DT <- data.table(id=rep(1:2, c(3,2)), k=c(letters[1:3], letters[1:2]), v=1:5)
+ test(1347.1, dcast.data.table(DT, id ~ k, fun.aggregate=last, value.var="v"), error="Aggregating function provided to argument 'fun.aggregate' should always return a length 1 vector")
+ test(1347.2, dcast.data.table(DT, id ~ k, fun.aggregate=last, value.var="v", fill=NA_integer_), data.table(id=1:2, a=c(1L, 4L), b=c(2L,5L), c=c(3L,NA_integer_), key="id"))

##########################
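
For readers who want to try the `issues/715` case outside the test harness, here is a rough sketch using the same `DT` as tests 1347.1 and 1347.2. It assumes the data.table package is installed. The exact error text may differ in other data.table versions, so the message is captured rather than matched; `msg` and `wide` are illustrative names.

```r
library(data.table)

# Same data as tests 1347.1 / 1347.2: id 2 has no row for k == "c"
DT <- data.table(id = rep(1:2, c(3, 2)),
                 k  = c(letters[1:3], letters[1:2]),
                 v  = 1:5)

# Without 'fill': at this commit the call is expected to stop with the new
# message; capture whatever happens instead of assuming the wording.
msg <- tryCatch({
  dcast.data.table(DT, id ~ k, fun.aggregate = last, value.var = "v")
  NULL
}, error = function(e) conditionMessage(e))

# With an explicit 'fill': the missing (id = 2, k = "c") cell becomes NA
wide <- dcast.data.table(DT, id ~ k, fun.aggregate = last,
                         value.var = "v", fill = NA_integer_)
print(wide)
```

Passing `fill` explicitly is the workaround the new error message recommends. It is also why test 1345 gains `fill=NA_integer_` in this commit.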

