Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coerceAs applied to fill in dcast #4586

Merged
merged 39 commits into from
Mar 4, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
196f420
nafill rework for more robust coerce
jangorecki May 24, 2020
1fa2069
initial change for #4101
jangorecki May 24, 2020
22f98f7
nafill simple fill coerce
jangorecki May 25, 2020
7747c3d
nafill const works for locf and nocb as well, closes #3594
jangorecki May 25, 2020
cc2cc0e
fix tests for #4101
jangorecki May 25, 2020
4dfa307
coverage
jangorecki May 25, 2020
5c1fa6b
placeholder for nafill #3992
jangorecki May 25, 2020
7e8331d
use coerceAs in froll
jangorecki May 25, 2020
ff5aed2
enable disabled test
jangorecki May 25, 2020
0bf1564
nafill retain names, tests for fill list
jangorecki May 26, 2020
db00097
coerceAs gets copyArg so now can return its input
jangorecki May 26, 2020
d0ecba4
better control of verbose, and better find class for coerceAs
jangorecki May 26, 2020
be8347b
proper verbose to int
jangorecki May 27, 2020
b648661
verbose changes coverage
jangorecki May 27, 2020
008b37c
rm unused anymore
jangorecki May 27, 2020
07c9d2d
more precise verbose arg check
jangorecki May 27, 2020
9467ff5
memrecycle escape warnings and simple verbose for numcol==0
jangorecki May 27, 2020
1edae4f
coerceAs does not emit extra warning anymore
jangorecki May 27, 2020
71ab119
coerceAs verbose, more tests
jangorecki May 27, 2020
858d7a7
use older nanotime api
jangorecki May 27, 2020
a2f9adc
update error msg
jangorecki May 27, 2020
df758a6
coverage
jangorecki May 27, 2020
45e26db
codecov of coerceAs
jangorecki May 30, 2020
98f7cd3
catch all verbose
jangorecki May 30, 2020
834afba
Revert "initial change for #4101"
jangorecki May 30, 2020
f2ef980
Revert "fix tests for #4101"
jangorecki May 30, 2020
db95b08
use coerceAs in fcast.c
Jul 1, 2020
dc4382f
Merge branch 'master' into dcast-int64
MichaelChirico Feb 20, 2024
73af17e
restore actual fix
MichaelChirico Feb 20, 2024
0781e24
ws
MichaelChirico Feb 20, 2024
e4ae1d7
incomplete merge
MichaelChirico Feb 20, 2024
7167f18
vestigial bad merge
MichaelChirico Feb 20, 2024
e3e1e63
minimize diff
MichaelChirico Feb 20, 2024
ec95e99
coerceAs throws warning for string->double, and errors on list
MichaelChirico Feb 20, 2024
b9d739a
comment
MichaelChirico Feb 20, 2024
b8c53d9
example without warning
MichaelChirico Mar 1, 2024
377323c
Merge remote-tracking branch 'origin/master' into dcast-int64
MichaelChirico Mar 1, 2024
cf0ed8a
bad NEWS merge
MichaelChirico Mar 1, 2024
9d7b196
warning is still needed
MichaelChirico Mar 1, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,12 +18,14 @@

3. `split.data.table` also accepts a formula for `f`, [#5392](https://github.com/Rdatatable/data.table/issues/5392), mirroring the same in `base::split.data.frame` since R 4.1.0 (May 2021). Thanks to @XiangyunHuang for the request, and @ben-schwen for the PR.

3. Namespace-qualifying `data.table::shift()`, `data.table::first()`, or `data.table::last()` will not deactivate GForce, [#5942](https://github.com/Rdatatable/data.table/issues/5942). Thanks @MichaelChirico for the proposal and fix. Namespace-qualifying other calls like `stats::sum()`, `base::prod()`, etc., continue to work as an escape valve to avoid GForce, e.g. to ensure S3 method dispatch.
4. Namespace-qualifying `data.table::shift()`, `data.table::first()`, or `data.table::last()` will not deactivate GForce, [#5942](https://github.com/Rdatatable/data.table/issues/5942). Thanks @MichaelChirico for the proposal and fix. Namespace-qualifying other calls like `stats::sum()`, `base::prod()`, etc., continue to work as an escape valve to avoid GForce, e.g. to ensure S3 method dispatch.

## BUG FIXES

1. `unique()` returns a copy the case when `nrows(x) <= 1` instead of a mutable alias, [#5932](https://github.com/Rdatatable/data.table/pull/5932). This is consistent with existing `unique()` behavior when the input has no duplicates but more than one row. Thanks to @brookslogan for the report and @dshemetov for the fix.

2. `dcast` handles coercion of `fill` to `integer64` correctly, [#4561](https://github.com/Rdatatable/data.table/issues/4561). Thanks to @emallickhossain for the bug report and @MichaelChirico for the fix.

## NOTES

1. `transform` method for data.table sped up substantially when creating new columns on large tables. Thanks to @OfekShilon for the report and PR. The implemented solution was proposed by @ColeMiller1.
Expand Down
12 changes: 10 additions & 2 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -15657,8 +15657,9 @@ test(2071.08, any_na(data.table(c(1+1i, NA))))
test(2071.09, any_na(data.table(expression(1))), error="Unsupported column type 'expression'")
test(2071.10, dcast(data.table(a=1, b=1, l=list(list(1))), a ~ b, value.var='l'),
data.table(a=1, `1`=list(list(1)), key='a'))
test(2071.11, dcast(data.table(a = 1, b = 2, c = 3), a ~ b, value.var = 'c', fill = '2'),
data.table(a=1, `2`=3, key='a'))
test(2071.11, dcast(data.table(a = c(1, 1, 2), b = c(1, 2, 1), c = 3), a ~ b, value.var='c', fill='4'),
data.table(a=c(1, 2), `1`=c(3, 3), `2`=c(3, 4), key='a'),
warning="Coercing 'character' RHS to 'double'")

# fifelse, #3657
test_vec = -5L:5L < 0L
Expand Down Expand Up @@ -18304,3 +18305,10 @@ test(2247.2, split(dt, ~y), list(`1`=data.table(x=c(1L,3L), y=1L), `2`=data.tabl
test(2247.3, do.call(rbind, split(dt, ~y)), setDT(do.call(rbind, split(as.data.frame(dt), ~y))))
dt = data.table(x=1:4, y=factor(letters[1:2]), z=factor(c(1,1,2,2), labels=c("c", "d")))
test(2247.4, split(dt, ~y+z), list("a.c"=dt[1], "b.c"=dt[2], "a.d"=dt[3], "b.d"=dt[4]))

# fill converts to int64 in dcast, #4561
if (test_bit64) {
i64v = as.integer64(c(12345678901234, 70, 20, NA))
apple = data.table(id = c("a", "b", "b"), time = c(1L, 1L, 2L), y = i64v[1:3])
test(2248, dcast(apple, id ~ time, value.var = "y"), data.table(id = c('a', 'b'), `1` = i64v[1:2], `2` = i64v[4:3], key='id'))
}
10 changes: 5 additions & 5 deletions src/fcast.c
Original file line number Diff line number Diff line change
Expand Up @@ -16,17 +16,17 @@ SEXP fcast(SEXP lhs, SEXP val, SEXP nrowArg, SEXP ncolArg, SEXP idxArg, SEXP fil
}
// get val cols
for (int i=0; i<nval; ++i) {
SEXP thiscol = VECTOR_ELT(val, i);
const SEXP thiscol = VECTOR_ELT(val, i);
SEXP thisfill = fill;
SEXPTYPE thistype = TYPEOF(thiscol);
const SEXPTYPE thistype = TYPEOF(thiscol);
int nprotect = 0;
if (isNull(fill)) {
if (LOGICAL(is_agg)[0]) {
thisfill = PROTECT(allocNAVector(thistype, 1)); nprotect++;
} else thisfill = VECTOR_ELT(fill_d, i);
}
if (TYPEOF(thisfill) != thistype) {
thisfill = PROTECT(coerceVector(thisfill, thistype)); nprotect++;
if (isVectorAtomic(thiscol)) { // defer error handling to below, but also skip on list
thisfill = PROTECT(coerceAs(thisfill, thiscol, /*copyArg=*/ScalarLogical(false))); nprotect++;
}
switch (thistype) {
case INTSXP:
Expand Down Expand Up @@ -89,7 +89,7 @@ SEXP fcast(SEXP lhs, SEXP val, SEXP nrowArg, SEXP ncolArg, SEXP idxArg, SEXP fil
}
}
break;
default: error(_("Unsupported column type in fcast val: '%s'"), type2char(TYPEOF(thiscol))); // #nocov
default: error(_("Unsupported column type in fcast val: '%s'"), type2char(thistype)); // #nocov
}
UNPROTECT(nprotect);
}
Expand Down
Loading