Closes #696, #478, #703.
arunsrinivasan committed Jun 23, 2014
1 parent 6056698 commit e9104d7
Showing 4 changed files with 30 additions and 2 deletions.
6 changes: 4 additions & 2 deletions R/setkey.R
@@ -173,8 +173,10 @@ forder = function(x, ..., na.last=TRUE, decreasing=FALSE)
ans <- point(ans, i, eval(v, x, parent.frame()), 1L)
}
} else {
- v = as.call(list(as.name("list"), v))
- ans <- point(ans, i, eval(v, x, parent.frame()), 1L) # eval has to make a copy here (not due to list(.), but due to ex: "4-5*y"), unavoidable.
+ if (!is.object(eval(v, x, parent.frame()))) {
+ v = as.call(list(as.name("list"), v))
+ ans = point(ans, i, eval(v, x, parent.frame()), 1L) # eval has to make a copy here (not due to list(.), but due to ex: "4-5*y"), unavoidable.
+ } else ans = point(ans, i, list(unlist(eval(v, x, parent.frame()))), 1L)
} # else stop("Column arguments to order by in 'forder' should be of type name/symbol (ex: quote(x)) or call (ex: quote(-x), quote(x+5*y))")
}
}
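
The branch added to `forder` in the hunk above hinges on `is.object()`. The following is a minimal base-R sketch of that dispatch, not the data.table implementation: `wrap_order_arg` is a hypothetical helper name used only for illustration, and it takes an already-evaluated value, whereas `forder` evaluates an unevaluated expression against `x`.

```r
# Hypothetical helper mirroring the is.object() branch; not part of data.table.
wrap_order_arg <- function(val) {
  if (!is.object(val)) {
    list(val)          # unclassed values (plain vectors, bare lists) become one list element
  } else {
    list(unlist(val))  # classed objects (e.g. a data.frame) are flattened into one vector first
  }
}

plain  <- wrap_order_arg(8:5)
framed <- wrap_order_arg(data.frame(a = 1:2, b = 3:4))

is.object(8:5)                  # FALSE
is.object(data.frame(a = 1:2))  # TRUE
length(plain[[1]])              # 4
length(framed[[1]])             # 4: two columns of two rows, concatenated
```

Note that flattening a multi-column object yields a vector longer than the number of rows, which is the base-compatible behaviour the new tests compare against `base:::order`.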
5 changes: 5 additions & 0 deletions README.md
@@ -221,6 +221,8 @@ DT[, list(.N, mean(y), sum(y)), by=x] # 1.9.3+ - will use GForce.

* A join of the form `X[Y, roll=TRUE, nomatch=0L]` where some of Y's key columns occur more than once (duplicated keys) might at times return an incorrect join. This was introduced only in 1.9.2 and is fixed now. Closes [#700](https://github.com/Rdatatable/data.table/issues/472). Thanks to Michael Smith for the very nice reproducible example and for spotting such a tricky case.

* Fixed an edge case in `DT[order(.)]` internal optimisation to be consistent with base. Closes [#696](https://github.com/Rdatatable/data.table/issues/696). Thanks to Michael Smith for reporting.

#### NOTES

* Reminder: using `rolltolast` still works but since v1.9.2 now issues the following warning :
@@ -251,6 +253,9 @@ DT[, list(.N, mean(y), sum(y)), by=x] # 1.9.3+ - will use GForce.

* `dcast.data.table(dt, a ~ ... + b)` now generates the column names with values from `b` coming last. Closes # 5675.

* Documented the `x[order(.)]` internal optimisation in `?setorder` (with alias `?order` and `?forder`), including how to go back to
`base:::order(.)` if one wants to sort by session locale. Closes #5613 ([#478](https://github.com/Rdatatable/data.table/issues/478)) and
also [#703](https://github.com/Rdatatable/data.table/issues/703). Thanks to Christian Wolf for the report.

### Changes in v1.9.2 (on CRAN 27 Feb 2014)

13 changes: 13 additions & 0 deletions inst/tests/tests.Rraw
@@ -4710,6 +4710,19 @@ test(1318.1, DT[, eval(sumExpr), by = aa], DT[, sum(bb, na.rm=TRUE), by=aa])
test(1318.2, DT[, eval(meanExpr), by = aa], DT[, mean(bb, na.rm=TRUE), by=aa])
test(1318.3, DT[, list(mySum = eval(sumExpr), myMean = eval(meanExpr)), by = aa], DT[, list(mySum=sum(bb, na.rm=TRUE), myMean=mean(bb, na.rm=TRUE)), by=aa])

# get DT[order(.)] to be 100% consistent with base, even though the way base does some things is *utterly ridiculous*, inconsistent.
# closes #696.
DT <- data.table(a = 1:4, b = 8:5, c=letters[4:1])
test(1319.1, DT[order(DT[, "b", with=FALSE])], DT[base:::order(DT[, "b", with=FALSE])])
test(1319.2, DT[order(DT[, "c", with=FALSE])], DT[base:::order(DT[, "c", with=FALSE])])
test(1319.3, DT[order(DT[, c("b","c"), with=FALSE])], DT[base:::order(DT[, c("b","c"), with=FALSE])])
test(1319.4, DT[order(DT[, c("c","b"), with=FALSE])], DT[base:::order(DT[, c("c","b"), with=FALSE])])
test(1319.5, DT[order(DT[, "b", with=FALSE], DT[, "a", with=FALSE])], DT[base:::order(DT[, "b", with=FALSE], DT[, "a", with=FALSE])])
# test to make sure old things are not modified (ridiculous, but "consistency" demands it!)
test(1319.6, DT[order(list(DT$a))], DT[1])
test(1319.7, DT[order(list(DT$a), list(DT$b))], DT[1])
test(1319.8, DT[order(list(DT$a, DT$b))], error="Column '1' is type 'list' which is not")

##########################

# TO DO: Add test for fixed bug #5519 - dcast.data.table returned an error when a package imported data.table, but not when it "depends" on data.table. This is fixed (commit 1263 v1.9.3), but not sure how to add a test.
8 changes: 8 additions & 0 deletions man/setorder.Rd
@@ -2,6 +2,10 @@
\alias{setorder}
\alias{setorderv}
\alias{setcolorder}
\alias{order}
\alias{fastorder}
\alias{forder}

\title{Fast reordering of a data.table by reference}
\description{
Note that in \code{data.table} parlance, all \code{set*} functions change their input \emph{by reference}. That is, no copy is made at all, other than temporary working memory, which is as large as one column. The only other \code{data.table} operator that modifies input by reference is \code{\link{:=}}. Check out the \code{See Also} section below for the other \code{set*} functions that \code{data.table} provides.
@@ -10,11 +14,15 @@

\code{setorder()} sorts or rearranges the rows of a \code{data.table} \emph{by reference}, based on the columns provided. It can sort in both ascending and descending order. The functionality is identical to using \code{?order} on a \code{data.frame}, except that \code{setorder} is much faster, is very memory efficient and is much more user-friendly.

\code{x[order(.)]} is now optimised internally to use \code{data.table}'s fast order by default. \code{data.table} always sorts in C-locale by default. If it is essential to sort by the session locale instead, one can always revert to base's \code{order} with \code{x[base:::order(.)]}.
}

\usage{
setcolorder(x, neworder)
setorder(x, ...)
setorderv(x, cols, order=1L)
# optimised to use data.table's internal fast order
# x[order(.)]
}
\arguments{
\item{x}{ A \code{data.table}. }
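
The locale note added to `man/setorder.Rd` can be illustrated with a short sketch. It assumes the `data.table` package is installed; the table `DT` and its column `x` are made up for this example and do not come from the commit.

```r
library(data.table)

# Illustrative data only: mixed-case strings make the locale difference visible.
DT <- data.table(x = c("b", "A", "a", "B"))

# Optimised path: data.table's fast order sorts in C-locale,
# so upper-case letters come before lower-case ones.
c_locale <- DT[order(x)]$x
c_locale                 # "A" "B" "a" "b"

# Explicitly calling base's order bypasses the optimisation and sorts
# character data by the session locale; the result depends on that locale.
session_locale <- DT[base:::order(x)]$x
session_locale
```

In many non-C locales the second call interleaves upper and lower case, which is why the documentation points to `base:::order` when session-locale ordering matters.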
