Unexpected result for max of character variable by group #5331
thanks, this is a great instructive example. the difference is whether data.table or base R does the sorting for you. data.table always sorts in C locale; base sorts in system locale by default (and I'm not sure there is an option to toggle with max() to change this on the fly). you can look at verbose=TRUE in your examples; it should be illustrative. what's your desired outcome? the workaround depends on what you expected.
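A minimal sketch of the C-locale versus system-locale difference described in this comment. The two-element vector is hypothetical (the data from the original report is not shown in this thread). It assumes only base R's documented behaviour that sort(method = "radix") orders character vectors in the C locale, while max() follows the session's collation locale:

```r
# Hypothetical values, chosen to match the names mentioned in this thread
x <- c("alice", "Bob")

# C-locale (byte-wise) ordering: uppercase letters sort before lowercase ones,
# so "Bob" comes first and "alice" ends up as the maximum
c_sorted <- sort(x, method = "radix")
c_max <- c_sorted[length(c_sorted)]
print(c_max)

# Locale-dependent ordering: the result depends on LC_COLLATE
# (typically "Bob" in an English UTF-8 locale, "alice" in the C locale)
print(Sys.getlocale("LC_COLLATE"))
print(max(x))
```

The radix sort gives the same answer on every machine, which makes it a convenient reference point when comparing against a locale-dependent max().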
Thanks for the explanation @MichaelChirico. I don't have a specific desired outcome for whether "alice" or "Bob" is considered to be the maximum. Consistency with base R would be nice, but I can accept that there is more than one reasonable approach. When I encountered something like this, what I found really confusing was
agreed there. we have some plans to apply GForce more consistently. the current issue is that once we see an ad-hoc expression, it turns off GForce for the entire query. understand this can be confusing and is basically exposing an implementation detail. for now your best bet is to remember to try verbose=TRUE to get some insight whenever you encounter something like this.
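A sketch of how verbose=TRUE can be used to see which path a grouped query takes. The table, column names and values are made up for illustration and are not the data from the original report. The exact verbose messages, and which expressions GForce can optimise, vary between data.table versions, so no output is shown:

```r
library(data.table)

# Hypothetical data, not the table from the original report
DT <- data.table(group = c("g1", "g1", "g2"), x = c("alice", "Bob", "carol"))

# j contains only max(x); the verbose output reports whether GForce optimised it
res_simple <- DT[, .(m1 = max(x)), by = group, verbose = TRUE]

# j also contains an ad-hoc expression; per the comment above, this can turn
# off GForce for the whole query, in which case max() is evaluated by base R
res_mixed <- DT[, .(m1 = max(x), n = length(unique(x))), by = group, verbose = TRUE]
```

Comparing the m1 column of the two results, together with the verbose messages, should show whether the two queries took different code paths on a given installation.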
if you want consistency, I believe you can set Sys.setenv(LC_ALL="C")
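A minimal sketch of switching collation to the C locale inside a running session. As far as I understand, Sys.setenv() only changes the environment variable and may not affect collation in a session that has already started, so this sketch uses base R's Sys.setlocale() instead; that substitution is an assumption on my part, not something confirmed in this thread. The vector is hypothetical:

```r
# Hypothetical values, chosen to match the names mentioned in this thread
x <- c("alice", "Bob")

# Remember the current collation setting so it can be restored afterwards
old_collate <- Sys.getlocale("LC_COLLATE")

# Switch string comparison to the C locale for this session
invisible(Sys.setlocale("LC_COLLATE", "C"))

# In the C locale, lowercase letters sort after uppercase ones
max_in_c <- max(x)
print(max_in_c)

# Restore the previous collation setting
invisible(Sys.setlocale("LC_COLLATE", old_collate))
```

Restoring the old setting keeps the change from leaking into unrelated code that relies on locale-aware sorting.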
Thanks for your replies @MichaelChirico. I had tried verbose=TRUE but still wasn't sure about the reason until I read your explanation.
I was surprised by this:
Why are DT2$m1 and DT3$m1 different? And why is DT2[group == "g1", m1] not the same as DT[group == "g1", max(x)]?
Thanks.