Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve := error for case when cedta() dispatches [.data.frame #5721

Closed
MichaelChirico opened this issue Nov 3, 2023 · 7 comments · Fixed by #6019
Closed

Improve := error for case when cedta() dispatches [.data.frame #5721

MichaelChirico opened this issue Nov 3, 2023 · 7 comments · Fixed by #6019
Assignees

Comments

@MichaelChirico
Copy link
Member

The current error is not very helpful for a package author that forgot to @import data.table leading [.data.frame to dispatch as cedta() gives FALSE. So := is used correctly, but still get the error.

Difficult (not impossible) to encounter once the package reaches CRAN caliber, more likely for package under early development.

Alternatively, using := is a pretty good sign the author intended [.data.table, so we could consider changing the cedta() result, but I think that will be non-trivial.

@tdhock
Copy link
Member

tdhock commented Nov 3, 2023

What is the current error, and what do you suggest we change it to?

@MichaelChirico
Copy link
Member Author

What is the current error,

The normal error for when := is invoked directly:

stopf('Check that is.data.table(DT) == TRUE. Otherwise, :=, `:=`(...) and let(...) are defined for use in j, once only and in particular ways. See help(":=").')

what do you suggest we change it to?

Not sure yet, but the current message is misleading in this case (emphasis on "ensure x is a data.table", but x is in fact a data.table!).

@Nj221102
Copy link
Contributor

Nj221102 commented Mar 20, 2024

Is this error message good enough to fix this issue ?

"It seems that := is being used directly. Make sure that data.table is imported and used explicitly in your package. If you are intentionally using := with a data table, ensure that the object being modified is indeed a data table. If you are encountering this error unexpectedly, it may be due to data.table methods being dispatched as data.frame methods. Check that is.data.table(DT) returns TRUE for your object DT. For more information on using := and data tables, refer to the documentation by running ?:=``."

@Nj221102 Nj221102 self-assigned this Mar 20, 2024
@MichaelChirico
Copy link
Member Author

MichaelChirico commented Mar 20, 2024

imported and used explicitly in your package

But this error might not come up in a package, or it might be caused by an issue in another package entirely.

I am thinking the best way to address this may be in [.data.table itself, e.g. in this branch:

data.table/R/data.table.R

Lines 146 to 155 in 3eefbca

if (!cedta()) {
# Fix for #500 (to do)
Nargs = nargs() - (!missing(drop))
ans = if (Nargs<3L) { `[.data.frame`(x,i) } # drop ignored anyway by DF[i]
else if (missing(drop)) `[.data.frame`(x,i,j)
else `[.data.frame`(x,i,j,drop)
# added is.data.table(ans) check to fix bug #81
if (!missing(i) && is.data.table(ans)) setkey(ans, NULL) # drops index too; tested by plyr::arrange test in other.Rraw
return(ans)
}

Quickly scan j for := (and let), and throw a specific error if needed along the lines "[ was called on a data.table in an environment that is not data.table-aware (i.e. cedta()), but :=` was used, implying the owner of this call really intended for data.table methods to be called. See vignette("datatable-importing") for details on properly importing data.table".

OTOH, maybe using := in j is a sign that cedta() should pass. If caller is using :=, it should be pretty clear they are data.table-aware. WDYT @jangorecki?

@jangorecki
Copy link
Member

jangorecki commented Mar 21, 2024

Isn't / wasn't := operator used in some tidyverse package as well?
I think they used it to substitute LHS of assignment because := is lazy on LHS and RHS, unlike = where you cannot catch LHS. I don't follow API changes in tidyverse, and I think that may have been changed already, because there are better ways to achieve (like we did inside env var).

@TysonStanley
Copy link
Member

Isn't / wasn't := operator used in some tidyverse package as well?

Yes, it's used within their mutate() function for lazy evaluation. Pretty sure it's still implemented.

@MichaelChirico
Copy link
Member Author

I think they used it to substitute LHS of assignment because := is lazy on LHS and RHS, unlike = where you cannot catch LHS.

yes that's still surely used where needed. my suggestion is to look for := as root of j though, in same fashion as later in [.data.table, instead of recursive search which might find tidyverse false positives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants