allow.cartesian could allow .N on 2^31+ rows to finish #3009

Open
jangorecki opened this issue Aug 23, 2018 · 0 comments
Labels: joins, top request

Comments

Member

jangorecki commented Aug 23, 2018

I am interested in the count of rows in the join result:

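library(data.table)
set.seed(108)
n = 2e5
# two tables whose join column v2 takes only the values 1..5
d1 = data.table(v1=1:n, v2=sample(5, replace=TRUE))
d2 = data.table(v1=1:n, v2=sample(5, replace=TRUE))
# only the number of rows in the cartesian join is requested
d1[d2, .N, on="v2", allow.cartesian=TRUE]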

The optimization of .N actually makes the query lightweight, because it does not need to allocate that many rows for the answer. In this particular case we could therefore continue processing rather than stop with:

Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__,  : 
  Join results in more than 2^31 rows (internal vecseq reached physical limit). Very likely misspecified join. Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and data.table issue tracker for advice.
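
Until something like this is supported, the count can be obtained without materializing the join. The following is only a sketch of a user-level workaround, not the change proposed for data.table internals. It counts rows per key in each table and joins the two small count tables. The tables `dx` and `di` and the column `k` are hypothetical toy data, chosen small enough that the ordinary join still runs and can serve as a check.

```r
library(data.table)

# Hypothetical toy tables (not the data from the report above)
dx = data.table(id = 1:6, k = c(1L, 1L, 2L, 2L, 2L, 3L))
di = data.table(id = 1:5, k = c(1L, 2L, 2L, 4L, 4L))

# Reference: the ordinary join, which materializes every result row
ref = dx[di, .N, on = "k", allow.cartesian = TRUE]

# Workaround: one row per key in each table, holding that key's row count
cx = dx[, .(nx = .N), by = k]
ci = di[, .(ni = .N), by = k]

# One row per key of di; nx is NA where the key has no match in dx
cnt = cx[ci, on = "k"]

# Under the default nomatch=NA an unmatched row of di still yields one
# result row, so treat a missing nx as 1. as.numeric() keeps the product
# out of 32-bit integer range.
total = cnt[, sum(as.numeric(ni) * ifelse(is.na(nx), 1, nx))]

stopifnot(total == ref)
```

Because the total is accumulated as a double, it can exceed 2^31 without hitting the integer limit that `vecseq` reports.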
jangorecki added the joins label Apr 5, 2020
jangorecki removed the High label Jun 3, 2020
MichaelChirico added the top request label Apr 14, 2024