Segfault during sorting #2707
Comments
Thanks for reporting. Any chance of a way to reproduce that dataset? It might be specific to the column type and query. I can see the query from the output, but what is |
The table seems to contain strings. Is it possible that it's related to the encoding issue that I've reported? |
Anything is possible. @KyleDKavanagh - it's much harder (and perhaps impossible) if you make us guess. My guess is that it's to do with the |
Apologies for the delayed response - Both query and grpvar are strings. Unfortunately, I can't provide the exact dataset I've been working on as it's proprietary data for my employer. |
Ok no problem. Can you provide |
My guess is that one of "1" or "2" isn't in the data. The fact that
Hopefully @KyleDKavanagh can provide some verbose output similar to the above. But in the meantime this has to be fixed anyway. Kyle, since you can't provide the data, you could type |
Everything below run on v1.10.4-3 after rolling back
Results from str()
Results from uniqueN()
Values for query:
fsort debug - Not sure how helpful this is because of the number of rows
|
Thanks! That's odd as I didn't think v1.10.4-3 would call |
The query contains |
Artifact of obfuscating by hand. Confirmed that it's not picking up any wider-scoped variables. |
Have been looking through, and there is a call to
In a fresh R session, load data.table dev 1.10.5 and then do:
Does it crash? If anyNA(x) is TRUE then that's fine and I know why. If not, then could you send me that file? It's just a vector of row numbers; nothing proprietary. If integers, approx. 5MB. If double, approx. 10MB. So try email first; otherwise please place it online somewhere, or attaching it here on GitHub might work. Thanks. |
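For anyone following along, here is a minimal base-R sketch of the checks mentioned above (element type, presence of NAs, approximate size). It is not the snippet referred to in the comment. The vector is a stand-in, and its length of 1.25 million is an assumption chosen only so that the sizes land near the 5MB / 10MB figures quoted; `describe_vector` is a hypothetical helper name.

```r
# Hypothetical helper: summarise the properties asked about in the thread.
describe_vector <- function(x) {
  list(
    type   = typeof(x),                        # "integer" or "double"
    has_na = anyNA(x),                         # TRUE if any element is NA
    mb     = as.numeric(object.size(x)) / 1e6  # approximate size in MB
  )
}

# Stand-in data (assumption): a shuffled vector of row numbers.
n <- 1250000L
x_int <- sample.int(n)      # stored as integer, 4 bytes per element
x_dbl <- as.numeric(x_int)  # same values stored as double, 8 bytes per element

info_int <- describe_vector(x_int)
info_dbl <- describe_vector(x_dbl)
str(info_int)
str(info_dbl)
```

The double copy is about twice the size of the integer one, which is why the expected file size depends on the storage type.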
Interestingly, the segfault only seems to occur with setDTthreads(0) (rather than the default of 16). Four tests: 2 with unlimited threads (both crashed), 2 with 16 threads (both worked). x was confirmed to be identical for all four tests. x.Rdata is attached as a tgz.
|
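A rough sketch of how such a thread-count comparison could be scripted, for anyone wanting to try a similar test. It assumes the operation under test is `fsort()` on a double vector without NAs, uses randomly generated stand-in data rather than the attached file, and picks 2 threads for the limited case purely for illustration (the tests above used 16). `sort_with_threads` is a hypothetical helper name.

```r
library(data.table)

# Hypothetical helper: run fsort() under a given thread setting, then
# restore whatever setting was active before the call.
sort_with_threads <- function(x, threads) {
  old <- getDTthreads()
  on.exit(setDTthreads(old), add = TRUE)
  setDTthreads(threads)  # 0 means use all logical CPUs
  fsort(x)
}

# Stand-in data (assumption): positive doubles with no NAs.
set.seed(1)
x <- runif(1e5)

res_all <- sort_with_threads(x, 0)  # unlimited threads
res_two <- sort_with_threads(x, 2)  # limited threads

# Both runs should agree with each other and with base sort().
same_across_threads <- identical(res_all, res_two)
same_as_base <- identical(res_all, sort(x))
```

Running each setting in a fresh R session, as was done above, is safer when hunting a segfault, since a crash in one run would otherwise take the whole comparison down with it.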
Perfect! Many thanks. I haven't seen the crash locally yet, but the info that it depends on nThreads helps a lot as that determines the chunk size. Could you run those variations with |
@KyleDKavanagh Could you retry latest dev please, now that PR is merged. I'd put the chance that fixes it at 70%. |
No luck... |
@KyleDKavanagh Ok. Can you provide the verbose=TRUE output please (see above). And your |
Did manage to reproduce a segfault in this area and fixed it 12 days ago, but Kyle reported it didn't work for them. Haven't heard from Kyle in 11 days and I haven't got the information I asked for above. Other memory faults recently fixed in dev could have been at play. |
With v1.10.5 installed from source, I'm newly seeing a segfault during standard group-by/assign operations on massive data.tables.