RStudio and R crashes with fatal error linked to data.table operation #2672
Comments
have you reproduced this on the command line?
On Wed, Mar 14, 2018, 4:01 AM mrmanojrai wrote:
Couple of queries/observations:
1. Do you get the crash with the sample data provided with this issue?
2. Is there any specific reason to use as.Date when lubridate has already been loaded and used?
3. Most of the j operations could have been grouped and executed in one go, since the by argument is the same for all of them (by = eval(groupvector)).
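To illustrate point 3, here is a minimal sketch of collapsing several grouped `:=` calls into one. The table, its column names and the value of `groupvector` are made up for the example, since the original function is not shown in this thread.

```r
library(data.table)

# Hypothetical data: one row per case, with a grouping column and two flags
dt <- data.table(
  grp       = c("a", "a", "b", "b", "b"),
  confirmed = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  died      = c(FALSE, FALSE, TRUE, FALSE, FALSE)
)
groupvector <- "grp"  # assumed: a character vector of grouping column names

# Separate calls: the table is grouped again for every new column
dt_sep <- copy(dt)
dt_sep[, n_cases := .N, by = groupvector]
dt_sep[, n_confirmed := sum(confirmed), by = groupvector]
dt_sep[, n_died := sum(died), by = groupvector]

# Single call: all three columns are added in one grouped pass
dt_one <- copy(dt)
dt_one[, `:=`(
  n_cases     = .N,
  n_confirmed = sum(confirmed),
  n_died      = sum(died)
), by = groupvector]
```

Both versions should give the same columns; the single call is mainly easier to read and avoids repeating the grouping step.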
@mrmanojrai in response to your questions:
@MichaelChirico not sure what you mean by command line - do you mean what happens if I run this in base R rather than RStudio? I will try this and let you know the outcome.
Update: base R crashes with the same error details:
Update 2:
Update 3: On further investigation, by dumping the summarized files to .csv, I was able to determine that the problem was not in summarizing the data, but rather with the call to The issue was fixed in data.table 1.10.5 (development version), and I'm happy to report that after upgrading to 1.10.5 my real function runs on my real data and produces the desired output without crashing R. Although I was unable to reproduce the problem with a MWE, I think that is just because the MWE didn't sufficiently reflect the complexity of my real data set. I will close the issue.
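For anyone hitting a similar crash, a quick way to check whether the installed data.table is at or beyond 1.10.5 (the version reported above as containing the fix) could look like the sketch below. `has_fix` is a hypothetical helper name for this example, not a data.table function.

```r
# Hypothetical helper: TRUE if `version` is at least `fixed_in`
has_fix <- function(version, fixed_in = "1.10.5") {
  package_version(as.character(version)) >= package_version(fixed_in)
}

# Check the installed copy, if data.table is available at all
if (requireNamespace("data.table", quietly = TRUE)) {
  installed <- packageVersion("data.table")
  message("data.table ", as.character(installed),
          if (has_fix(installed)) " includes" else " predates",
          " version 1.10.5")
}
```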
I am attempting to summarize some variables cumulatively by group for each week in which there was new activity in that group, with a data.table line listing as input. This process works fine with a toy version of the function and a small input data set; however with larger data sets and the real (longer) function, R crashes with a fatal error. The details of the crash are here:
Session info is here:
Here is some example data:
Here is a toy version of the function:
And dependent function isoyrwk:
This is how I am applying the sumclusters function to my data:
mytest <- sapply(idlist, sumclusters, data = mydt, simplify = FALSE, USE.NAMES = TRUE)
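Since the function body is not shown here, the following is only a rough sketch of the general pattern described above: a per-group cumulative count with one row per week of new activity, applied over several grouping columns with `sapply(..., simplify = FALSE)`. `sumclusters_sketch`, the column names and the data are all assumptions for illustration and are not the original `sumclusters` or `isoyrwk`. Weeks are labelled by their Monday rather than by an ISO year-week string.

```r
library(data.table)

# Hypothetical line listing: one row per case (not the data from this issue)
mydt <- data.table(
  region   = c("N", "N", "N", "S", "S"),
  district = c("n1", "n1", "n2", "s1", "s1"),
  onset    = as.Date(c("2018-01-01", "2018-01-03", "2018-01-15",
                       "2018-01-02", "2018-01-22"))
)

# Hypothetical stand-in for sumclusters(): cumulative case counts per group,
# with one row for each week in which that group had new cases
sumclusters_sketch <- function(id, data) {
  dt <- copy(data)  # avoid modifying the caller's table by reference
  # Monday of the week containing each onset date ("%u": Monday = 1)
  dt[, week_start := onset - (as.integer(format(onset, "%u")) - 1L)]
  # New cases per group and week, sorted by group then week
  weekly <- dt[, .(new_cases = .N), keyby = c(id, "week_start")]
  # Running total within each group
  weekly[, cum_cases := cumsum(new_cases), by = id]
  weekly[]
}

idlist <- c("region", "district")
mytest <- sapply(idlist, sumclusters_sketch, data = mydt,
                 simplify = FALSE, USE.NAMES = TRUE)
```

With `simplify = FALSE` and `USE.NAMES = TRUE`, the result is a list with one summary table per grouping column, named after the entries of `idlist`.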
Unfortunately I am not able to reproduce the fatal error with the toy data set and toy function. The only difference between the toy function and the real one is that there are more conditional counts on different variables, but the strategy for each one is exactly the same as shown above. I was originally getting a RHS / LHS class discrepancy error but this was discussed and resolved in this Stack Overflow post.
My real input data set is relatively large (2103 rows in the line listing, with reports in 155 weeks and four grouping vectors containing 1271, 144, 108 and 94 groups, respectively).
I think the error might be due to the function timing out because there is too much data (I have 16 GB of RAM and my .Rproj file is on a network drive), but is there any way to confirm this from the above error? Or is it a bug?
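One way to gauge whether data volume is a plausible factor is to measure the memory footprint and run time of a comparable grouped operation on a table of the same size. The sketch below uses randomly generated stand-in data with the row, week and group counts quoted above; the column names are assumptions, and the measurement does not by itself identify the cause of a crash.

```r
library(data.table)

# Hypothetical table with the same number of rows as the line listing (2103),
# 1271 possible groups and 155 possible weeks
set.seed(1)
n <- 2103L
dt <- data.table(
  grp  = sample(sprintf("g%04d", 1:1271), n, replace = TRUE),
  week = sample(1:155, n, replace = TRUE)
)

# Memory footprint of the table
size_bytes <- as.numeric(object.size(dt))
print(format(object.size(dt), units = "auto"))

# Elapsed time for a grouped count followed by a cumulative sum per group
timing <- system.time({
  counts <- dt[, .(n_rows = .N), keyby = .(grp, week)]
  counts[, cum_rows := cumsum(n_rows), by = grp]
})
print(timing[["elapsed"]])
```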
Any insights into why this is causing R to crash and how I could prevent this would be much appreciated - hope I have posted this in the right place as the error details did specify that the fault module name is datatable.dll