comments about data table mutation testing blog #12
Comments
Also, please add a section which discusses the PR that I filed based on the mutation testing results, Rdatatable/data.table#6115
How could someone pick mutants in the future, to maximize their likelihood of picking one that could result in creating a new test/PR?
Yes this/here is perfectly appropriate!
Sure. It's worth investigating some of the mutants (not all, especially since some aren't even within the execution path or are out of coverage), and I think there are a few interesting ones out there. I want to delve deeper into this and create tests/PRs, but it would be helpful for me to have a slightly more detailed example of going about one test case. For example, after you make changes in the C file (or do we even need to do so?), whether you recompile C code manually or just uninstall and install
Some of them don't have output sections (a few do have O/P) since I was just sharing the test cases I wrote, but you're right that it would be useful, or at least consistent, to have them all show output (I'll rerun them and post the results).
Thanks for pointing that out, will do!
I made the suggested changes, can you please check again?
please link the diff
O/P additions and some minor changes aside, it would be these two sections:
headers: typo "rblindlist"
"meaning it’s hard to get construct tests" -> get or construct?
"I will maybe try fuzzing inputs" -> I don't think that is relevant, please delete or clarify your proposition here. Fuzzing is a different kind of testing. With mutation tests we are supposed to be able to examine the mutants to determine new tests/inputs (random generation should not be required).
If you really were using .External, can you please give a detailed example of how that worked? (code + output)
I still don't see an answer to these questions.
In particular, please discuss the mutants below, which also involve changing a binary comparison operator. I managed to create a new test/PR for these, and you did not. Why?
rbindlist
subset
fifelse
my fread
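To make the point about examining a mutant (rather than generating random inputs) concrete, here is a minimal sketch. The two functions are hypothetical stand-ins for an original routine and its comparison-operator mutant; they are not data.table code. Reading the mutated line tells you which input to try: one where the compared values are exactly equal.

```r
# Hypothetical stand-ins for an original routine and its mutant (not data.table code).
count_below        <- function(x, limit) sum(x < limit)   # original: strict comparison
count_below_mutant <- function(x, limit) sum(x <= limit)  # mutant: < changed to <=

# No element equals the limit, so both versions agree and the mutant survives.
survives <- identical(count_below(c(1, 3), 2), count_below_mutant(c(1, 3), 2))

# One element sits exactly on the boundary, so the two versions disagree.
killed <- !identical(count_below(c(1, 2, 3), 2), count_below_mutant(c(1, 2, 3), 2))

cat("mutant survives non-boundary input:", survives, "\n")
cat("mutant killed by boundary input:", killed, "\n")
```

If no reachable input can put a value on the boundary, the mutant may be equivalent to the original, and that is worth stating as the reason no test was written.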
Also, for the rbindlist example, you should mention that it is not possible to have a data table with ncol=0 and nrow>0
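A small sketch of that ncol=0 point, assuming a recent data.table is installed: a base data.frame can keep its rows after all columns are dropped, while a data.table reports zero rows as soon as it has zero columns.

```r
library(data.table)

# Base R: selecting zero columns keeps the three row names.
df <- data.frame(a = 1:3)[, integer(0), drop = FALSE]
df_dim <- dim(df)

# data.table: deleting the only column leaves a null data.table.
dt <- data.table(a = 1:3)
dt[, a := NULL]
dt_dim <- dim(dt)

cat("data.frame dim:", df_dim, "\n")
cat("data.table dim:", dt_dim, "\n")
```

So a mutant that only changes behaviour for an input with no columns but some rows cannot be reached through ordinary data.table inputs.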
Thanks!
construct, Thanks!
I deleted it. It was meant to help with input generation since, as you said, with mutation testing we are trying to 'determine new tests/inputs', and it's kinda tricky to think of different inputs for some of the mutant cases we see.
There is already an example I provided for the
library(data.table)
options(datatable.optimize = 1)

# Compare base R's mean() with data.table's fast mean on the same input.
meanComparison <- function(x, ...)
{
  baseR <- mean(x, ...)
  # Call the C routine through its symbol name.
  fastmean <- .External("Cfastmean", x, ...)
  cat("Results as computed by:\nBase R's mean:", baseR, "\ndata.table's fast mean:", fastmean, "\n")
  fifelse(identical(baseR, fastmean), "Passed", "Failed")
}

# Inputs with very large magnitudes, up to the largest representable double.
testInputs <- list(
  c(rep(1e308, 1e3), rep(-1e308, 1e3)),
  c(rnorm(1e6, mean = 0, sd = 1e5), rep(1, 1e6), .Machine$double.xmax)
)

for (i in seq_along(testInputs))
{
  cat("Test case ", i, ":\n", sep = "")
  cat(meanComparison(testInputs[[i]], na.rm = TRUE), "\n")
}
O/P:
But I did write about why the As for
This I don't have a clear answer to given my inexperience in doing so (but I will try to based on the reasoning above)
Yup that is exactly what I meant (or that since
I made the changes a while ago, please take a look! (and let me know if it answers your questions or not - I'll follow up tomorrow; Conclusions link for quick reference)
great thanks
how can your blog be found?
Good point, and done for both!
hi @Anirban166
this is a comment about https://anirban166.github.io/data%20table/Mutation%20Testing/
I am posting here because it says on this page https://anirban166.github.io/data%20table/ to "post an issue on the repositories I’m working on" including this one.
Can you please add a section Conclusions which summarizes your experience overall?
How many mutants did you investigate? How many resulted in PRs that added new test cases? How many did not? and why not?
What would you do differently next time for investigating the other mutants?
or do you think it is not worth it to investigate the others?
Also, the following is non-standard, so you should explain better why you did this instead of something more standard.
To re-compile R packages, the standard way is just R CMD INSTALL; data.table uses cc(), and RStudio uses devtools::load_all(). You should not have to use R CMD SHLIB or dyn.load.
Also, why .External? (.Call is more common.)
"resorted to using R CMD SHLIB and then R CMD INSTALL to generate the shared object which I then loaded onto the R session via dyn.load. For functions that do not have a function to call them directly in R code, I combine that with a wrapper (using .Call) to call the corresponding C routine from within R (given that there mostly aren’t exported objects in the data.table namespace for some those functions, e.g. fast mean). I then switched to using .External to call C routines via symbol names."
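For reference, a minimal sketch of the standard route after editing a C file, assuming a local source checkout of the package. The helper name reinstall_from_source and the path in the usage comment are hypothetical; the only tools it relies on are R CMD INSTALL (run through system2) and base R namespace functions.

```r
# Hypothetical helper: reinstall a package from a local source checkout
# after editing its C files, then load the freshly built copy.
reinstall_from_source <- function(pkg_dir, pkg_name = basename(pkg_dir)) {
  r_bin <- file.path(R.home("bin"), "R")
  # R CMD INSTALL recompiles src/ and installs the package.
  status <- system2(r_bin, c("CMD", "INSTALL", shQuote(pkg_dir)))
  if (status != 0L) stop("R CMD INSTALL failed for ", pkg_dir)
  # Unload an already loaded copy so the new build is the one attached.
  if (pkg_name %in% loadedNamespaces()) unloadNamespace(pkg_name)
  library(pkg_name, character.only = TRUE)
  invisible(status)
}

# Usage (path is an assumption, adjust to your checkout):
# reinstall_from_source("path/to/data.table")
```

Restarting R before reloading is the safer choice when compiled code changed, since unloading a namespace does not always release the old shared object. devtools::load_all() on the checkout is the other standard option mentioned above.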
Finally in all of the code blocks, it would be useful to see the output of the commands (right now only the code is shown, with no output).
Also please edit the comment in Rdatatable/data.table#6114 to re-categorize the ones you have investigated, which are probably currently under "Not classified yet (TODO)"