-
Notifications
You must be signed in to change notification settings - Fork 997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create rchk GHA #6263
Create rchk GHA #6263
Conversation
This runs surprisingly quickly (3m). Maybe we can just run it every time. |
Generated via commit c5d5241 Download link for the artifact containing the test results: ↓ atime-results.zip Time taken to finish the standard R installation steps: 11 minutes and 47 seconds Time taken to run |
@jangorecki indeed rchk did not like our changes in #4386 😆 |
@kalibera we're getting hit multiple times now with "unsupported form of unprotect, unprotecting all variables" I see that marked as Or is there some different recommendation for how we're doing protect/unprotect in those functions (I have pored over several times & don't see anything unusual...) |
Reading rchk code a bit more, I guess this is just false positive related to hitting the max protect stack depth (64): So we should ignore it. That said, I don't really see why we're hitting a stack depth issue, we only |
The message means that rchk doesn't understand this code, so it cannot check it. I would consider simplifying the code: rchk has been tuned to understand most PROTECT/UNPROTECT patterns present in R and R packages, and making the code simpler for the tool often makes it also simpler cognitively for humans. The FIXME means that indeed, more patterns that those supported by rchk may be correct.
I only had a quick look at your function (in the code being checked by rchk), but I would use a simple protection counter. That is, unprotect using "UNPROTECT(nprotect)" unconditionally and whenever calling PROTECT(), increment also nprotect to reflect the number of pointers protected. |
I haven't tried to debug this case, but I don't think this is related: whether a form of PROTECT/UNPROTECT is supported is not related to the maximum supported pointer protection stack when modeling which variables are protected. Yes, one can ignore the cases when rchk doesn't understand the code (for the price of potentially missing some true errors). One can also simplify the code to use the supported forms of PROTECT/UNPROTECT, which is what I would do in this case. To know exactly what is going on, one needs to look at the assembly (the LLVM IR) of the code being checked, possibly enable diagnostic output in rchk and possibly run in a debugger. In some cases in the past, new problems about "unsupported form" of UNPROTECT etc appeared after changes in LLVM due to which the LLVM IR looked different from what it has been when the corresponding rchk code was implemented. In such a case it is worth fixing in rchk, if possible. If you ran into such a case and have a reproducible example, I am happy to have a look (or have a look at a patch). |
Interestingly we are running into the same error as Looks like a bug in I also stumbled upon #4625 which suggests everything is fine 🔥 |
Yea, I saw that, possibly the docker image is just under-configured. But for {rchk}, all that matters is the C code builds & can be analyzed, which it does; my read is this is a consequence of this note in the rchk README:
|
In the checks you reference, "arrow" fails to build because libarrow is not available:
That needs to be analyzed looking at how the arrow package installs (and, as Michael writes, focus on --libs-only). Re "data.table". First in principle, you might want to have a less stringent check in the container. There is some code that the tool won't understand. In some cases, you probably want to simplify the code, but in some cases it might be too intrusive. You could even have a stoplist for certain functions. It is not a bug of "rchk" that in some cases it is not sure about the code. Any bug finding tool in principle works like that. In all cases, I would though recommend checking the rchk reports time to time manually. Often one can find true errors by looking into functions which rchk cannot analyze. Concretely, in these cases, it seems that the code needs more work updating:
^^^ this function seems to have some UNPROTECT calls using a constant literal without an update to the protection counter ("nprotect"), so the code doesn't seem correct. I would simply UNPROTECT(nprotect) at all local returns from the function and count all protections by incrementing "nprotect".
^^^ there is still a mixture in the code including the protection counter ("nprotect") and the old "listprotect" variable; probably switching fully to "nprotect" would solve the issue. As above, I would increment nprotect at all protections and then UNPROTECT(nprotect) at all local returns.
This is yet another case:
This is an error during the checking, not an error of the checked code. It means rchk cannot analyze some of these functions because it has already exhausted the number of states it is allowed to use to represent the code. This can happen in some complicated loops (with complicated conditions around memory protection functions), etc - and often in cases when you wouldn't want to rewrite the source code, when it is as said a limited expressiveness of the abstractions used for the analyzed code. One can increase the number of states, but it often wouldn't help. In principle, again, such issues are inevitable in bug finding tools based on static analysis. You might want to have a quick look at these functions (those of them from the package), and it may be that you will find some issues in protection, and it certainly may be that after you fix the issues in rbindlist, the problem would disappear. But it may be some of these won't be fixable, and certainly at least those referring to functions from base R. I would ignore those. |
ah, of course very much appreciate the deeper review, thank you for your time! but just to clarify, we are referring to the rchk log having "Could not install package data.table" being the putative bug, not the rchk analysis. sorry for the mixup. |
Merging now to get this going on future PRs. More to do here, as noted, but glad this GHA is "prod-ready" |
Part of #6257.