-
Notifications
You must be signed in to change notification settings - Fork 317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not fail hard if the scanner fails to resolve a revision #8212
Comments
Apparently, the NPM registry's metadata for Unfortunately, this is a rather common problem with NPM packages, where developers publish packages from a local state of the Git repository without ever pushing that state to the remote repository. And at least in the past, the NPM registry did not implement the properly sanity checks to prohibit the publication of such packages. That said, this kind of broken metadata for a published package can be fixed up via curations, which basically allows you to map a package version to a Git revision. Unfortunately, https://github.com/pubkey/unload/tags does not list any tags, so it's probably going to be hard to determine the correct revision. Maybe it's ca56fae861ba7d1178bbbf98a037c1dca893ebf0. Thank the NPM ecosystem for this mess 😭 |
@sschuberth Thank you for the really quick reply. I have suspected as much, since I've noticed those missing commits for some npm packages too. However, it seems like I would have hoped that the scanner can skip packages that have inaccurate information or do not provide a revision at all. In essence I would have to create curation entries for many of the transitive dependency? That's not feasible on any larger project where there can be hundreds or even thousands of them. Is there any other way to work around this, e.g. tell the JGit and the scanner to skip them automatically? |
For
That would be super-dangerous as it would result in incomplete scan results and possibly missing license information. ORT deliberately fails hard here to ensure these issues are being dealt with via ORT configuration means.
That why we have https://github.com/oss-review-toolkit/ort-config/tree/main/curations which already contains hundreds of curations. As a joint community effort, this is feasible to do. If people would simply start to contribute instead of saying it's not feasible 😉
Not right now. The most transparent way to fix this is by contributing curations. However, if #3537 was implemented, that'd be an option as well. |
To lead with good example, I've started to address this in the Open Source way by contributing both a fix for future |
Though we could discuss to make these issues (of |
I was under the impression that using
Obviously it should not fail silently and report errors. In my opinion the missing or inaccurate revisions on some packages should not result in the scan to fails as a whole. Perhaps the exception handling could be adjusted as you've suggested yourself. From my point of view the information gathered by the scanner are an addition to the information already found by the analyzer. If it is not possible for some then that is ok, as long as the user is aware of it not being able to retrieve additional data for a package.
I appreciate the FOSS spirit, I've been working on Blender and other OSS software before, but if I suggest that approach to my coworker on a project with 2400 transitive dependencies, then I'm afraid he won't like me very much. Over all it is a good idea to collect that information, but it would be great to have a user friendly way to skip or exclude them when needed without having to manually list every single package. As you've said yourself, in some cases the commit may be missing entirely from the public repo and as such no truly correct remediation exists (short of manually evaluating the repository, but even then the exact commit might not be clear).
Could be, some form a transitive depth filtering might make sense here as well.
Thank you!
That would really help, because then it could at least run successfully and include data on dependencies it did find, such as the one directly listed in the package.json. |
Let's phrase it like this: I'm not sure whether also JGit picks it up correctly, and whether you have configured at the right place (esp. if you run ORT via Docker).
Ping @mnonnenmacher as a reminder. |
Any opinion here @mnonnenmacher? |
The provenance resolver throws an exception but the scanner catches that and turns it into an issue. It does not abort on provenance resolutions failures. |
But doesn't it look different from the console output? Are you saying this issue is invalid? |
At least I cannot remember seeing a failed scanner run because of a provenance resolution error and the code looks correct to me. So we would need steps to reproduce the issue. |
@robertguetzkow as we're on ORT 41.0.0 by now, and many things have changed, could you please provide feedback whether you can still reproduce the issue, and provide the necessary steps to do so? |
@sschuberth This was done during a tool evaluation at my employer. I currently do not have the capacity to retry this at work. If I do get a chance to test with a current version, I will let you know. |
Thanks for the response @robertguetzkow. Let me close this then for now to clean up our backlog. On a related note, if you can share some feedback from your evaluation, feel free to start a discussion. |
Allow to use the package metadata for anonymous clones, which solves issues like the one described at [1]. For convenience, use the short-hand notation described at [2]. [1]: oss-review-toolkit/ort#8212 [2]: https://docs.npmjs.com/cli/v10/configuring-npm/package-json#repository
Allow to use the package metadata for anonymous clones, which solves issues like the one described at [1]. For convenience, use the short-hand notation described at [2]. [1]: oss-review-toolkit/ort#8212 [2]: https://docs.npmjs.com/cli/v10/configuring-npm/package-json#repository
I am testing ORT on a simple
package.json
that contains one dependency. Everything works correctly, except for the scanning based on ScanCode. JGit cannot checkout the correct revision of transitive dependencies, seemingly because the analyzer has either not found a particular revision or it is given a commit hash but does not find it.I'm currently testing this in combination with the ort-ci-gitlab pipeline. I have modified the
ort-scan.yml
to use the minimal container image, activate the scanner and use local file storage instead of a PostgreSQL database.Could you please help me identify whether my configuration is incorrect or if this is a bug?
package.json
ort-test.zip
The text was updated successfully, but these errors were encountered: