-
Notifications
You must be signed in to change notification settings - Fork 315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend ScanSummary to be able to return findings from snippet scanners #6791
Extend ScanSummary to be able to return findings from snippet scanners #6791
Conversation
190be3e
to
d7fb3a7
Compare
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #6791 +/- ##
============================================
- Coverage 64.34% 64.28% -0.07%
- Complexity 1960 1976 +16
============================================
Files 327 331 +4
Lines 16518 16717 +199
Branches 2361 2392 +31
============================================
+ Hits 10629 10746 +117
- Misses 4870 4930 +60
- Partials 1019 1041 +22
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
d7fb3a7
to
881adae
Compare
881adae
to
14ce396
Compare
14ce396
to
707c281
Compare
@oss-review-toolkit/core-devs I currently don't have time to review this. I'd prefer to not merge this until we had a chance to discuss this at least in a Kotlin developer meeting after the Easter vacation. |
6ed6e8e
to
6dc270a
Compare
09f2b20
to
8ba30e4
Compare
8ba30e4
to
23d5a5d
Compare
@nnobelis is this up for review? / should it be turned into draft until ready? |
@fviernau It is ready for review |
23d5a5d
to
15594fe
Compare
This reduces code duplication and will allow to create single issue summaries in other locations of the FossID scanner. Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
ac74720
to
c8af208
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mnonnenmacher and @fviernau, IMO this is close to ready for being merged. Please also have a look.
/** | ||
* The provenance of the snippet, either an artifact or a repository. | ||
*/ | ||
val provenance: Provenance, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this property is not really used yet, I'm wondering how this could be an (unknown) provenance. Does one of the snippet scanners really return snippets where it doesn't know where this comes from? If so, could you maybe explain this in the docs above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UnknownProvenance
can happen when FossID url
property is null
: see https://github.com/bosch-io/oss-review-toolkit/blob/cbc984df1bf131a80c32cea8a224cbec9d22f86c/scanner/src/main/kotlin/scanners/fossid/FossId.kt#L825.
Once again we are paying the price of the lack of data model for FossID....
However, in the various responses I get from the FossID API service, it seems the url is never null
so maybe we were too cautious making the property nullable in the first place.
What do you think ? Should we make the property non nullable (in a separte commit) and then replace the UnknownProvenance
by a KnownProvenance
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make the property non nullable (in a separte commit) and then replace the
UnknownProvenance
by aKnownProvenance
?
I'd love to be able to do that, yes. Could you try to reach our to FossId to learn if a non-null URL is a valid assumption?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I opened a FossID support ticket 👍
PackageProvider.PACKAGIST -> PurlType.COMPOSER | ||
PackageProvider.PYPI -> PurlType.PYPI | ||
PackageProvider.RUBYGEMS -> PurlType.GEM | ||
null -> null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just drop this and rely on the else
below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added that because the tests were broken. As they use pseudo URLs, PackageProvider.get
returned null
and an issue was filled.
However I think filling an issue when the URL cannot be mapped is good practice so I will fix the tests instead.
c8af208
to
fb0cc4e
Compare
|
||
val allFindings = mutableSetOf<SnippetFinding>() | ||
|
||
findingsByPath.map { (path, findings) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use forEach
instead as you don't use the result. You could also use buildSet
:
return buildSet {
findingsByPath.forEach { (path, findings) ->
...
addAll(findings.map { ... })
}
allFindings += findings.map { finding -> | ||
val newPath = "$prefix${finding.sourceLocation.path}" | ||
finding.copy(sourceLocation = finding.sourceLocation.copy(path = newPath)) | ||
}.toSet() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove toSet()
because there is no need to convert to set before adding the collection to a set.
Snippet scanners such as ScanOSS [1] and FossID [2] can identify code snippets potentially coming from a third party source. To do so, they scan the Internet for source code and build a Knowledge Base (KB). Then, the source code to check for snippets is scanned and compared against this KB. Snippet Findings are difference in nature from License and Copyright findings as they reference a third party sourcecode. Therefore, this commit adds a new property ORT data model in the `ScanSummary` to carry these snippet findings. This model has been created by comparing the results from FossID and ScanOSS and trying to find a common abstraction. This is currently the minimal model required to handle snippets. Further properties will be added in the future. Blackduck [3] is another scanner considered for integration in ORT [4] which supports snippets. However since it does not deliver snippets through its API, it was not considered when designing the snippet data model for ORT. Fixes: oss-review-toolkit#3265. [1]: https://www.scanoss.com/ [2]: https://fossid.com/ [3]: https://www.synopsys.com/software-integrity/security-testing/software-composition-analysis.html [4]: oss-review-toolkit#4632 Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
FossID returns packages containing URLs of these two websites. Since FossID does not natively support Purls, a mapping needs to be done from the native URL to the Purl, using the `PackageProvider` enum. Such mapping will be added in a future commit. See https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst. Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
When FossId identifies a file matching snippets, it is a pending file. An operator needs to log to FossID UI and use the license of a snippet or manually enter difference license information. Then the file is marked as "identified". Currently, the FossID scanner in ORT returns the list of all pending files in `ScanSummary` issues, with a severity of `HINT`. This commit maps the snippets of pending files using the newly-created snippet data model. The pending files are still listed as issues: This will be removed in a future commit as it is a breaking change. Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
This commits maps the snippets in a ScanOSS response using the newly-created snippet data model. Please note that the snippet's license in the test data file has been manipulated to be a license not present in the other identifications of this file. This allows to demonstrate that license findings and snippet findings are disjoint in ORT, even if they are returned together by ScanOSS. Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
fb0cc4e
to
d1fb1da
Compare
Comments were addressed, partly in a meeting.
This is a follow-up of the discussion at [1]: Since FossID does not document the data model of the API response, the `url` property of the `Snippet` has been defensively made nullable. Since then, FossID support confirmed that the property is never `null` in the API responses: While the database schema indeed has this property as nullable, the CLI tool used by the scanner guarantee it is never null. [1]: oss-review-toolkit#6791 (comment) Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
This is a follow-up of the discussion at [1]: Since FossID does not document the data model of the API response, the `url` property of the `Snippet` has been conservatively made nullable. Since then, FossID support confirmed that the property is never `null` in the API responses: While the database schema indeed has this property as nullable, the CLI tool used by the scanner guarantees it is never null. [1]: oss-review-toolkit#6791 (comment) Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
This is a follow-up of the discussion at [1]: Since FossID does not document the data model of the API response, the `url` property of the `Snippet` has been conservatively made nullable. Since then, FossID support confirmed that a null URL will never be exposed via the API response. [1]: oss-review-toolkit#6791 (comment) Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
This is a follow-up of the discussion at [1]: Since FossID does not document the data model of the API response, the `url` property of the `Snippet` has been conservatively made nullable. Since then, FossID support confirmed that a null URL will never be exposed via the API response. [1]: #6791 (comment) Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
This pull request extends ORT Data Model to be able to capture snippet from snippets scanner such as FossID or ScanOSS.
Please have a look at the individual commit messages for the details.
Fixes: #3265.