Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend ScanSummary to be able to return findings from snippet scanners #6791

Merged

Conversation

nnobelis
Copy link
Member

@nnobelis nnobelis commented Apr 3, 2023

This pull request extends ORT Data Model to be able to capture snippet from snippets scanner such as FossID or ScanOSS.

Please have a look at the individual commit messages for the details.

Fixes: #3265.

@nnobelis nnobelis requested a review from a team as a code owner April 3, 2023 08:31
@nnobelis nnobelis force-pushed the nnobelis/snippet_model branch 2 times, most recently from 190be3e to d7fb3a7 Compare April 3, 2023 09:30
@codecov
Copy link

codecov bot commented Apr 3, 2023

Codecov Report

Patch coverage: 64.92% and project coverage change: -0.07 ⚠️

Comparison is base (93d0c66) 64.34% compared to head (d1fb1da) 64.28%.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6791      +/-   ##
============================================
- Coverage     64.34%   64.28%   -0.07%     
- Complexity     1960     1976      +16     
============================================
  Files           327      331       +4     
  Lines         16518    16717     +199     
  Branches       2361     2392      +31     
============================================
+ Hits          10629    10746     +117     
- Misses         4870     4930      +60     
- Partials       1019     1041      +22     
Flag Coverage Δ
funTest-docker 58.97% <ø> (ø)
funTest-non-docker 29.78% <8.20%> (-0.46%) ⬇️
test 40.09% <64.92%> (+0.40%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
model/src/main/kotlin/utils/Snippet.kt 0.00% <0.00%> (ø)
model/src/main/kotlin/utils/SnippetFinding.kt 0.00% <0.00%> (ø)
model/src/main/kotlin/utils/SortedSetConverters.kt 0.00% <0.00%> (ø)
scanner/src/main/kotlin/scanners/fossid/FossId.kt 74.87% <64.78%> (-2.48%) ⬇️
...ain/kotlin/scanners/scanoss/ScanOssResultParser.kt 82.50% <75.00%> (-6.64%) ⬇️
...in/kotlin/provenance/NestedProvenanceScanResult.kt 83.16% <85.71%> (+0.40%) ⬆️
...c/main/kotlin/scanners/fossid/FossIdScanResults.kt 77.14% <100.00%> (+0.67%) ⬆️

... and 4 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@nnobelis nnobelis force-pushed the nnobelis/snippet_model branch from d7fb3a7 to 881adae Compare April 3, 2023 09:57
model/src/main/kotlin/utils/SnippetFinding.kt Outdated Show resolved Hide resolved
@nnobelis nnobelis force-pushed the nnobelis/snippet_model branch from 881adae to 14ce396 Compare April 4, 2023 11:43
@nnobelis nnobelis requested a review from fviernau April 4, 2023 11:45
@nnobelis nnobelis force-pushed the nnobelis/snippet_model branch from 14ce396 to 707c281 Compare April 5, 2023 05:02
@sschuberth
Copy link
Member

@oss-review-toolkit/core-devs I currently don't have time to review this. I'd prefer to not merge this until we had a chance to discuss this at least in a Kotlin developer meeting after the Easter vacation.

@nnobelis nnobelis force-pushed the nnobelis/snippet_model branch 3 times, most recently from 6ed6e8e to 6dc270a Compare April 12, 2023 14:04
@nnobelis nnobelis force-pushed the nnobelis/snippet_model branch 2 times, most recently from 09f2b20 to 8ba30e4 Compare April 24, 2023 11:24
@nnobelis nnobelis force-pushed the nnobelis/snippet_model branch from 8ba30e4 to 23d5a5d Compare May 8, 2023 06:36
@nnobelis nnobelis requested review from tsteenbe and sschuberth May 8, 2023 07:01
@fviernau
Copy link
Member

fviernau commented May 8, 2023

@nnobelis is this up for review? / should it be turned into draft until ready?

@nnobelis
Copy link
Member Author

nnobelis commented May 8, 2023

@fviernau It is ready for review

model/src/main/kotlin/ScanSummary.kt Show resolved Hide resolved
model/src/main/kotlin/utils/Snippet.kt Outdated Show resolved Hide resolved
model/src/main/kotlin/utils/Snippet.kt Outdated Show resolved Hide resolved
model/src/main/kotlin/utils/Snippet.kt Outdated Show resolved Hide resolved
scanner/src/main/kotlin/scanners/fossid/FossId.kt Outdated Show resolved Hide resolved
scanner/src/main/kotlin/scanners/fossid/FossId.kt Outdated Show resolved Hide resolved
scanner/src/main/kotlin/scanners/fossid/FossId.kt Outdated Show resolved Hide resolved
@nnobelis nnobelis force-pushed the nnobelis/snippet_model branch from 23d5a5d to 15594fe Compare May 11, 2023 06:01
This reduces code duplication and will allow to create single issue
summaries in other locations of the FossID scanner.

Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
@nnobelis nnobelis force-pushed the nnobelis/snippet_model branch 3 times, most recently from ac74720 to c8af208 Compare May 11, 2023 10:20
Copy link
Member

@sschuberth sschuberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mnonnenmacher and @fviernau, IMO this is close to ready for being merged. Please also have a look.

/**
* The provenance of the snippet, either an artifact or a repository.
*/
val provenance: Provenance,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this property is not really used yet, I'm wondering how this could be an (unknown) provenance. Does one of the snippet scanners really return snippets where it doesn't know where this comes from? If so, could you maybe explain this in the docs above?

Copy link
Member Author

@nnobelis nnobelis May 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnknownProvenance can happen when FossID url property is null: see https://github.com/bosch-io/oss-review-toolkit/blob/cbc984df1bf131a80c32cea8a224cbec9d22f86c/scanner/src/main/kotlin/scanners/fossid/FossId.kt#L825.

Once again we are paying the price of the lack of data model for FossID....

However, in the various responses I get from the FossID API service, it seems the url is never null so maybe we were too cautious making the property nullable in the first place.
What do you think ? Should we make the property non nullable (in a separte commit) and then replace the UnknownProvenance by a KnownProvenance ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make the property non nullable (in a separte commit) and then replace the UnknownProvenance by a KnownProvenance ?

I'd love to be able to do that, yes. Could you try to reach our to FossId to learn if a non-null URL is a valid assumption?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I opened a FossID support ticket 👍

model/src/main/kotlin/ScanSummary.kt Outdated Show resolved Hide resolved
model/src/main/kotlin/utils/Extensions.kt Show resolved Hide resolved
scanner/src/main/kotlin/scanners/fossid/FossId.kt Outdated Show resolved Hide resolved
scanner/src/main/kotlin/scanners/fossid/FossId.kt Outdated Show resolved Hide resolved
scanner/src/main/kotlin/scanners/fossid/FossId.kt Outdated Show resolved Hide resolved
PackageProvider.PACKAGIST -> PurlType.COMPOSER
PackageProvider.PYPI -> PurlType.PYPI
PackageProvider.RUBYGEMS -> PurlType.GEM
null -> null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just drop this and rely on the else below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added that because the tests were broken. As they use pseudo URLs, PackageProvider.get returned null and an issue was filled.
However I think filling an issue when the URL cannot be mapped is good practice so I will fix the tests instead.

scanner/src/main/kotlin/scanners/fossid/FossId.kt Outdated Show resolved Hide resolved
@nnobelis nnobelis force-pushed the nnobelis/snippet_model branch from c8af208 to fb0cc4e Compare May 11, 2023 12:28
@nnobelis nnobelis requested a review from sschuberth May 11, 2023 13:41
sschuberth
sschuberth previously approved these changes May 11, 2023

val allFindings = mutableSetOf<SnippetFinding>()

findingsByPath.map { (path, findings) ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use forEach instead as you don't use the result. You could also use buildSet:

return buildSet {
  findingsByPath.forEach { (path, findings) ->
    ...
    addAll(findings.map { ... })
}

allFindings += findings.map { finding ->
val newPath = "$prefix${finding.sourceLocation.path}"
finding.copy(sourceLocation = finding.sourceLocation.copy(path = newPath))
}.toSet()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove toSet() because there is no need to convert to set before adding the collection to a set.

model/src/main/kotlin/utils/Extensions.kt Show resolved Hide resolved
nnobelis added 4 commits May 12, 2023 14:17
Snippet scanners such as ScanOSS [1] and FossID [2] can identify code
snippets potentially coming from a third party source. To do so, they
scan the Internet for source code and build a Knowledge Base (KB). Then,
the source code to check for snippets is scanned and compared against
this KB.

Snippet Findings are difference in nature from License and Copyright
findings as they reference a third party sourcecode.

Therefore, this commit adds a new property ORT data model in the
`ScanSummary` to carry these snippet findings. This model has been
created by comparing the results from FossID and ScanOSS and trying to
find a common abstraction. This is currently the minimal model required
to handle snippets. Further properties will be added in the future.

Blackduck [3] is another scanner considered for integration in ORT [4]
which supports snippets. However since it does not deliver snippets
through its API, it was not considered when designing the snippet data
model for ORT.

Fixes: oss-review-toolkit#3265.

[1]: https://www.scanoss.com/
[2]: https://fossid.com/
[3]: https://www.synopsys.com/software-integrity/security-testing/software-composition-analysis.html
[4]: oss-review-toolkit#4632

Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
FossID returns packages containing URLs of these two websites. Since
FossID does not natively support Purls, a mapping needs to be done from
the native URL to the Purl, using the `PackageProvider` enum.
Such mapping will be added in a future commit.

See https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst.

Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
When FossId identifies a file matching snippets, it is a pending file.
An operator needs to log to FossID UI and use the license of a snippet
or manually enter difference license information. Then the file is
marked as "identified".

Currently, the FossID scanner in ORT returns the list of all pending
files in `ScanSummary` issues, with a severity of `HINT`. This commit
maps the snippets of pending files using the newly-created snippet data
model.

The pending files are still listed as issues: This will be removed in a
future commit as it is a breaking change.

Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
This commits maps the snippets in a ScanOSS response using the
newly-created snippet data model.

Please note that the snippet's license in the test data file has been
manipulated to be a license not present in the other identifications
of this file. This allows to demonstrate that license findings and
snippet findings are disjoint in ORT, even if they are returned together
by ScanOSS.

Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
@sschuberth sschuberth dismissed tsteenbe’s stale review May 12, 2023 12:45

Comments were addressed, partly in a meeting.

@MarcelBochtler MarcelBochtler merged commit b72b5af into oss-review-toolkit:main May 15, 2023
@MarcelBochtler MarcelBochtler deleted the nnobelis/snippet_model branch May 15, 2023 07:35
nnobelis added a commit to boschglobal/oss-review-toolkit that referenced this pull request May 19, 2023
This is a follow-up of the discussion at [1]: Since FossID does not
document the data model of the API response, the `url` property of the
`Snippet` has been defensively made nullable.
Since then, FossID support confirmed that the property is never `null`
in the API responses: While the database schema indeed has this property
as nullable, the CLI tool used by the scanner guarantee it is never
null.

[1]: oss-review-toolkit#6791 (comment)

Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
nnobelis added a commit to boschglobal/oss-review-toolkit that referenced this pull request May 19, 2023
This is a follow-up of the discussion at [1]: Since FossID does not
document the data model of the API response, the `url` property of the
`Snippet` has been conservatively made nullable.
Since then, FossID support confirmed that the property is never `null`
in the API responses: While the database schema indeed has this property
as nullable, the CLI tool used by the scanner guarantees it is never
null.

[1]: oss-review-toolkit#6791 (comment)

Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
nnobelis added a commit to boschglobal/oss-review-toolkit that referenced this pull request May 19, 2023
This is a follow-up of the discussion at [1]: Since FossID does not
document the data model of the API response, the `url` property of the
`Snippet` has been conservatively made nullable.
Since then, FossID support confirmed that a null URL will never be
exposed via the API response.

[1]: oss-review-toolkit#6791 (comment)

Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
sschuberth pushed a commit that referenced this pull request May 19, 2023
This is a follow-up of the discussion at [1]: Since FossID does not
document the data model of the API response, the `url` property of the
`Snippet` has been conservatively made nullable.
Since then, FossID support confirmed that a null URL will never be
exposed via the API response.

[1]: #6791 (comment)

Signed-off-by: Nicolas Nobelis <nicolas.nobelis@bosch.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extend ScanSummary to be able to return findings from snippet scanners
6 participants