Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

helper-cli: Add a command to generate a flat analyzer result #7305

Merged
merged 2 commits into from
Sep 12, 2023

Conversation

fviernau
Copy link
Member

@fviernau fviernau commented Jul 18, 2023

The command defines a data model called SimpleDefinitionFile which is
read as input using any of the known file extensions. It outputs an
analyzer result with a single project containing the provided
dependencies as a flat list.

When a project does not use a package manager, currently the only way of
scanning it with ORT is to craft SPDX files and running the analyzer
against them. This command tries to provide a minimalistic alternative
to fullfil the requirements of the underlying need.

Note: This is a first experimental version. Furthermore, VcsInfo is deliberately not used, in order to:

      1. avoid nesting to in turn make it easier to generate the input file.
      2. make all attributes optional.

@fviernau fviernau requested a review from a team as a code owner July 18, 2023 15:02
@codecov
Copy link

codecov bot commented Jul 18, 2023

Codecov Report

Patch coverage: 88.54% and project coverage change: +7.07% 🎉

Comparison is base (72a5644) 60.93% compared to head (3f8030f) 68.01%.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #7305      +/-   ##
============================================
+ Coverage     60.93%   68.01%   +7.07%     
- Complexity     1968     2023      +55     
============================================
  Files           338      339       +1     
  Lines         16623    16719      +96     
  Branches       2363     2371       +8     
============================================
+ Hits          10129    11371    +1242     
+ Misses         5516     4363    -1153     
- Partials        978      985       +7     
Flag Coverage Δ
funTest-docker 69.33% <ø> (ø)
funTest-non-docker 36.46% <88.54%> (+6.98%) ⬆️
test 36.10% <0.00%> (-0.40%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
...ands/CreateAnalyzerResultFromPackageListCommand.kt 88.54% <88.54%> (ø)

... and 52 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch 3 times, most recently from 37529d0 to 9084ad9 Compare July 19, 2023 09:03
@sschuberth
Copy link
Member

This command tries to provide a minimalistic alternative
to fullfil the requirements of the underlying need.

Typo in "fullfil", should be "fulfill".

In general this sentence is a bit cryptic. Please be transparent about what the "underlying need" is and document it in the commit message.

@sschuberth
Copy link
Member

sschuberth commented Jul 19, 2023

currently the only way of
scanning it with ORT is to craft SPDX files and running the analyzer
against them.

I want to start by saying that I agree that writing / generating SPDX files as a substitute for a lack of supported definition files for ORT is cumbersome, mostly because SPDX is such a terrible format IMO. Maybe we should have instead used AboutCode Data or an ORT-specific "fake definition file" format that's easier to use than writing whole analyzer result files from hand.

That said, I believe this goal of this PR should be implemented differently: As a company-internal package manager plugin that uses your SimpleDefinitionFile as definition files. How do @mnonnenmacher and @MarcelBochtler think about this?

@fviernau
Copy link
Member Author

fviernau commented Jul 19, 2023

That said, I believe this goal of this PR should be implemented differently: As a company-internal package manager plugin that uses your SimpleDefinitionFile as definition files

  1. I thought it makes sense to try it out as helper-cli command first before turning it into a package manager plugin, as I believe it is quicker to develop and more light-weight. It is also slightly different than package managers, as it is limited to a single project and it does not require any cloning of any repo for the analysis. Anyhow, I planned to refactor this in future when it matured a bit, into a package manager (disabled by default) if it turns out to make sense.
  2. Why do you think it should remain company internal? Do you think nobody would be interrested in using it in the long run?

@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch from 9084ad9 to dfd5531 Compare July 19, 2023 10:14
@sschuberth
Copy link
Member

sschuberth commented Jul 19, 2023

  1. Why do you think it should remain company internal?

Mostly because of the "I believe it is quicker to develop" rationale you mentioned: Doing this as a company-internal package manager plugin (which is something e.g. Bosch has done, too) allows you to fulfill your custom needs quicker without the need for justifications during a review process. And you still keep the option to open-source it later on.

Do you think nobody would be interrested in using it in the long run?

Hard to say for the long term if more features / generalizations get added, but in its current shape it indeed looks like a very specific solution that probably no one except EPAM is interested in.

@fviernau
Copy link
Member Author

Mostly because of the "I believe it is quicker to develop" rationale you mentioned

I think we've added "experimental" things to ORT / helper-cli already before this PR.

@sschuberth
Copy link
Member

Mostly because of the "I believe it is quicker to develop" rationale you mentioned

I think we've added "experimental" things to ORT / helper-cli already before this PR.

Right, but this is not about the level of "experimentality", but about where the code fits, whatever the maturity is. And to me a dedicated package manager so clearly is a better fit (if I'm getting the use-case right), that personally I'd draw a line here. Otherwise Bosch, Porsche and others could have contributed their plugins also as helper-cli commands (but we've turned them down).

@tsteenbe
Copy link
Member

tsteenbe commented Jul 19, 2023

Frank forgot to mention we also want to use this new helper functionality to make it more easier to implement org-wide inbound FOSS scanning or pre-scanning.

Org-wide inbound FOSS scanning is something we started working whilst at HERE, the idea is to scan a FOSS package the moment it's introduced into the organization.

Scenario: Developer does dependencies update in his/her SW project -> packages are being fetched from internal package registry (JFrog Artifactory) -> Artifactory does not have version so retrieves package from public package registry (Maven Central, npmjs.com) and at the same time trigger inbound FOSS scan for each new version via REST API providing package url and VCS url -> inbound FOSS scan executes helper cli command to generate analyzer file defined in this PR -> Generated analyzer result is used to execute ORT scan with ScanCode -> Scan results are added to defined ORT scan storage -> when developer create pull/merge request ORT scan is done and new version of dependency is already scanned -> scan is quick resulting in better user experience and engineer productivity. If new version or package is not OK by ORT policy rules than one can configure Artifactory to block download pointing to the ORT report for details.

Similarly one can also implement pre-scanning - from Artifactory usage stats one can deduct most used FOSS package and then logic can be implement to trigger inbound FOSS ORT scan the moment package popular within the org release a new version.

@sschuberth
Copy link
Member

inbound FOSS scan executes helper cli command to generate analyzer file defined in this PR

How's the required SimpleDefinitionFile input provided in that scenario?

@mnonnenmacher
Copy link
Member

That said, I believe this goal of this PR should be implemented differently: As a company-internal package manager plugin that uses your SimpleDefinitionFile as definition files. How do @mnonnenmacher and @MarcelBochtler think about this?

That workflow makes sense to me, we have used it at Bosch as well, for example to implement and test snippet reporter related changes internally before contributing them upstream. This provides a lot of freedom for experiments (because no code reviews are required, for example) and makes it easier to reach a mature state that is worth contributing.

I think we've added "experimental" things to ORT / helper-cli already before this PR.

Overall I am not very happy with the state of the helper-cli, because also experimental commands still have to be maintained by the community, and because there are almost no tests. Adding more experiments doesn't improve the situation. Also, I have no idea how many of the commands are still used by anyone or if we maintain them just for the sake of it.

Why do you think it should remain company internal? Do you think nobody would be interested in using it in the long run?

At Bosch we have a very strong requirement for an alternative to the SPDX analyzer, because it makes some very unintuitive assumptions, it feels like 90% of the SPDX that has to be written is boilerplate, and it also does not support all ORT features. We have that topic on our Agenda for later this year, I would appreciate if we could discuss it in a community meeting after the summer break to collect requirements.

@tsteenbe
Copy link
Member

inbound FOSS scan executes helper cli command to generate analyzer file defined in this PR

How's the required SimpleDefinitionFile input provided in that scenario?

SimpleDefinitionFile can be used to batch scanning multiple packages, say for instance we need to scan an update of framework or SDK consisting out of multiple packages each with their own code repository. Also from infrastructure efficiency/limits perspective batching scan jobs together where possible makes sense to A) reduce impact of overhead of starting ORT Docker container and B) stay within limit of execution environment (# of concurrrent jobs / job pool availability).

@fviernau
Copy link
Member Author

I would appreciate if we could discuss it in a community meeting after the summer break to collect requirements.

I needed something quick in place to help during the next couple of weeks. I understand you guys don't want that, I'll just turn the PR into a draft and keep the branch around for while.

@fviernau fviernau marked this pull request as draft July 19, 2023 18:26
@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch from dfd5531 to 0243449 Compare July 20, 2023 11:22
@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch 2 times, most recently from 1209e92 to 3cf2d2a Compare July 20, 2023 21:56
@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch 2 times, most recently from 28dae09 to 5c9c77b Compare August 18, 2023 12:55
@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch 2 times, most recently from e5317f3 to 956f08b Compare September 6, 2023 09:14
@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch 2 times, most recently from b253e95 to 2452324 Compare September 6, 2023 09:20
@fviernau
Copy link
Member Author

fviernau commented Sep 6, 2023

I've added a test which tests end to end through helper main, which illustrates the input format (and the expected output).

@fviernau fviernau added the enhancement Issues that are considered to be enhancements label Sep 6, 2023
@sschuberth
Copy link
Member

Please note the detekt issues (and rebase to get rid of an unrelated test failure).

@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch 2 times, most recently from 36b52aa to 6e340da Compare September 6, 2023 11:36
@@ -0,0 +1,23 @@
---
name: "Example project name"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice to have this example now as part of the tests, thank you! This makes some stuff immediately visible to me that I wouldn't have noticed otherwise.

helper-cli/src/funTest/assets/package-list.yml Outdated Show resolved Hide resolved
helper-cli/src/funTest/assets/package-list.yml Outdated Show resolved Hide resolved
vcs_revision: "0000000000000000000000000000000000000000"
vcs_path: "vcs-path/dependency-one"
source_artifact_url: "https://example.org/example-dependency-one.zip"
is_excluded: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting: We usually do not support the exclusion of a dependency. I need to look further down to see how this is implemented then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one scope for excluded dependencies and another for non-excluded ones.

vcs_path: "vcs-path/dependency-one"
source_artifact_url: "https://example.org/example-dependency-one.zip"
is_excluded: true
is_dynamically_linked: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance that we might want to support sub-projects as dependencies? If so, I'd prefer this to be a regular PackageLinkage enum value.

Copy link
Member Author

@fviernau fviernau Sep 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command targets the use case of doing a quick (as possible) scan and clearance for an arbitrary set of packages. In that use case it is unneeded. I also do not foresee to have subprojects.

From our discussion in the developer meeting, what I understood was, (from @mnonnenmacher) there is need for an analyzer which is a bit simpler than SPDX.
And that he preferred to have my simple use case be a helper-cli command, so that this simple use case would not conflict with @mnonnenmacher analyzer use case. So, given that I prefer we keep this helper-command to a minimum and add the more complex cases (like sub projects) to @mnonnenmacher 's analyzer use case if it fits there?!

@fviernau fviernau requested a review from sschuberth September 7, 2023 06:40
@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch from 6e340da to 6c5fa55 Compare September 11, 2023 07:34
@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch from 6c5fa55 to 995bdfe Compare September 11, 2023 07:51
@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch from 995bdfe to c05897c Compare September 12, 2023 10:30
The command defines a data model called `PackageList` which is
read as input using any of the known file extensions. It outputs an
analyzer result with a single project containing the provided
dependencies as a flat list.

When a project does not use a package manager, currently the only way of
scanning it with ORT is to craft SPDX files and running the analyzer
against them. This command tries to provide a minimalistic alternative
to fulfill the need to scan any project which does not use any package
manager (supported by ORT). The package metadata holds attributes which
are usually relevant for license clearance, namely the excluded state
and the type of linkage. So, it can be useful to do license clearance
for arbitrary sets of packages. This is for now limited to detected
license clearance, as providing other attributes like e.g. the declared
licenses is not (yet) implemented.

Signed-off-by: Frank Viernau <frank_viernau@epam.com>
As the analyzer is responsible for adding all package curations from all
configurated providers to the resolved configuration in the ORT file, do
so as well when generating the analyzer result from a package list file.

This allows to use the standard package curation workflow with
`CreateAnalyzerResultFromPackageListCommand`.

Signed-off-by: Frank Viernau <frank_viernau@epam.com>
@fviernau fviernau force-pushed the helper-create-ort-file-from-simple-definition-file branch from c05897c to 3f8030f Compare September 12, 2023 10:31
@fviernau fviernau merged commit e6a57c0 into main Sep 12, 2023
@fviernau fviernau deleted the helper-create-ort-file-from-simple-definition-file branch September 12, 2023 12:20
@sschuberth sschuberth added new feature Issues that are considered to be new features and removed enhancement Issues that are considered to be enhancements labels Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
helper-cli About the "helper" Command Line Interface new feature Issues that are considered to be new features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants