ADR 44: describing support of SDPX SBOM format in konflux #213

Merged
merged 16 commits into from
Dec 2, 2024
162 changes: 162 additions & 0 deletions ADR/0044-spdx-support.md
@@ -0,0 +1,162 @@
# 44. SPDX SBOM support

Date: 2024-09-24

## Status

Proposed

## Glossary
* SBOM - Software Bill of Materials
* SPDX - Software Package Data Exchange
* PURL - Package URL
* Builder images - Images used in FROM instructions in the `Dockerfile`

## Context

The SPDX SBOM format enables features that are not available in CycloneDX, such as multiple purl attributes per component. SPDX is also a widely adopted standard for software bills of materials.
This ADR describes how to enable the use of the SPDX SBOM format in Konflux.

## Decision

### SBOM lifecycle in build pipeline

At the start of the build pipeline, SBOMs are generated by [cachi2](#references) and [syft](#references). These two SBOM files are merged into a single SBOM document. In a later phase of the pipeline, the builder images of the image currently being built are added to the SBOM as build dependencies of the image. To switch to the SPDX format, all tools that produce or process SBOMs in the pipeline have to be able to work with SPDX. The SBOMs of builder images are not processed by the pipeline, so they don't have to be in SPDX format. Consequently, once the tools generating the SBOMs are switched to SPDX, all tools processing SBOMs can expect SPDX input only. No tool needs to handle mixed SPDX and CycloneDX inputs.
As a result, all tools and tasks should implement the `sbomType` attribute to specify the expected SBOM format for input and output. This will allow tools to be tested with SPDX before the entire pipeline transitions to this format.
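
A minimal sketch of how a tool might honor `sbomType` is shown here. The accepted values `"spdx"` and `"cyclonedx"` and both function names are assumptions made for illustration, since the ADR does not define them. The detection relies on the top-level `bomFormat` field of CycloneDX JSON and the `spdxVersion` field of SPDX JSON.

```python
def detect_sbom_type(document: dict) -> str:
    """Guess the SBOM format from its top-level marker field."""
    if document.get("bomFormat") == "CycloneDX":
        return "cyclonedx"
    if str(document.get("spdxVersion", "")).startswith("SPDX-"):
        return "spdx"
    raise ValueError("unrecognized SBOM format")


def check_sbom_type(document: dict, sbom_type: str) -> None:
    """Fail early when the input does not match the configured sbomType."""
    actual = detect_sbom_type(document)
    if actual != sbom_type:
        raise ValueError(f"expected {sbom_type} SBOM, got {actual}")


# Passes silently: the document is SPDX and the tool was configured for SPDX.
check_sbom_type({"spdxVersion": "SPDX-2.3", "packages": []}, "spdx")
```

Failing early on a mismatch keeps each tool single-format, which matches the decision that no tool needs to handle mixed inputs.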

### CycloneDX -> SPDX conversion

CycloneDX (1.4) is a structured document in JSON format with the following structure (not the full specification):

- Document Root
  - Metadata
    - Tools
      - `List<Tool>`
        - vendor
        - name
  - Components
    - `List<Component>`
      - name
      - version
      - purl
      - properties
        - `List<Property>`
          - name
          - value
  - formulations
    - `List<Component>`

SPDX (2.3) is a structured document in JSON format with the following structure (not the full specification):

- Document Root
  - name
  - SPDXID
  - creationInfo
    - creators
      - `List<String>`
  - packages
    - `List<Package>`
      - SPDXID
      - name
      - versionInfo
      - externalRefs
        - `List<ExternalRef>`
          - referenceCategory
          - referenceType
          - referenceLocator
      - annotations
        - `List<Annotation>`
          - annotationDate
          - annotationType
          - annotator
          - comment
  - relationships
    - `List<Relationship>`
      - spdxElementId
      - relationshipType
      - relatedSpdxElement

#### 1:1 conversions
The following CycloneDX attributes are converted 1:1 to SPDX attributes because they represent the same thing.

| CycloneDX Attribute | SPDX Attribute |
|----------------------------|---------------------|
| components | packages |
| component.name | package.name |
| component.version | package.versionInfo |


#### Component.purl
CycloneDX (version 1.4) supports only a single purl attribute per component. SPDX has no direct equivalent; instead, every package includes an `externalRefs` array that describes all external references for the package. Reference categories and types are defined by the specification. For a PURL, the category `PACKAGE-MANAGER` and the type `purl` are used. The purl itself is stored as `referenceLocator`.

| CycloneDX Attribute | SPDX Attribute |
|---------------------|----------------|
| component.purl = `<PURL>` | package.externalRefs = `[{"referenceCategory": "PACKAGE-MANAGER", "referenceType": "purl", "referenceLocator": "<PURL>"}]` |

Comment on lines +96 to +101

Contributor

Could you fix the formatting in this table and the tables below? It's not readable in the rendered doc https://github.com/midnightercz/konflux-ci-architecture/blob/spdx-support/ADR/0044-spdx-support.md#componentpurl

Contributor Author

I believe this was fixed. All tables look correct to me

Contributor

The loss of indentation hurts readability, especially for this one

How about changing them from Markdown tables to monospace blocks? Or just wrapping the tables in ``` so that they look the same as in the source code
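
The 1:1 mappings and the purl mapping described above can be sketched roughly as follows. The function name `component_to_package` and the caller-supplied `spdx_id` are illustrative assumptions, and the sketch leaves out other fields a complete SPDX package may need.

```python
def component_to_package(component: dict, spdx_id: str) -> dict:
    """Convert one CycloneDX component into an SPDX package (illustrative sketch)."""
    package = {
        "SPDXID": spdx_id,            # assumed to be generated by the caller
        "name": component["name"],    # component.name -> package.name
    }
    if "version" in component:
        # component.version -> package.versionInfo
        package["versionInfo"] = component["version"]
    if "purl" in component:
        # component.purl -> one entry in package.externalRefs
        package["externalRefs"] = [
            {
                "referenceCategory": "PACKAGE-MANAGER",
                "referenceType": "purl",
                "referenceLocator": component["purl"],
            }
        ]
    return package


example = component_to_package(
    {"name": "requests", "version": "2.31.0", "purl": "pkg:pypi/requests@2.31.0"},
    "SPDXRef-Package-requests",
)
```

Because `externalRefs` is a list, further purls for the same package can be appended later without changing the package structure.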



#### Component.properties
CycloneDX component properties are a list of string-to-string name/value pairs for a given component. SPDX packages have nothing similar. Package annotations are the only attribute where custom data can be stored, and the only customizable field of an annotation is `comment`, which is a simple string. For that reason, each CycloneDX property of the form `{"name": <string>, "value": <string>}` is encoded into a JSON string and stored as the annotation comment. Annotations can also be produced by other tools. Therefore, to mark an annotation comment as JSON-encoded, its annotator should end with the string `:jsonencoded`.

| CycloneDX Attribute | SPDX Attribute |
|---------------------|----------------|
| components.properties = `[{"name": …, "value": …}]` | package.annotations = `[{..., "annotator": "<tool>:jsonencoded"}]` |
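
The encoding of properties into annotations could look roughly like the sketch below. The function names, the `Tool: ` annotator prefix, the `OTHER` annotation type, and the timestamp format are assumptions on top of what the ADR states; only the JSON-encoded comment and the `:jsonencoded` annotator suffix come from the text above.

```python
import json
from datetime import datetime, timezone


def properties_to_annotations(properties: list, tool_name: str) -> list:
    """Encode CycloneDX properties as SPDX package annotations (sketch)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return [
        {
            "annotationDate": now,
            "annotationType": "OTHER",                      # assumed type
            "annotator": f"Tool: {tool_name}:jsonencoded",  # suffix marks JSON comments
            "comment": json.dumps({"name": p["name"], "value": p["value"]}),
        }
        for p in properties
    ]


def annotations_to_properties(annotations: list) -> list:
    """Decode only annotations whose annotator ends with ':jsonencoded'."""
    return [
        json.loads(a["comment"])
        for a in annotations
        if a.get("annotator", "").endswith(":jsonencoded")
    ]
```

Filtering on the annotator suffix lets a reader skip annotations written by other tools, whose comments may not be valid JSON.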


#### Formulations
CycloneDX formulations describe how the container was manufactured. In SPDX, relationship elements can be used for the same purpose. Every element in SPDX has an `SPDXID` attribute, an identifier that is unique within the whole SBOM document. A relationship element describes the relation between two elements using their SPDXIDs and a relationship type. The relationship type `BUILD_TOOL_OF` can be used to express that a package was used to build the container.

| CycloneDX Attribute | SPDX Attribute |
|---------------------|----------------|
| formulations.components = `[{}]` | relationships = `[{"spdxElementId": "<ROOT-DOCUMENT-ID>", "relationshipType": "DESCRIBES", "relatedSpdxElement": "<CONTAINER-IMAGE-ID>"}, {"spdxElementId": "<A-BUILDER-IMAGE-ID>", "relationshipType": "BUILD_TOOL_OF", "relatedSpdxElement": "<CONTAINER-IMAGE-ID>"}]` |

*Explanation: The root document `DESCRIBES` the `CONTAINER-IMAGE-ID` element, which represents the container itself. `A-BUILDER-IMAGE-ID` represents a builder image that was used to build the container. The relationship type `BUILD_TOOL_OF` expresses that the builder image was used to build the container image.*
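
As an illustration, the relationship list for a container image and its builder images might be assembled as in this sketch. The function name and the example SPDXIDs are hypothetical.

```python
def builder_relationships(container_id: str, builder_ids: list,
                          document_id: str = "SPDXRef-DOCUMENT") -> list:
    """Build SPDX relationships for a container image and its builder images."""
    relationships = [
        {
            # The root document describes the container image itself.
            "spdxElementId": document_id,
            "relationshipType": "DESCRIBES",
            "relatedSpdxElement": container_id,
        }
    ]
    for builder_id in builder_ids:
        relationships.append(
            {
                # Each builder image is a build tool of the container image.
                "spdxElementId": builder_id,
                "relationshipType": "BUILD_TOOL_OF",
                "relatedSpdxElement": container_id,
            }
        )
    return relationships


rels = builder_relationships("SPDXRef-image", ["SPDXRef-builder-0", "SPDXRef-builder-1"])
```

Note the direction of `BUILD_TOOL_OF`: the builder image is the `spdxElementId` and the built container image is the `relatedSpdxElement`.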

#### Metadata.tools
The CycloneDX `metadata.tools` sub-attributes of most interest are the `vendor` and `name` elements. Information about the creation of the SPDX document can be stored in `creationInfo`. The `creationInfo.creators` element is a list of strings. The standard only loosely specifies ([here](https://spdx.github.io/spdx-spec/v2.3/document-creation-information/#68-creator-field)) how these strings should be structured. Strings should be formatted as `<Attribute>: <Value>`. For example, a vendor should be stored as `Vendor: <vendor>`.

| CycloneDX Attribute | SPDX Attribute |
|---------------------|----------------|
| metadata.tools = `[{"vendor": "X", "name": "Y"}]` | creationInfo.creators = `["Vendor: X", "Tool: Y"]` |
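
A small sketch of this mapping, using a hypothetical helper name, could be:

```python
def tools_to_creators(tools: list) -> list:
    """Flatten CycloneDX metadata.tools into SPDX creationInfo.creators strings."""
    creators = []
    for tool in tools:
        if tool.get("vendor"):
            creators.append(f"Vendor: {tool['vendor']}")
        if tool.get("name"):
            creators.append(f"Tool: {tool['name']}")
    return creators


creators = tools_to_creators([{"vendor": "X", "name": "Y"}])
```

Since `creators` is a flat list of strings, the pairing between a vendor and its tool is only implied by their order.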

#### Merging SPDX
##### Packages
Packages of two SPDX documents can be merged by concatenating the two package lists. In CycloneDX, a component can have only a single purl attribute, so packages with the same name and version but different purls have to be stored as multiple component elements. SPDX package elements can carry multiple purls. Multiple CycloneDX components can therefore be squashed into a single SPDX package element, with their purls concatenated into a single list. The following rule is applied when merging packages:
- Packages whose purls have the same type, package name, and version are squashed into a single package element

Contributor

@chmeliik Nov 15, 2024

Similar concern to #213 (comment) - this shouldn't be the authoritative decision on how to merge packages. Not all tools should do it this way, cachi2 certainly shouldn't

Contributor

@chmeliik Nov 15, 2024

Perhaps we can just re-word this to make it clear that this is a sort of default suggestion for how to do the merging if you don't have more specific requirements

Contributor Author

yeah, good point


*NOTE: Packages cannot be merged based on the SPDXID attribute, because the SPDX standard does not specify how an SPDXID should be calculated. Individual tools can calculate it differently while still satisfying the requirement that it be unique across the whole document.*
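
One possible way to squash packages by purl is sketched below. The simplified purl parsing, the decision to treat the namespace as part of the name, and the function names are all assumptions; a real tool would more likely use a dedicated purl library and might apply different merge rules.

```python
def purl_key(purl: str) -> tuple:
    """Return (type, name, version) of a purl, ignoring qualifiers and subpath."""
    rest = purl[4:] if purl.startswith("pkg:") else purl
    rest = rest.split("#", 1)[0].split("?", 1)[0]  # drop subpath, then qualifiers
    purl_type, _, path = rest.partition("/")
    version = ""
    if "@" in path:
        path, version = path.rsplit("@", 1)
    return purl_type.lower(), path, version        # path includes any namespace


def purls_of(package: dict) -> list:
    """Collect all purl locators from a package's externalRefs."""
    return [
        ref["referenceLocator"]
        for ref in package.get("externalRefs", [])
        if ref.get("referenceType") == "purl"
    ]


def merge_packages(main_packages: list, other_packages: list) -> list:
    """Concatenate two package lists, squashing packages that share a purl key."""
    merged, by_key = [], {}
    for package in main_packages + other_packages:
        keys = [purl_key(p) for p in purls_of(package)]
        target = next((by_key[k] for k in keys if k in by_key), None)
        if target is None:
            # First package seen for this key: keep a copy of it.
            target = dict(package, externalRefs=list(package.get("externalRefs", [])))
            merged.append(target)
        else:
            # Same type/name/version already present: only add new references.
            for ref in package.get("externalRefs", []):
                if ref not in target["externalRefs"]:
                    target["externalRefs"].append(ref)
        for key in keys:
            by_key.setdefault(key, target)
    return merged
```

The squashed package keeps the SPDXID of the first package seen, so packages from the main document take precedence when the main list is passed first.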

##### Relationships
SPDX relationships represent a graph/tree structure of relations between elements in the document. The root element is the SPDX document itself (with SPDXID `SPDXRef-DOCUMENT`).
The SPDX root document typically contains a package representing the source used to generate the SBOM. This can be a container image, a directory, etc. The root document is in a `DESCRIBES` relationship with this source package. Other packages are in specific relationships with the source package. See also [syft specific sbom details](#syft-specific-sbom-details).

The relationships of two documents need to be merged into a single graph in a way that keeps the structure of the original graph of the main document (the document into which the other document is merged). Once packages are merged, relationships that refer to packages not included in the merged package list must be removed from the second document's relationships. Any `spdxElementId` or `relatedSpdxElement` that points to the root document ID of the second document should be replaced with the root document ID of the main document. The source package element ID of the second document needs to be replaced with the source package element ID of the main document.
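
The relationship merge described above could be sketched like this. The function signature is hypothetical, and it assumes the caller already knows the root document ID and source package ID of both documents as well as the SPDXIDs that survived the package merge.

```python
def merge_relationships(main_rels: list, other_rels: list,
                        main_doc_id: str, other_doc_id: str,
                        main_source_id: str, other_source_id: str,
                        merged_package_ids: set) -> list:
    """Merge the second document's relationships into the main document's graph."""
    # Re-point the second document's root and source package to the main ones.
    remap = {other_doc_id: main_doc_id, other_source_id: main_source_id}
    known = set(merged_package_ids) | {main_doc_id, main_source_id}

    merged = list(main_rels)
    for rel in other_rels:
        new_rel = dict(
            rel,
            spdxElementId=remap.get(rel["spdxElementId"], rel["spdxElementId"]),
            relatedSpdxElement=remap.get(rel["relatedSpdxElement"], rel["relatedSpdxElement"]),
        )
        # Drop relations that refer to packages missing from the merged list.
        if new_rel["spdxElementId"] not in known or new_rel["relatedSpdxElement"] not in known:
            continue
        if new_rel not in merged:  # avoid duplicating e.g. the DESCRIBES relation
            merged.append(new_rel)
    return merged
```

Relationships of the main document are copied unchanged, which is what preserves the shape of its original graph.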

## Syft specific sbom details
### Virtual packages
Syft generates a "source package" representing the source used to generate the SBOM document. For example, when the SBOM is generated by the command `syft scan dir:<dir>`, a package with SPDXID `SPDXRef-DocumentRoot-Directory-<dir>-` is generated. This package has its name set to `<dir>`, no versionInfo, and no other attributes. In the relationships, `SPDXRef-DOCUMENT` `DESCRIBES` `SPDXRef-DocumentRoot-Directory-<dir>-`, and all other packages are in a `CONTAINS` relation with this virtual package, e.g. `openshift4----ose-cluster-update-keys` `<RELATIONSHIP-TYPE>` `Package-A`.
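
Given that layout, the source package of a document can be located through the `DESCRIBES` relationship rather than by guessing its SPDXID. The helper below is a hypothetical sketch, not part of syft.

```python
from typing import Optional


def find_source_package_id(document: dict) -> Optional[str]:
    """Return the SPDXID of the package that the root document DESCRIBES."""
    document_id = document.get("SPDXID", "SPDXRef-DOCUMENT")
    for rel in document.get("relationships", []):
        if (rel.get("spdxElementId") == document_id
                and rel.get("relationshipType") == "DESCRIBES"):
            return rel["relatedSpdxElement"]
    return None  # no source package recorded in this document
```

Looking the ID up this way avoids depending on the tool-specific SPDXID naming scheme.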

## Consequences
All tooling used in the pipeline needs to support the SPDX SBOM format.

## References
* [CycloneDX specification](https://cyclonedx.org/specification/overview/)
* [SPDX specification](https://spdx.github.io/spdx-spec/v2.3/)
* [SPDX json schema](https://github.com/spdx/spdx-spec/blob/development/v2.3/schemas/spdx-schema.json)
* [cachi2](https://github.com/containerbuildsystem/cachi2/)
* [syft](https://github.com/anchore/syft)