Resolve ambiguous definition of serialNumber #474

Open
mschusterbsi opened this issue Jun 5, 2024 · 5 comments

@mschusterbsi

mschusterbsi commented Jun 5, 2024

Current Behavior

serialNumber is defined as a UUID and RECOMMENDED:

Every BOM generated SHOULD have a unique serial number, even if the contents of a BOM have not changed over time. If specified, the serial number MUST conform to RFC-4122. Use of serial numbers are RECOMMENDED.

version is defined as an integer > 0:

Whenever an existing BOM is modified, either manually or through automated processes, the version of the BOM SHOULD be incremented by 1. […]
This contradicts the definition of serialNumber, unless one interprets these statements to mean that both fields have to change whenever an existing SBOM is regenerated.

Proposed Behavior

In our opinion, UUIDs, and hence the CycloneDX serialNumber, must be static ("unequivocal in time and space" = "temporally and spatially unique") as long as an SBOM creator records the same software component, even if that software component is altered: e.g. new versions, files, or sub-components are added or removed, etc.

Hence, we propose as the definition of serialNumber:
Every BOM creator SHOULD use a unique serial number when describing a specific component, which MUST stay the same if the BOM is re-generated or the contents of this component have changed. If specified, the serial number MUST conform to RFC-4122. Use of serial numbers is RECOMMENDED.
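
To make the difference between the two readings concrete, here is a minimal Python sketch. It assumes a BOM is a plain dictionary with only the fields discussed here; the helper names `new_bom` and `revise_bom` are hypothetical and not part of any CycloneDX tooling.

```python
import uuid


def new_bom(component_name: str) -> dict:
    """Current wording: every generated BOM gets a fresh random serial number."""
    return {
        "bomFormat": "CycloneDX",
        # Serial number written as a UUID URN (random UUID, RFC 4122 version 4).
        "serialNumber": f"urn:uuid:{uuid.uuid4()}",
        "version": 1,
        "metadata": {"component": {"type": "application", "name": component_name}},
    }


def revise_bom(previous: dict) -> dict:
    """Proposed wording: keep the serial number, increment version by 1."""
    revised = dict(previous)  # shallow copy is enough for this illustration
    revised["version"] = previous["version"] + 1
    return revised
```

Under the current wording, two calls to `new_bom("postgres")` yield unrelated serial numbers; under the proposed wording, `revise_bom` keeps the serial number and only the version changes.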

@jkowalleck
Member

related: #363

@jkowalleck
Member

related: #97

@stevespringett
Member

The proposed description has some issues

Every BOM creator SHOULD use a unique serial number when describing a specific component, which MUST stay the same if the BOM is re-generated or the contents of this component have changed. If specified, the serial number MUST conform to RFC-4122. Use of serial numbers is RECOMMENDED.

Specifically with:

which MUST stay the same if the BOM is re-generated or the contents of this component have changed

This would require the BOM creator to maintain a database of all the components (first-party and third-party) and ensure they reuse the same serialNumber. This requirement could not be fulfilled by the majority of existing BOM generators, especially those integrated into CI/CD pipelines.

Is the goal of this change to make the serialNumber deterministic?
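
One way a "deterministic" serialNumber could look, sketched in Python: a name-based UUID (RFC 4122 version 5) derived from an identifier of the primary component, so no database lookup is needed. This is only an illustration of the question, not something the proposal prescribes. The namespace `bom-creator.example` and the identifier string are made-up assumptions.

```python
import uuid

# Hypothetical per-creator namespace; any fixed UUID owned by the creator would do.
CREATOR_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "bom-creator.example")


def deterministic_serial(primary_component_id: str) -> str:
    """Derive the same serial number for the same primary component every time."""
    # uuid5 hashes namespace + name with SHA-1, so equal inputs give equal UUIDs.
    return f"urn:uuid:{uuid.uuid5(CREATOR_NAMESPACE, primary_component_id)}"


# Assumed identifier without a version part, so new releases map to the same serial.
serial = deterministic_serial("pkg:rpm/redhat/postgres")
```

The trade-off is that the result is only as stable as the chosen identifier: if the component is renamed, the serial number changes too.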

@mschusterbsi
Author

We understand that there might be technical hurdles to reusing a unique identifier. However, we are convinced that these can be overcome in most cases.

Our aim is to make sure that an SBOM creator uses the same serial number (or some other unique identifier for a specific SBOM) for the same primary component. The version field would be incremented for each newly created version. This allows the consumer to correlate different versions of an SBOM from the same SBOM creator and detect changes between them.

At the very least, we could imagine the following wording as a weaker alternative to our original suggestion (though we are not really happy with it):
Every BOM creator SHOULD use a unique serial number when describing a specific component, which SHOULD stay the same if the BOM is re-generated or the contents of this component have changed. If specified, the serial number MUST conform to RFC-4122. Use of serial numbers is RECOMMENDED.
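
The consumer-side correlation described above could be sketched in Python roughly as follows. It assumes BOMs are already parsed into dictionaries and compares components by name and version only, which is a simplification; `group_revisions` and `diff_components` are hypothetical helper names.

```python
from collections import defaultdict


def group_revisions(boms: list[dict]) -> dict[str, list[dict]]:
    """Group BOMs by serialNumber and order each group by version."""
    by_serial = defaultdict(list)
    for bom in boms:
        by_serial[bom["serialNumber"]].append(bom)
    return {
        serial: sorted(revisions, key=lambda b: b["version"])
        for serial, revisions in by_serial.items()
    }


def diff_components(old: dict, new: dict) -> tuple[list, list]:
    """Return (added, removed) components between two revisions of one BOM."""
    old_set = {(c["name"], c.get("version", "")) for c in old.get("components", [])}
    new_set = {(c["name"], c.get("version", "")) for c in new.get("components", [])}
    return sorted(new_set - old_set), sorted(old_set - new_set)
```

This grouping only works if the creator keeps the serial number stable; with a fresh UUID per generation, every group would contain exactly one BOM.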

@fvsamson

fvsamson commented Jun 25, 2024

When reading this discussion thread, I think I perceive a few subtle misunderstandings:

  • The proposal states "Every BOM creator SHOULD use a unique serial number when describing a specific component, …", but more precisely that means "… when generating a specific BOM (e.g. for a certain component), …", IMO.

  • Is the goal of this change to make the serialNumber deterministic?

    I do not comprehend "deterministic" in this context; "uniquely identifying a specific BOM which describes a certain primary component (WRT this BOM)" and "is likely generated by the same BOM creator" appear to be the goal here. This is simply what UUIDs (Universally Unique IDentifiers) were invented for: to be able to re-identify the same object, even across non-substantial changes, with version used to denote these changes.

  • This would require the BOM creator to maintain a database of all the components (first-party and third-party) and ensure they reuse the same serialNumber.

    This statement made me consider the wording of the proposal ambiguous (as noted in the first bullet point of this message), because the point is primarily about BOMs, each of which always has a primary component that the BOM principally describes. For example, imagine a BOM for the RPM package postgres (the DBMS) packaged by RedHat: defining serialNumber as proposed here would allow one to unambiguously identify "postgres by RedHat". Would this require RedHat to maintain a database of all the components postgres uses (which they surely have)? IMO no; all this information can be accessed at build time. This scheme only requires maintaining a database of the UUIDs used for each primary component a BOM creator generates BOMs for.

  • This requirement could not be fulfilled by the majority of existing BOM generators, especially those integrated into CI/CD pipelines.

    I do not dare to comment on "the majority of existing BOM generators", because I surely do not know them all, but I see no reason why this scheme would not work with "BOM generators […] integrated into CI/CD pipelines": a CI build recipe is fed with information (usually a git repo and a git tag to check out, plus some ancillary information, which could include a UUID / serialNumber to reuse, or the UUID / serialNumber is simply stored in the git repo of the primary component) and principally outputs the build artifact(s), but may also output additional information, which could include the UUID / serialNumber it used, e.g. as part of a generated SBOM.
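
The "stored in the git repo" variant could be sketched in Python as below. It assumes the serial number lives in a small text file committed next to the source; the file name `bom-serial.txt` and the helper `load_or_create_serial` are hypothetical.

```python
import tempfile
import uuid
from pathlib import Path


def load_or_create_serial(path: Path) -> str:
    """Reuse the UUID stored in `path`, or create and persist a new one."""
    if path.exists():
        # uuid.UUID() raises ValueError if the file content is not a valid UUID.
        stored = uuid.UUID(path.read_text().strip())
        return f"urn:uuid:{stored}"
    fresh = uuid.uuid4()
    path.write_text(f"{fresh}\n")  # commit this file so later CI builds reuse it
    return f"urn:uuid:{fresh}"


if __name__ == "__main__":
    # Demo in a throwaway directory standing in for the checked-out repo.
    with tempfile.TemporaryDirectory() as repo:
        serial_file = Path(repo) / "bom-serial.txt"
        first_build = load_or_create_serial(serial_file)
        second_build = load_or_create_serial(serial_file)
        print(first_build == second_build)
```

A CI job would call this once per build and write the returned value into the serialNumber field of the BOM it generates, while the version is incremented separately.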
