Draft for an SBOM matching system interface #953

Draft
oxisto wants to merge 1 commit into oasis-tcs:master from oxisto:guidance-sbom-matching

@oxisto oxisto commented Apr 22, 2025

This PR contains draft guidance on how an interface for matching SBOMs to CSAF documents could be implemented, in order to support the CSAF SBOM matching system conformance target.

- `SBOM`: A single SBOM
- `SBOMComponent`: A single component in an SBOM

| Name | Properties |

I suggest making these subsections and introducing the triplet with a taxonomy statement.
Or use a definition list. A table whose cells read like mini pages defeats the reader's global ordering assumptions. As it stands, the table looks like pure visual layout, while the naming and description are semantic.

@sthagen sthagen left a comment

LGTM as a start. I would consistently change the style (language use) so that it neither addresses the reader directly nor describes the interface like a specific implementation (i.e., avoid the "Currently we support ..." style).

This document serves as guidance on how to implement an interface for a [CSAF SBOM matching system](https://docs.oasis-open.org/csaf/csaf/v2.0/os/csaf-v2.0-os.html#9117-conformance-clause-17-csaf-sbom-matching-system).

The matching interface defines several data structures. Its intention is to match the information on a CSAF data structure against a defined SBOM format. Different SBOM formats exist and can be mapped to this specification. The following three data structures need to be defined:
- `SBOMDatabase`: A database of SBOMs

Nit: Please separate structurally different elements with an empty line, so that the eyes are sufficient to render the structure for the mind.
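
The three data structures named in the draft (`SBOMDatabase`, `SBOM`, `SBOMComponent`) suggest a simple containment hierarchy. The following is a minimal sketch of that reading, not the interface proposed in the PR: the fields `name`, `components`, and `sboms`, and the helper `all_components`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class SBOMComponent:
    """A single component in an SBOM."""
    name: str  # hypothetical field, not taken from the draft


@dataclass
class SBOM:
    """A single SBOM."""
    components: List[SBOMComponent] = field(default_factory=list)


@dataclass
class SBOMDatabase:
    """A database of SBOMs."""
    sboms: List[SBOM] = field(default_factory=list)

    def all_components(self) -> Iterator[SBOMComponent]:
        # Hypothetical helper: walk every component of every stored SBOM.
        for sbom in self.sboms:
            yield from sbom.components


db = SBOMDatabase([SBOM([SBOMComponent("widget")]), SBOM([SBOMComponent("gadget")])])
names = [component.name for component in db.all_components()]
```

A matching system would iterate over such components and compare each one against the products of a CSAF document.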

@@ -0,0 +1,91 @@
## SBOM Matching Interface

This document serves as guidance on how to implement an interface for a [CSAF SBOM matching system](https://docs.oasis-open.org/csaf/csaf/v2.0/os/csaf-v2.0-os.html#9117-conformance-clause-17-csaf-sbom-matching-system).

Let us be specific here: "the" instead of "an". The term "for" to me is also ambiguous and I assume it is the interface between SBOM data and CSAF data. Question: Isn't "that" already the "matching system"? Sorry, in case I got that wrong.


Any value between `0.0` and `1.0` can be used to indicate a confidence level. However, it is recommended to use the values defined in the table below or a product of them. For example, if a `CaseInsensitiveMatch` is used in combination with a `DifferentSources`, the resulting confidence level would be `0.95 * 0.90 = 0.855`. The following table defines the known confidence levels and their meaning:

| Short Identifier | Confidence Level | Comment |

Maybe here a definition list would be better adapted? The primary key is the confidence level, and the short identifier is a memory helper / alias, no?

Like:

0.00 (`DefinitelyNoMatch`)
:    Indicates that the component is definitely not matched to the document.

0.50 (`PartialStringMatch`)
:    Indicates that a string property (e.g., the vendor name) of the product partially matches the component's string property.

...
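
The combination rule from the draft (multiplying the confidence levels of all identifiers that apply) can be sketched as follows. This is an illustration under assumptions: only the four identifiers quoted in this thread are listed, and the assignment of `0.95` to `CaseInsensitiveMatch` and `0.90` to `DifferentSources` is inferred from the order of the worked example, not read from the draft table. The function name `combine` is hypothetical.

```python
import math

# Confidence levels quoted in this thread. The values for CaseInsensitiveMatch
# and DifferentSources are inferred from "0.95 * 0.90 = 0.855" (assumption).
CONFIDENCE = {
    "DefinitelyNoMatch": 0.00,
    "PartialStringMatch": 0.50,
    "DifferentSources": 0.90,
    "CaseInsensitiveMatch": 0.95,
}


def combine(*identifiers: str) -> float:
    """Multiply the confidence levels of all identifiers that apply."""
    return math.prod(CONFIDENCE[name] for name in identifiers)


combined = combine("CaseInsensitiveMatch", "DifferentSources")
print(round(combined, 3))
```

Because the levels are floating-point numbers, comparisons against an expected value such as `0.855` should use a tolerance (for example `math.isclose`) instead of `==`.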


## Matching Properties

Since a lot of different information is available in both the security advisory and the SBOM document or node, we need a way to structure information to correlate between them. For example, the SBOM node might not directly indicate the version of the product, but it might specify a CPE which includes the version. On the other hand, the CSAF advisory might indicate the version and product name of the product, but not a CPE. In this case we want the matching to be able to correlate the information based on the properties (product name and version) contained in different places (CPE, categorized values). If we just matched based on a CPE, we would not be able to generate a match in this example.

Nit: Maybe distribute at most one sentence per line to have more accessible line lengths and reduce the noise in later diffs? And reword the first sentence:

To meaningfully combine the different information sets provided by security advisories as compared to SBOM documents or SBOM nodes, a mapping is needed that correlates only the relevant aspects.
For example, the SBOM node might not directly indicate the version of the product, but it might specify a CPE which includes the version.
In contrast, the CSAF advisory might indicate the version and product name of the product, but not a CPE.
In this case, the matching should correlate the information based on the properties (product name and version) contained in different places (CPE, categorized values). 
If matching were based on a CPE, no correlation would be possible in this example.
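
The scenario above (an SBOM node that carries only a CPE, and an advisory that carries only categorized values) can be sketched as follows. This is a simplified illustration, not the matching algorithm of the draft: `properties_from_cpe` and `correlate` are hypothetical names, the CPE string and advisory values are made up, and the naive split on `:` assumes a CPE 2.3 formatted string without escaped colons inside a field.

```python
from typing import Dict


def properties_from_cpe(cpe: str) -> Dict[str, str]:
    """Extract vendor, product and version from a CPE 2.3 formatted string.

    Simplification: escaped colons inside a field are not handled.
    """
    parts = cpe.split(":")
    if len(parts) < 6 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    # parts[2] is the "part" field (a, o or h) and is not needed here.
    return {"vendor": parts[3], "product": parts[4], "version": parts[5]}


def correlate(advisory: Dict[str, str], component: Dict[str, str]) -> Dict[str, bool]:
    """Compare, case-insensitively, only the properties both sides provide."""
    shared = advisory.keys() & component.keys()
    return {
        name: advisory[name].lower() == component[name].lower()
        for name in sorted(shared)
    }


# Hypothetical data: the SBOM node only has a CPE, the advisory only has
# categorized values for product name and version.
sbom_node = properties_from_cpe("cpe:2.3:a:example:widget:1.2.3:*:*:*:*:*:*:*")
advisory = {"product": "Widget", "version": "1.2.3"}
result = correlate(advisory, sbom_node)
```

Here `result` reports a match for `product` and `version` even though neither side shares an identifier with the other, which a comparison of whole CPE strings could not do.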

Therefore, we define an (initial) set of properties that can be used to establish a match. In general, it is important to know that a "property" holds a certain set of meta-data:
- `name`: The name of the property

Nit: Again, I suggest a definition list, as we define terms like so:


`name`
:    The name of the property.
 
`value`
:    The value of the property.
This can be a string or a more specialized structure, such as a product version or identifier.

`source`
:    The source of the property.
Currently, extraction of the property from the CSAF/SBOM structure as well as from CPE and PURL identifiers contained in it is supported.
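
The three metadata fields defined above (`name`, `value`, `source`) map naturally onto a small record type. The sketch below is one possible reading and not the types of the draft: the `Source` enum members, the `Version` placeholder, and the two helper functions are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class Source(Enum):
    # Sources named in the text: the CSAF/SBOM structure itself, or a CPE or
    # PURL identifier contained in it. Member names are hypothetical.
    STRUCTURE = "structure"
    CPE = "cpe"
    PURL = "purl"


@dataclass(frozen=True)
class Version:
    """Placeholder for a more specialized value, such as a product version."""
    major: int
    minor: int
    patch: int


@dataclass(frozen=True)
class Property:
    name: str
    value: Union[str, Version]
    source: Source


def same_value(a: Property, b: Property) -> bool:
    """True when both properties have the same name and an equal value."""
    return a.name == b.name and a.value == b.value


def different_sources(a: Property, b: Property) -> bool:
    """True when the two properties were extracted from different sources."""
    return a.source is not b.source


from_branch = Property("version", Version(1, 2, 3), Source.STRUCTURE)
from_cpe = Property("version", Version(1, 2, 3), Source.CPE)
```

Keeping `source` separate from `value` lets a matcher see that two equal values came from different places, which is the situation the `DifferentSources` confidence level appears to describe.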


| Name | Type | Description |
|------|------|-------------|
| `vendor` | `StringProperty` | The value of the property is a string containing the vendor of the vulnerable product. Possible sources include:<br/>- Branches ([`branches_t`](https://docs.oasis-open.org/csaf/csaf/v2.0/os/csaf-v2.0-os.html#312-branches-type)) in the CSAF advisory whose category is of type [`vendor`](https://docs.oasis-open.org/csaf/csaf/v2.0/os/csaf-v2.0-os.html#3122-branches-type---category)<br/>- The `vendor` property of the CPE specified in the [`product_identification_helper`](https://docs.oasis-open.org/csaf/csaf/v2.0/os/csaf-v2.0-os.html#3133-full-product-name-type---product-identification-helper) as well as of any CPE specified in the SBOM node<br/>- Any suitable property in the SBOM document indicating a vendor |

To me the two leftmost columns read like a variable-type declaration, where in code we would not expect so much text hanging off the right side.

I can imagine a definition list or dedicated subsections with a nice introductory statement would be more readable.
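
The first source listed for `vendor` (branches in the CSAF advisory whose category is `vendor`) can be sketched as a recursive walk over the product tree. This is an illustration only: the function name `vendor_names` is hypothetical and the `product_tree` below is a made-up, heavily trimmed fragment, not a complete CSAF document.

```python
from typing import Any, Dict, Iterator, List


def vendor_names(branches: List[Dict[str, Any]]) -> Iterator[str]:
    """Yield the name of every branch whose category is 'vendor', depth-first."""
    for branch in branches:
        if branch.get("category") == "vendor":
            yield branch["name"]
        # Branches can be nested, so descend into child branches as well.
        yield from vendor_names(branch.get("branches", []))


# Hypothetical, trimmed product_tree of a CSAF advisory.
product_tree = {
    "branches": [
        {
            "category": "vendor",
            "name": "Example Company",
            "branches": [
                {
                    "category": "product_name",
                    "name": "Widget",
                    "branches": [
                        {"category": "product_version", "name": "1.2.3"}
                    ],
                }
            ],
        }
    ]
}

vendors = list(vendor_names(product_tree["branches"]))
```

The same walk, filtered on `product_name` or `product_version`, would yield candidate values for other properties.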
