Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

yarn SBOM generation plugin #13

Merged
merged 35 commits into from
Mar 3, 2024
Merged

Conversation

AugustusKling
Copy link
Contributor

This contains a working draft of a Yarn plugin to generate SBOMs. Please see README.md for usage instructions.

Note that I'm not too familiar with the CycloneDX file format and things might be off even though validators pass on the generated files.

Open things:

  • What are sensible defaults for the parameters?
  • Decision on license. Am I correct to assume the contained components can be used as part of a Apache-2.0 licensed project? Are you aware of any tool that takes a list of SPDX identifiers for dependencies (or SBOM) and answers which licenses could be chosen?
  • Treatment of the root component in the SBOM. Should it be put in the metadata only or also within the components property?
  • Many files in the root folder are basically empty templates.
  • Do you have sample projects with known expected output regarding listed components and licenses?
  • Fallback for component licenses.
  • How would one calculate the hash for https://cyclonedx.org/docs/1.5/json/#components_items_hashes for a folder representing a component?
  • Which are the important/desired properties in the generated SBOM files besides those marked as required in the JSON schema?
  • Is the reproducible results flag mentioned in [PLANNING] VISION #8 important? Shouldn't this be a task for a SBOM consumer by masking out properties?
  • I've only used this plugin with Yarn 4.1.0. It may or may not work with other Yarn versions.

Please play around with the draft plugin and give feedback.

@AugustusKling AugustusKling requested a review from a team as a code owner February 20, 2024 22:17
@@ -0,0 +1,75 @@
/**
* This script is intended to be used with this repository only.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not part of the compiled Yarn plugin bundle. It only serves to generate bundles/components-licenses.md

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note for self: revisit before first beta-release happens

sources/sbom.ts Outdated Show resolved Hide resolved
sources/sbom.ts Outdated
)
);
}
if (manifest.license && !manifest.license.includes("SEE LICENSE")) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would one refer to a license that is only found in file like SEE LICENSE whatever.txt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add no special handing for this.
if the license text is "SEE LICENSE whatever.txt" then I'd also expect the pkgInfo.licenseFileContent to point to that file - which would cause your implementation to include the text.

packageRoot: PortablePath,
packageFs: FakeFS<PortablePath>
): string {
const files = packageFs.readdirSync(packageRoot).filter((f) => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling packageFs.readdirSync(...) feels wrong but appears to work. I couldn't find API documentation for Yarn to clarify how their API should be used.

@jkowalleck jkowalleck marked this pull request as draft February 20, 2024 22:39
@jkowalleck jkowalleck changed the title Draft for yarn SBOM generation plugin. yarn SBOM generation plugin Feb 20, 2024
@jkowalleck
Copy link
Member

thank you for the PR. I see some open questions from your side.

we could discuss those in here: #14
I've transferred your current questions over there, and started answering some, wil answer the others soon.
Feel free to add more questions, or respond to my answers.

Copy link
Member

@jkowalleck jkowalleck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did a quick(incomplete) review, but most importantly:
some answers to your questions.

👍 please continue your work, it is very much appreciated.

sources/index.ts Outdated Show resolved Hide resolved
sources/index.ts Show resolved Hide resolved
sources/sbom.ts Outdated Show resolved Hide resolved
.gitignore Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
(license instanceof CDX.Models.NamedLicense ||
license instanceof CDX.Models.SpdxLicense)
) {
license.text = new CDX.Models.Attachment(pkgInfo.licenseFileContent);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove the whole license gathering process, for now. Tis feature should not be part of a minimum viable product.
It might be introduced in a feature development in the future.

Gathering license information is the area where lawyers act strange.
Better have an incomplete SBOM than a wrong SBOM.

gathering license texts is a feature that should be enabled by a CLI switch. Per default, no license texts should be gathered, as this causes SBOM bloat.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to have license info available in the SBOMs as I intend to use it to verify licenses of transitve dependencies (in a bunch of non-public projects).

The other use is the generation of ./bundles/components-licenses.md which is required per the dependecies of this Yarn plugin as long as a bundle is provided.

Adding the license into is no longer a default thing, though. In the meantime you need to enable it with --licenses if desired.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no worries, the feature is not cut entirely, but i can see that you are missing a lot of the aspects to to this feature. i dont want to go into details, as this shall be discussed in a feature request in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this being said, please remove the license gathering so we can have a first MVP soon,
instead of fiddling around with license gathereing.

sources/sbom.ts Outdated Show resolved Hide resolved
sources/sbom.ts Outdated Show resolved Hide resolved
| typeof CDX.Spec.Spec1dot3
| typeof CDX.Spec.Spec1dot4
| typeof CDX.Spec.Spec1dot5;
function spec(specVersion: CDX.Spec.Version): SupportedSpec {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?
Could use CDX.Spec.SpecVersionDict instead? see https://github.com/CycloneDX/cyclonedx-javascript-library/blob/main/src/spec/consts.ts#L308

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CDX.Spec.SpecVersionDict has its key type defined as CDX.Spec.Version which has more values than contained in the dictionary. One could extract the spec values like the following but to me listing the supported values is easier to read and gives a more precise type. Either way would work, though.

type ValueOf<T> = T[keyof T];
type SupportedSpec = ValueOf<typeof CDX.Spec.SpecVersionDict>;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CDX.Spec.SpecVersionDict has its key type defined as CDX.Spec.Version which has more values than contained in the dictionary.

This would be unexpected. Which values does the enum have, that the dict does not have? Would you pull-request a fix for the library, so that this is no longer an issue for the usage here?

Copy link
Contributor Author

@AugustusKling AugustusKling Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

export const SpecVersionDict: Readonly<Partial<Record<Version, Readonly<_SpecProtocol>>>> = Object.freeze({
  [Version.v1dot5]: Spec1dot5,
  [Version.v1dot4]: Spec1dot4,
  [Version.v1dot3]: Spec1dot3,
  [Version.v1dot2]: Spec1dot2
  // <= v1.1 is not implemented
})

And the Version enum

export enum Version {
  v1dot5 = '1.5',
  v1dot4 = '1.4',
  v1dot3 = '1.3',
  v1dot2 = '1.2',
  v1dot1 = '1.1',
  v1dot0 = '1.0',
}

The versions 1.0 and 1.1 are not present in the dictionary nor are they available as protocol instances in CDX.Spec. But as the dictionary has its key type defined as Version the non-availability of versions 1.0 and 1.1 is lost.

Copy link
Member

@jkowalleck jkowalleck Feb 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is intentional.

// !!! partial
export const SpecVersionDict_1: Readonly<Partial<Record<Version, Readonly<_SpecProtocol>>>> = Object.freeze({
  [Version.v1dot5]: Spec1dot5,
  [Version.v1dot4]: Spec1dot4,
  [Version.v1dot3]: Spec1dot3,
  [Version.v1dot2]: Spec1dot2
  // <= v1.1 is not implemented
})

// !! no longer partial
export const SpecVersionDict_2: Readonly<Record<Version, Readonly<_SpecProtocol>>> = Object.freeze({
  [Version.v1dot5]: Spec1dot5,
  [Version.v1dot4]: Spec1dot4,
  [Version.v1dot3]: Spec1dot3,
  [Version.v1dot2]: Spec1dot2,
  // <= v1.1 is not implemented
  [Version.v1dot1]: undefined,
  [Version.v1dot0]: undefined
})

s1 = SpecVersionDict_1[Version.v1dot1]
s2 = SpecVersionDict_2[Version.v1dot1]

I'd expect s1 == undefined and s2 == undefined.
Is this not the case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting the undefined value for the 2 mentioned cases happens as expected. I've now used a type based on CDX.Spec.SpecVersionDict because it does not matter for my current use which specifications are supported.

sources/sbom.ts Outdated Show resolved Hide resolved
@jkowalleck
Copy link
Member

jkowalleck commented Feb 21, 2024

@AugustusKling in order to use your work, i need you to sign the DCO.
implications: https://developercertificate.org/

see a how to here: https://github.com/CycloneDX/cyclonedx-node-yarn/pull/13/checks?check_run_id=21791053467

@AugustusKling
Copy link
Contributor Author

@jkowalleck thank you very much for your feedback so far. It's much appreciated to get your prespective on this.

package.json Outdated Show resolved Hide resolved
.vscode/extensions.json Outdated Show resolved Hide resolved
.yarn/sdks/integrations.yml Outdated Show resolved Hide resolved
@@ -0,0 +1,75 @@
/**
* This script is intended to be used with this repository only.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note for self: revisit before first beta-release happens

package.json Outdated Show resolved Hide resolved
package.json Outdated Show resolved Hide resolved
package.json Outdated
"xmlbuilder2": "^3.1.1"
},
"devDependencies": {
"@types/node": "^20.0.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the types should match the used version of typescript.
better go with a tynamic pointer:

-     "@types/node": "^20.0.0",
+     "@types/node": "ts5.3", 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have have a link to any documentation that explains the ts-version-tags? I had not encountered them before and would assume only the runtime environment's API (Node.js) would be relevant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is just the recommended way to keep your TS in sync with the latest compatible typeings.

},
"devDependencies": {
"@types/node": "^20.0.0",
"@yarnpkg/builder": "^4.0.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the builder is pretty essential for this matter, so please have it set to a specific version.

later, we will have a tool in place that takes care of possible version bumps, and the tests will assure quality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lockfile yarn.lock has the resolved versions and ensures whover executes the yarn install command gets the exact same result. In fact lockfiles that do actually work was our reason to switch from NPM to Yarn a few years ago and we haven't looked back since.

Update process is to call any of Yarn's commands to update dependencies like yarn up or yarn upgrade-interactive, then verify by executing the tests that everything works.

I'd argue that the version of the builder is not important. If the version picked by the update commands generates a working plugin (meaning tests pass), the picked version is okay.

package.json Outdated Show resolved Hide resolved
outputFile: PortablePath | typeof stdOutOutput;
componentType: CDX.Enums.ComponentType;
/** If component licenses shall be included. */
licenses: boolean;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the license gathering is probably not a output option.
either the interface name is not suitable, or the option shall be moved.

sources/sbom.ts Outdated Show resolved Hide resolved
sources/sbom.ts Outdated Show resolved Hide resolved
sources/sbom.ts Outdated Show resolved Hide resolved
sources/sbom.ts Outdated Show resolved Hide resolved
sources/sbom.ts Outdated Show resolved Hide resolved
@jkowalleck jkowalleck requested review from jkowalleck and a team February 27, 2024 20:04
@jkowalleck
Copy link
Member

your work looks very promising, @AugustusKling .

Before we can move forward, I need you to do the following:
Please sign off your commits, to show that you agree to publish your changes under the current terms and licenses of the project , and to indicate agreement with Developer Certificate of Origin (DCO).

An explanation how to sign off after commits were already pushed can always be found in the pull request's check "DCO" details.
For example here: https://github.com/CycloneDX/cyclonedx-node-yarn/pull/13/checks?check_run_id=22140795320

AugustusKling and others added 18 commits March 1, 2024 17:05
Signed-off-by: Augustus Kling <[email protected]>
Signed-off-by: Augustus Kling <[email protected]>
Signed-off-by: Augustus Kling <[email protected]>
Signed-off-by: Augustus Kling <[email protected]>
Signed-off-by: Augustus Kling <[email protected]>
@AugustusKling
Copy link
Contributor Author

@jkowalleck, I had not done the sign-off yet because I feared the rebase to do so would create a mess with the comments in this pull request. It seems through, that rebasing did not create much trouble and edited the commit messages to put the desired sign-off line.

@jkowalleck
Copy link
Member

jkowalleck commented Mar 2, 2024

@AugustusKling would you be okay if I merged the current states?
It all looks good, and I would like to continue development from the current state.

My next steps would be to set up some QA/CT pipelines, polish or modify some features here and there, revisit the test cases, enhance the documentation, and write additional feature requests (requirements engineering) for further development. The usual maintainer's work and organizational this-and-that.
Until the licensing question for the release publication is finally clarified with OWASP (in progress), there are still some things to work on. (PS: I already have a plan how to do interim from-source releases without any licensing questions)

If you wanted to continue contribution, I could ping you as soon as new updates come along.
Is this something you would be interested in?

@AugustusKling AugustusKling marked this pull request as ready for review March 3, 2024 12:58
@AugustusKling
Copy link
Contributor Author

@AugustusKling would you be okay if I merged the current states? It all looks good, and I would like to continue development from the current state.

@jkowalleck, I wanted to make the initial build simpler before merging which I did earlier today. Feel free to go ahead with the merge, now.

If you wanted to continue contribution, I could ping you as soon as new updates come along. Is this something you would be interested in?

Gladly, please inform me about your plans.

I wanted to take the generated SBOM to verify only desired licenses are present in projects, including its transitive dependencies. It seems the generated SBOM is good enough for this purpose. The packages for which the code fails to detect the license (as in license information is absent from the SBOM) can be handled manually as not so many affected packages remain. Not in the projects I have tested with anyway.

The other thing I have in mind is creating a tool which takes the SBOM files, looks for undesired packages in the dependeny tree, and reports these. Undesired is a flexible term here as the criteria would depend on the user. What I have in mind is scanning for packages with known vulnerabilities, packages with incompatible licenses, packages which are not updated and those where their release date is not in a given time range.

These uses would just consume the SBOM but exist outwith the scope of this Yarn plugin.

@jkowalleck jkowalleck merged commit 05a93c1 into CycloneDX:1.0-dev Mar 3, 2024
3 checks passed
@jkowalleck
Copy link
Member

I wanted to take the generated SBOM to verify only desired licenses are present in projects, including its transitive dependencies. [...]
[...]
The other thing I have in mind is creating a tool which takes the SBOM files, looks for undesired packages in the dependeny tree, and reports these. [...] scanning for packages with known vulnerabilities, packages with incompatible licenses, packages which are not updated and those where their release date is not in a given time range.

For example, CycloneDX' sister project - Dependency Track, can ingest an SBOM and do all the things you mentioned.
You could define several profiles to prevent and detect unwanted packages, licenses, and so on.
And it tells about outdated and vulnerable packages.

@jkowalleck
Copy link
Member

re: #13 (comment)

Gladly, please inform me about your plans.

will do, @AugustusKling 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants