Schema validation fails with git style URL #890

Open
Recurse-blip opened this issue Jun 28, 2024 · 6 comments
Labels: enhancement (New feature or request), ready for development (Issue is sufficiently defined and suitable for contributors to start working)

Comments

@Recurse-blip

It seems that the CycloneDX tool emits an invalid URL when generating the SBOM, which then fails schema validation when the BOM is uploaded to Dependency-Track.

This is the error I get:

{
    "status": 400,
    "title": "The uploaded BOM is invalid",
    "detail": "Schema validation failed",
    "errors": [
        "cvc-datatype-valid.1.2.1: 'git@github.com:LordVeovis/xmlrpc.git' is not a valid value for 'anyURI'.",
        "cvc-type.3.1.3: The value 'git@github.com:LordVeovis/xmlrpc.git' of element 'url' is not valid."
    ]
}

I think CycloneDX should convert those git-style references to something like git+ssh://... or git+http://....git, which are valid URLs.

Related issues:
DependencyTrack/dependency-track#3885
CycloneDX/cyclonedx-node-npm#1198

@github-actions github-actions bot added the triage Don't know what to do with this yet label Jun 28, 2024
@Recurse-blip Recurse-blip changed the title Schema validation fails with external git references Schema validation fails with git style URL Jun 28, 2024
@mtsfoni
Contributor

mtsfoni commented Jun 28, 2024

I would suspect those values are not generated by the tool but read from a source.

Where exactly are those invalid URIs in your SBOM? Can you provide me with steps to reproduce?

@mtsfoni mtsfoni added the question Further information is requested label Jun 28, 2024
@Recurse-blip
Author

@mtsfoni I will provide a test project to reproduce

@Recurse-blip
Author

> I would suspect those values are not generated by the tool but read from a source.
>
> Where exactly are those invalid URIs in your SBOM? Can you provide me with steps to reproduce?

See the bom.xml generated here:

https://github.com/Recurse-blip/cyclonedx_giturl/actions/runs/9714725731/job/26814458493

You will see that there is a URL with this content:

git@github.com:LordVeovis/xmlrpc.git

It should be converted to a valid URL, such as git+http://github.com/LordVeovis/xmlrpc.git

@mtsfoni
Contributor

mtsfoni commented Jun 28, 2024

The source of the problem is obviously here: https://github.com/LordVeovis/xmlrpc/blob/2f6fc86d85d0eab0f26a73ba9e2a1d0cc9be26f7/Kveer.XmlRPC/Kveer.XmlRPC.csproj#L19

Even if rubbish comes in, this tool should still generate a valid CycloneDX file.

I think we should check whether URLs are valid when we fill them in. If one isn't, we could probably just delete it (the easy solution). Alternatively, somebody could build a system that reliably replaces those, but that adds more complexity.
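For illustration, the "check, and drop if invalid" option could look roughly like the Python sketch below. It is not the tool's code (the tool is .NET), and the function names `is_plausible_uri_reference` and `url_or_none` are hypothetical. The check only approximates what a schema validator does for 'anyURI': it applies the RFC 3986 rule that a colon in the first path segment is only allowed after a well-formed scheme.

```python
import re
from typing import Optional

# RFC 3986 scheme: a letter followed by letters, digits, "+", "-" or "."
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*$")


def is_plausible_uri_reference(value: str) -> bool:
    """Rough approximation only, not a full anyURI validator."""
    if not value or any(ch.isspace() for ch in value):
        return False
    # Inspect the part before the first "/", "?" or "#".
    head = re.split(r"[/?#]", value, maxsplit=1)[0]
    if ":" not in head:
        # Relative reference with no colon in its first segment.
        return True
    # A colon in the first segment means the text before it must be a scheme.
    return bool(_SCHEME.match(head.split(":", 1)[0]))


def url_or_none(value: str) -> Optional[str]:
    """Return the value if it looks valid, otherwise None so the caller can omit it."""
    return value if is_plausible_uri_reference(value) else None
```

With this sketch, an SCP-style value such as `git@github.com:LordVeovis/xmlrpc.git` is dropped, while `https://` and `git+ssh://` URLs pass through unchanged.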

@mtsfoni mtsfoni added enhancement New feature or request ready for development Issue is sufficiently defined and suitable for contributors to start working and removed question Further information is requested triage Don't know what to do with this yet labels Jun 28, 2024
@jkowalleck
Member

jkowalleck commented Jul 4, 2024

> I think we should check whether URLs are valid when we fill them in. If one isn't, we could probably just delete it

We are doing the same for XML in the PHP library: https://github.com/CycloneDX/cyclonedx-php-library/blob/fab6f93979fc43cb64d0d15d086a565e2b7072d2/src/Core/_helpers/XML.php#L62-L76

Anyway, for fields where you know the value could be a git-ssh address, such as an externalReference of type VCS, you should not throw the data away but transform it accordingly.
A string in the format <user>@<host>:<path> can be converted to git+ssh://<user>@<host>/<path> according to all specs.
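The transformation described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's code: the function name `scp_to_git_ssh` and the regular expression are assumptions, and edge cases (for example, an absolute path after the colon, or a numeric first path segment that looks like a port) are handled only naively.

```python
import re

# Matches <user>@<host>:<path>, where user and host contain no "/", ":" or "@".
_SCP_STYLE = re.compile(
    r"^(?P<user>[^@/:\s]+)@(?P<host>[^@/:\s]+):(?P<path>\S+)$"
)


def scp_to_git_ssh(value: str) -> str:
    """Rewrite an SCP-style git address as a git+ssh:// URL; leave other values alone."""
    match = _SCP_STYLE.match(value)
    if match is None:
        return value
    # Naive choice: drop leading slashes so the result has exactly one "/"
    # between host and path.
    path = match.group("path").lstrip("/")
    return f"git+ssh://{match.group('user')}@{match.group('host')}/{path}"
```

For the value from this issue, `scp_to_git_ssh("git@github.com:LordVeovis/xmlrpc.git")` yields `git+ssh://git@github.com/LordVeovis/xmlrpc.git`; values that already carry a scheme are returned unchanged.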

@Alex-Stevens

Hi,

I gave writing an implementation for this a go.

I started by looking at what was done for the node generator and what the Git docs said about the SCP-style format.

The node implementation's regex approach looked like it might not handle some cases but, in all fairness, I didn't try it out, so it might be fine. I also checked what sbom-utility does to validate URLs: it ultimately just passes the value to Go's net/url library to see whether it can be parsed. So I thought I'd try the same approach with .NET's Uri class.

This works reasonably well, but once the SCP-URL is parsed, the question is what to transform it into?

I've not seen the "git+ssh" or "git+http" schemes before, but I'm guessing they are an attempt to namespace the scheme. Regardless, whilst transforming the scheme this way makes the URI, and thus the SBOM, valid, using that value to clone the repo does not work, which makes programmatic use of the value harder. Transforming it to plain https worked well with the problematic GitHub examples I had; I can't speak for examples from elsewhere.
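As a rough illustration of the plain-https alternative, here is a Python sketch (not the .NET code I experimented with). The name `scp_to_https` is hypothetical, and the sketch assumes the host serves the same repository path over HTTPS, which holds for GitHub but is not guaranteed elsewhere.

```python
import re

# Matches <user>@<host>:<path>; the user part is discarded for https.
_SCP_STYLE = re.compile(r"^[^@/:\s]+@(?P<host>[^@/:\s]+):(?P<path>\S+)$")


def scp_to_https(value: str) -> str:
    """Rewrite an SCP-style git address as an https:// URL; leave other values alone."""
    match = _SCP_STYLE.match(value)
    if match is None:
        return value
    path = match.group("path").lstrip("/")
    return f"https://{match.group('host')}/{path}"
```

Unlike the "git+ssh" form, the result can be handed to `git clone` directly, at the cost of losing the information that the original reference used SSH.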

To sum up, I think the kernel of the issue is deciding which behaviour makes the most sense. IMO, we should drop the value / field rather than spend lots of effort trying to clean up the URL. I also think a degree of consistency between the different "official" implementations would be helpful.

Thanks.


4 participants