-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(report): Replacing source_location
in github
report when scanning an image
#5999
Changes from 6 commits
458a30b
054833e
3c2d602
b65ef00
077f52f
dffe603
c43fd98
fc53d37
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -105,8 +105,25 @@ func (w Writer) Write(ctx context.Context, report types.Report) error { | |
manifest.Name = string(result.Type) | ||
// show path for language-specific packages only | ||
if result.Class == types.ClassLangPkg { | ||
manifest.File = &File{ | ||
SrcLocation: result.Target, | ||
if report.ArtifactType == ftypes.ArtifactContainerImage { | ||
// `RepoDigests` ~= <registry>/<image_name>@sha256:<image_hash> | ||
// `RepoTag` ~= <registry>/<image_name>:<image_tag> | ||
// By concatenating the hash from `RepoDigests` at the end of `RepoTag` we get all the information | ||
image_reference := strings.Join(report.Metadata.RepoTags, ", ") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we use "RepoDigests" instead, as that included both the tag and digest? That would address the comment I just put on the PR discussion? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to confirm using Each time you scan your image with a new sha it will close all the previously open alerts for that image and only keep the latest version open. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A mix of suggestion at #5999 (comment) and yours would give us the longest form possible storing all information:
I tested it and although it's very long, filtering seems to work too (the Important One big drawback of using DmitriyLewen@8431664 is that since each vulnerable package has its own There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, showing the package location (while having those disadvantages) will help with investigations about what actually needs to be fixed/updated
DmitriyLewen marked this conversation as resolved.
Show resolved
Hide resolved
|
||
image_with_hash := strings.Join(report.Metadata.RepoDigests, ", ") | ||
_, image_hash, found := strings.Cut(image_with_hash, "@") | ||
if found { | ||
image_reference += "@" + image_hash | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why are you doing this rather than just using RepoDigests directly? Mind adding a comment? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
So if only using Good question, I added a comment. |
||
} | ||
// Replacing `source_location` in manifest by the image name, tag and hash | ||
manifest.File = &File{ | ||
SrcLocation: image_reference, | ||
} | ||
|
||
} else { | ||
manifest.File = &File{ | ||
SrcLocation: result.Target, | ||
} | ||
} | ||
} | ||
|
||
|
@@ -123,6 +140,10 @@ func (w Writer) Write(ctx context.Context, report types.Report) error { | |
return xerrors.Errorf("unable to build purl for %s: %w", pkg.Name, err) | ||
} | ||
|
||
if pkg.FilePath != "" { | ||
githubPkg.Metadata = Metadata{"source_location": pkg.FilePath} | ||
} | ||
|
||
resolved[pkg.Name] = githubPkg | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -164,6 +164,73 @@ func TestWriter_Write(t *testing.T) { | |
}, | ||
}, | ||
}, | ||
{ | ||
name: "pypi", | ||
Maxim-Durand marked this conversation as resolved.
Show resolved
Hide resolved
|
||
report: types.Report{ | ||
SchemaVersion: 2, | ||
ArtifactName: "fake_repo.azurecr.io/image_name", | ||
ArtifactType: "container_image", | ||
Metadata: types.Metadata{ | ||
RepoDigests: []string{"fake_repo.azurecr.io/image_name@sha256:a7c92cdcb3d010f6edeb37ddcdbacab14981aa31e7f1140e0097dc1b8e834c49"}, | ||
RepoTags: []string{"fake_repo.azurecr.io/image_name:latest"}, | ||
}, | ||
Results: types.Results{ | ||
{ | ||
Target: "Python", | ||
Class: "lang-pkgs", | ||
Type: "python-pkg", | ||
Packages: []ftypes.Package{ | ||
{ | ||
Name: "jwcrypto", | ||
Version: "0.7", | ||
Licenses: []string{ | ||
"LGPLv3+", | ||
}, | ||
Layer: ftypes.Layer{ | ||
Digest: "sha256:ddc612ba4e74ea5633a93e19e7c32f61f5f230073b21a070302a61ef5eec5c50", | ||
DiffID: "sha256:12935ef6ce21a266aef8df75d601cebf7e935edd01e9f19fab16ccb78fbb9a5e", | ||
}, | ||
FilePath: "opt/pyenv/versions/3.11.2/lib/python3.11/site-packages/jwcrypto-0.7.dist-info/METADATA", | ||
}, | ||
{ | ||
Name: "matplotlib", | ||
Version: "3.5.3", | ||
Licenses: []string{ | ||
"PSF", | ||
}, | ||
Layer: ftypes.Layer{ | ||
Digest: "sha256:ddc612ba4e74ea5633a93e19e7c32f61f5f230073b21a070302a61ef5eec5c50", | ||
DiffID: "sha256:12935ef6ce21a266aef8df75d601cebf7e935edd01e9f19fab16ccb78fbb9a5e", | ||
}, | ||
FilePath: "opt/pyenv/versions/3.11.2/lib/python3.11/site-packages/matplotlib-3.5.3.dist-info/METADATA", | ||
}, | ||
}, | ||
}, | ||
}, | ||
}, | ||
want: map[string]github.Manifest{ | ||
"Python": { | ||
Name: "python-pkg", | ||
File: &github.File{ | ||
SrcLocation: "fake_repo.azurecr.io/image_name:latest@sha256:a7c92cdcb3d010f6edeb37ddcdbacab14981aa31e7f1140e0097dc1b8e834c49", | ||
}, | ||
Resolved: map[string]github.Package{ | ||
"jwcrypto": { | ||
PackageUrl: "pkg:pypi/[email protected]", | ||
Relationship: "direct", | ||
Scope: "runtime", | ||
Metadata: github.Metadata{"source_location": "opt/pyenv/versions/3.11.2/lib/python3.11/site-packages/jwcrypto-0.7.dist-info/METADATA"}, | ||
}, | ||
"matplotlib": { | ||
PackageUrl: "pkg:pypi/[email protected]", | ||
Relationship: "direct", | ||
Scope: "runtime", | ||
Metadata: github.Metadata{"source_location": "opt/pyenv/versions/3.11.2/lib/python3.11/site-packages/matplotlib-3.5.3.dist-info/METADATA"}, | ||
}, | ||
}, | ||
}, | ||
}, | ||
}, | ||
} | ||
|
||
for _, tt := range tests { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We aggregate some packages to 1 result -
trivy/pkg/fanal/applier/docker.go
Lines 263 to 291 in fb36c4e
But we can find path from Results.Packages.FilePath:
Maybe we just use FilePath for these packages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I agree with you it's an interesting change as it would make the manifest more precise.
Nonetheless, I think the full
FilePath
would be too much to render in Github Dependency and would bloat the UI.I can do a test with that change if you want and show the results here ?
Otherwise we could try to extract important values from the
FilePath
and adding them toresult.Target
, but that seems complex and prone to errors.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For other languages we already use filePath (filePath is stored in
target
):And we didn't get issues about UI.
Yes please. If you have time, check it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I have a few things to report:
As you pointed out trivy aggregates multiple packages in the same "App" (for instance
Python
orNode
) into one result.The issue is that in the sbom format, the
source_location
is set at themanifest
level not at the package level (resolved
key in sbom format). Meaning, if we keep the default trivy aggregation then we have to use only one of all the packagesFilePath
. How to define that value is either arbitrary, or we need some sort of heuristic ?In terms of UI it isn't as bloated as I expected, so I think you're right and it's ok to use.
The
source_location
used in the above screenshot is simply theFilePath
of the last package inresult.Packages
so it's not even correct.In the case trivy is scanning an image I think it's still pretty useless to replace the
manifest
'ssource_location
with thisFilePath
value and we should instead reference the image's name and tag.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have already seen cases where it is necessary to remove aggregation, but this requires large changes and we are not sure about it.
I created a small function to get filePath for each package - DmitriyLewen@8431664
Can you take a look?
This makes sense for UI. But what worries me is case where user only knows that image contains vulnerable package, but doesn't have filePath to find and update that package. It would be great to add filePath somehow.
e.g. we can use
<image_name>:<tag> - <filePath>
format for manifest map key. Is it possible to seegsbom
file after uploading to GitHub?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I can see in Github Dependency UI no, see screenshot below to see how it looks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So user can only get package metadata when taking snapshot, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they use trivy then there's no issue as the user will now be able to find the file path for each package by parsing the JSON report before sending it to GitHub.
But if they use trivy-action then the report is sent automatically and the user doesn't have access to the original report hence he wouldn't be able to parse it.
Maybe if the user then export it from GitHub, the metadata tag with each package file path will still be there, I'll test that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I got from github export to SBOM for the same package as in #5999 (comment):
As you can see the
metadata
tag doesn't show up in the exported SBOM.Important
Conclusion: users of
trivy-action
would currently have no way of finding out thefilePath
of each package when using thegithub
format.As this issue only involves
trivy-action
and nottrivy
itself anymore, I think we should continue this discussion once this PR is merged to find a solution fortrivy-action
users at aquasecurity/trivy-action#286(I would recommend to add an option for upload as an artifact so that user can easily fetch the report).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created aquasecurity/trivy-action#307 to make sure users of
trivy-action
will know where to find the missing details from Github Dependency.@DmitriyLewen and @knqyf263 do you agree with this compromise ?
If so, can you review this PR code ?