dpkg collector should attempt parsing /var/lib/dpkg/info/*.list files #865

Closed
errordeveloper opened this issue Mar 4, 2022 · 7 comments
Labels
ecosystem:os relating to an OS packaging ecosystem enhancement New feature or request

Comments

@errordeveloper

errordeveloper commented Mar 4, 2022

What would you like to be added:

Currently syft parses only the .md5sums and .conffiles files. There are also .list files, which appear to track more paths, including directories and files that have no checksum entry.

Here is an example:

$ docker run -ti ubuntu:impish cat /var/lib/dpkg/info/base-files.md5sums 
18f746c5c90ff506445f615ae44f5a7b  lib/systemd/system/motd-news.service
79b82d315319fdb55ccf0f2b4c83fe62  lib/systemd/system/motd-news.timer
34a1699ffecaec1086587f266075f95b  usr/bin/locale-check
75176163fdc8090948d909cc7e4ab4bf  usr/lib/os-release
cf277664b1771217d7006acdea006db1  usr/share/base-files/dot.bashrc
d68ce7c7d7d2bb7d48aeb2f137b828e4  usr/share/base-files/dot.profile
6db82730e03aaeeecb8fee76b73d96d4  usr/share/base-files/dot.profile.md5sums
e8217ac03d4101a8afb94cb0d82c4da4  usr/share/base-files/info.dir
9830e3dbb6a828f2cc824db8db0ceaf7  usr/share/base-files/motd
7028a4bd02c1d276aa45f0973b85fe1b  usr/share/base-files/networks
9926b56bc6e576d4ad206dd82d38deff  usr/share/base-files/profile
7146d00681e32d626ac96f70bd38e108  usr/share/base-files/profile.md5sums
f3b332b9a376a0567236f54d7d87f85e  usr/share/base-files/staff-group-for-usr-local
3b83ef96387f14655fc854ddc3c6bd57  usr/share/common-licenses/Apache-2.0
f921793d03cc6d63ec4b15e9be8fd3f8  usr/share/common-licenses/Artistic
3775480a712fc46a69647678acb234cb  usr/share/common-licenses/BSD
65d3616852dbf7b1a6d4b53b00626032  usr/share/common-licenses/CC0-1.0
24ea4c7092233849b4394699333b5c56  usr/share/common-licenses/GFDL-1.2
10b9de612d532fdeeb7fe8fcd1435cc6  usr/share/common-licenses/GFDL-1.3
5b122a36d0f6dc55279a0ebc69f3c60b  usr/share/common-licenses/GPL-1
b234ee4d69f5fce4486a80fdaf4a4263  usr/share/common-licenses/GPL-2
d32239bcb673463ab874e80d47fae504  usr/share/common-licenses/GPL-3
5f30f0716dfdd0d91eb439ebec522ec2  usr/share/common-licenses/LGPL-2
4fbd65380cdd255951079008b364516c  usr/share/common-licenses/LGPL-2.1
e6a600fd5e1d9cbde2d983680233ad02  usr/share/common-licenses/LGPL-3
0c5913925d40b124fb52ce84c5deb3f3  usr/share/common-licenses/MPL-1.1
815ca599c9df247a0c7f619bab123dad  usr/share/common-licenses/MPL-2.0
57e7e94034b0d1220f7b7fd4682c8a94  usr/share/doc/base-files/README
fbd937e067f0a83fb9422713a6b84a8a  usr/share/doc/base-files/README.FHS
f343e1edb724b5a85de2e03974b1fa6f  usr/share/doc/base-files/changelog.gz
c686090b1ff44554e2d6c541280a55b1  usr/share/doc/base-files/copyright
07223424da25c119e376b74f04a1cb2b  usr/share/lintian/overrides/base-files
$ docker run -ti ubuntu:impish cat /var/lib/dpkg/info/base-files.list
/.
/bin
/boot
/dev
/etc
/etc/debian_version
/etc/default
/etc/dpkg
/etc/dpkg/origins
/etc/dpkg/origins/debian
/etc/dpkg/origins/ubuntu
/etc/host.conf
/etc/issue
/etc/issue.net
/etc/legal
/etc/lsb-release
/etc/profile.d
/etc/profile.d/01-locale-fix.sh
/etc/skel
/etc/update-motd.d
/etc/update-motd.d/00-header
/etc/update-motd.d/10-help-text
/etc/update-motd.d/50-motd-news
/home
/lib
/lib/systemd
/lib/systemd/system
/lib/systemd/system/motd-news.service
/lib/systemd/system/motd-news.timer
/proc
/root
/run
/sbin
/sys
/tmp
/usr
/usr/bin
/usr/bin/locale-check
/usr/games
/usr/include
/usr/lib
/usr/lib/os-release
/usr/sbin
/usr/share
/usr/share/base-files
/usr/share/base-files/dot.bashrc
/usr/share/base-files/dot.profile
/usr/share/base-files/dot.profile.md5sums
/usr/share/base-files/info.dir
/usr/share/base-files/motd
/usr/share/base-files/networks
/usr/share/base-files/profile
/usr/share/base-files/profile.md5sums
/usr/share/base-files/staff-group-for-usr-local
/usr/share/common-licenses
/usr/share/common-licenses/Apache-2.0
/usr/share/common-licenses/Artistic
/usr/share/common-licenses/BSD
/usr/share/common-licenses/CC0-1.0
/usr/share/common-licenses/GFDL-1.2
/usr/share/common-licenses/GFDL-1.3
/usr/share/common-licenses/GPL-1
/usr/share/common-licenses/GPL-2
/usr/share/common-licenses/GPL-3
/usr/share/common-licenses/LGPL-2
/usr/share/common-licenses/LGPL-2.1
/usr/share/common-licenses/LGPL-3
/usr/share/common-licenses/MPL-1.1
/usr/share/common-licenses/MPL-2.0
/usr/share/dict
/usr/share/doc
/usr/share/doc/base-files
/usr/share/doc/base-files/README
/usr/share/doc/base-files/README.FHS
/usr/share/doc/base-files/changelog.gz
/usr/share/doc/base-files/copyright
/usr/share/info
/usr/share/lintian
/usr/share/lintian/overrides
/usr/share/lintian/overrides/base-files
/usr/share/man
/usr/share/misc
/usr/src
/var
/var/backups
/var/cache
/var/lib
/var/lib/dpkg
/var/lib/misc
/var/local
/var/lock
/var/log
/var/run
/var/spool
/var/tmp
/etc/os-release
/usr/share/common-licenses/GFDL
/usr/share/common-licenses/GPL
/usr/share/common-licenses/LGPL
/usr/share/doc/base-files/FAQ
$
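
The gap between the two listings above can be computed as a set difference. This is a minimal Python sketch, assuming the file contents have already been read into strings. The helper names md5sums_paths and list_paths are illustrative and are not part of syft, and the samples are abbreviated from the output above.

```python
def md5sums_paths(text: str) -> set[str]:
    """Paths from a .md5sums file, normalised to absolute form."""
    paths = set()
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is "<md5>  <relative path>"; split once on whitespace.
        _digest, path = line.split(None, 1)
        paths.add("/" + path.strip())
    return paths


def list_paths(text: str) -> set[str]:
    """Paths from a .list file (one absolute path per line)."""
    return {line.strip() for line in text.splitlines() if line.strip()}


# Abbreviated samples taken from the base-files output above.
md5sums_sample = """\
75176163fdc8090948d909cc7e4ab4bf  usr/lib/os-release
c686090b1ff44554e2d6c541280a55b1  usr/share/doc/base-files/copyright
"""

list_sample = """\
/.
/usr
/usr/lib
/usr/lib/os-release
/usr/share/doc/base-files/copyright
/etc/os-release
/usr/share/common-licenses/GPL
"""

# Entries that only the .list file knows about.
only_in_list = sorted(list_paths(list_sample) - md5sums_paths(md5sums_sample))
print(only_in_list)
```

In this sample the difference contains the directory entries plus /etc/os-release and /usr/share/common-licenses/GPL, which have no checksum line.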

Why is this needed:

This would enable more accurate file<->package tracking.

It would also enable someone to use data from syft to build an equivalent of dpkg -S.
Consider the following:

# dpkg -S /usr/share/common-licenses/GPL
base-files: /usr/share/common-licenses/GPL

It's not possible to directly answer this question with data from syft right now.
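
As a rough illustration of what such a lookup could look like if the .list contents were available, here is a minimal Python sketch. It assumes a mapping from package name to the text of its .list file; build_owner_index and search are hypothetical names and not syft APIs, and the output format only loosely mimics dpkg -S.

```python
from collections import defaultdict


def build_owner_index(list_files: dict[str, str]) -> dict[str, set[str]]:
    """Map each listed path to the set of packages whose .list mentions it.

    `list_files` maps a package name to the text of its .list file.
    """
    index: dict[str, set[str]] = defaultdict(set)
    for package, text in list_files.items():
        for line in text.splitlines():
            path = line.strip()
            if path:
                index[path].add(package)
    return index


def search(index: dict[str, set[str]], path: str):
    """Return "pkg[, pkg...]: path", or None if no .list mentions the path."""
    owners = index.get(path)
    if not owners:
        return None
    return f"{', '.join(sorted(owners))}: {path}"


# Abbreviated, illustrative inputs based on the listings in this issue.
index = build_owner_index({
    "base-files": "/.\n/usr\n/usr/share/common-licenses/GPL\n",
    "tar": "/.\n/usr\n/bin/tar\n",
})
print(search(index, "/usr/share/common-licenses/GPL"))
print(search(index, "/usr"))
```

A path listed by several packages, such as /usr here, comes back with every package that mentions it, so the index records mentions and not exclusive ownership.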

@errordeveloper errordeveloper added the enhancement New feature or request label Mar 4, 2022
@luhring luhring added the ecosystem:os relating to an OS packaging ecosystem label Mar 25, 2022
@joshbressers joshbressers added the good-first-issue Good for newcomers label Jul 19, 2022
@spiffcs
Contributor

spiffcs commented Aug 25, 2022

This is an interesting enhancement since it would help a lot with #931. If we can link the package information back to the correct package manager, then when we do vulnerability scanning we won't mislabel a package as installed on its own rather than as coming from a Linux distribution's source.

We'll start investigating @errordeveloper so we can get more fidelity around these file relationships!

@tgerla tgerla removed the good-first-issue Good for newcomers label Feb 2, 2023
@spiffcs
Contributor

spiffcs commented Feb 9, 2023

After investigating this one we're going to close this as not planned.

*.list files do contain useful information about which files and directories a package needs in order to exist; however, they are not authoritative on ownership in the same way that syft represents ownership of a directory or file.

Example:

cat /var/lib/dpkg/info/tar.list
/.
/bin
/bin/tar
/etc
/usr
/usr/lib
/usr/lib/mime
/usr/lib/mime/packages
/usr/lib/mime/packages/tar
/usr/sbin
/usr/sbin/rmt-tar
/usr/sbin/tarcat
/usr/share
/usr/share/doc
/usr/share/doc/tar
/usr/share/doc/tar/AUTHORS
/usr/share/doc/tar/NEWS.gz
/usr/share/doc/tar/README.Debian
/usr/share/doc/tar/THANKS.gz
/usr/share/doc/tar/changelog.Debian.gz
/usr/share/doc/tar/copyright
/usr/share/man
/usr/share/man/man1
/usr/share/man/man1/tar.1.gz
/usr/share/man/man1/tarcat.1.gz
/usr/share/man/man8
/usr/share/man/man8/rmt-tar.8.gz
/etc/rmt

In this case we don't want to treat all of the shared directory paths as owned by the tar package.

In dpkg, multiple packages can own the same directories. This is not a paradigm we want to introduce, as it would cause too much noise when we try to examine ownership-by-file-overlap relationships.
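
One way to reduce that noise would be to drop entries that are evidently directories before looking at overlap. The Python sketch below uses a heuristic that is an assumption, not dpkg or syft behaviour: an entry is treated as a directory when another entry in the same .list sits underneath it. An empty directory shipped by a package would be misclassified as a leaf. The function name leaf_paths is illustrative.

```python
def leaf_paths(list_text: str) -> list[str]:
    """Return .list entries that are not a parent of any other entry."""
    entries = [line.strip() for line in list_text.splitlines() if line.strip()]
    # "/." is the root placeholder entry and never a package-specific file.
    entries = [entry for entry in entries if entry != "/."]

    # Collect every ancestor directory of every entry.
    parents = set()
    for entry in entries:
        parent = entry.rsplit("/", 1)[0]
        while parent:
            parents.add(parent)
            parent = parent.rsplit("/", 1)[0]

    # Keep original order, minus anything that acts as a parent.
    return [entry for entry in entries if entry not in parents]


# Abbreviated from the tar.list example above.
tar_list = """\
/.
/bin
/bin/tar
/etc
/usr
/usr/sbin
/usr/sbin/tarcat
/etc/rmt
"""
print(leaf_paths(tar_list))
```

On this abbreviated input only /bin/tar, /usr/sbin/tarcat and /etc/rmt remain, which are the paths one would expect to attribute to tar.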

If there is more information we're missing, we're happy to discuss and reopen, but for now this is marked as not planned.

@spiffcs spiffcs closed this as not planned Feb 9, 2023
@errordeveloper
Author

@spiffcs thanks for looking into this! What if syft could just provide an optional collector for these files and store the results in an additional field that a downstream tool could take into consideration only when needed? See my example with that licence file: I couldn't work out where it came from using syft's output at the time.
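
To make the suggestion concrete, here is a small Python sketch of an opt-in field. Everything in it is hypothetical: DpkgPackageRecord, listed_paths, collect and include_list are made-up names for illustration and do not reflect syft's actual schema or configuration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DpkgPackageRecord:
    name: str
    # Paths from .md5sums/.conffiles, which syft already parses.
    files: list[str] = field(default_factory=list)
    # Hypothetical opt-in field: raw .list entries, with no ownership implied.
    listed_paths: Optional[list[str]] = None


def collect(name: str,
            checksummed_paths: list[str],
            list_text: Optional[str] = None,
            include_list: bool = False) -> DpkgPackageRecord:
    """Build a record; .list entries are attached only when explicitly requested."""
    record = DpkgPackageRecord(name=name, files=list(checksummed_paths))
    if include_list and list_text is not None:
        record.listed_paths = [
            line.strip() for line in list_text.splitlines() if line.strip()
        ]
    return record


default_record = collect("base-files", ["/usr/lib/os-release"], "/.\n/etc/os-release\n")
opt_in_record = collect("base-files", ["/usr/lib/os-release"],
                        "/.\n/etc/os-release\n", include_list=True)
print(default_record.listed_paths, opt_in_record.listed_paths)
```

Keeping the extra data in a separate field that defaults to None would leave existing ownership relationships untouched while still giving downstream tools a hint.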

@kzantow
Contributor

kzantow commented Feb 9, 2023

@errordeveloper could you expand a bit on what you are looking to accomplish? Is it just trying to determine which package "owns" a file?

@errordeveloper
Author

Is it just trying to determine which package "owns" a file?

@kzantow yes, in that particular case I wished I had some kind of hint; instead I ended up hard-coding some lists of files.

@kzantow
Contributor

kzantow commented Feb 9, 2023

@errordeveloper the problem we found with the .list files is that the same files are referenced in multiple .list files, so there isn't a clear package owner of a file. Is there any more information you could point us to that might help identify the right package for a particular file?
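
The extent of that overlap can be measured by splitting multiply-listed paths into directory-like entries and leaf entries. The Python sketch below is only a heuristic under stated assumptions: a path counts as directory-like when some other listed path sits underneath it, shared_paths is an illustrative name, and the pkg-a/pkg-b inputs are invented to show the ambiguous case.

```python
from collections import defaultdict


def shared_paths(list_files: dict[str, str]):
    """Split paths listed by two or more packages into (directories, leaves)."""
    owners = defaultdict(set)
    for package, text in list_files.items():
        for line in text.splitlines():
            path = line.strip()
            if path and path != "/.":
                owners[path].add(package)

    # A path is directory-like if any listed path has it as an ancestor.
    parents = set()
    for path in owners:
        parent = path.rsplit("/", 1)[0]
        while parent:
            parents.add(parent)
            parent = parent.rsplit("/", 1)[0]

    shared_dirs, shared_leaves = {}, {}
    for path, packages in owners.items():
        if len(packages) < 2:
            continue
        target = shared_dirs if path in parents else shared_leaves
        target[path] = sorted(packages)
    return shared_dirs, shared_leaves


# Abbreviated from the listings in this issue: only directories overlap.
dirs, leaves = shared_paths({
    "base-files": "/.\n/usr\n/usr/share\n/usr/share/doc\n/usr/share/doc/base-files/copyright\n",
    "tar": "/.\n/usr\n/usr/share\n/usr/share/doc\n/usr/share/doc/tar/copyright\n",
})
print(sorted(dirs), leaves)

# Invented input showing a leaf path listed by two packages.
_, ambiguous = shared_paths({"pkg-a": "/opt/shared.txt\n", "pkg-b": "/opt/shared.txt\n"})
print(ambiguous)
```

If the leaf result is empty for a given image, the overlap is confined to directories, and filtering those out would leave a single candidate package per file.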

@errordeveloper
Author

...we found is that the same files are referenced in multiple .list files, so there isn't a clear package owner of a file.

Are you referring to directories, as @spiffcs pointed out, or files also?

Is there any more information you could point us to that might help identify the right package for a particular file?

No, I am afraid I don't. I don't believe any package manager does this perfectly. I guess the only thing I can keep referring to is the dpkg output I showed above, which is an important factor.


6 participants