Gather data on Extended Attributes and Alternate Data Streams/File Forks #180

Open
nightlark opened this issue Apr 11, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@nightlark
Collaborator

Most operating systems (Linux, macOS, and even Windows) support attaching key-value pairs to files and folders via Extended Attributes. The data in these is not included in the file hash, and depending on the OS a single value can theoretically hold an entire file's worth of content (the max value size ranges from something like 4KB up to the max size of a regular file). We should try to capture identifying information about the extended attributes that are present, and in some cases capture their contents to help identify a file (web browsers will often add the URL a file was downloaded from as an extended attribute).
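As a rough sketch of what capturing this could look like in Python: the standard library only exposes `os.listxattr`/`os.getxattr` on Linux, so the function below returns an empty result elsewhere. The function name `summarize_xattrs` and the output layout are made up for illustration, not an existing API in this project.

```python
import hashlib
import os


def summarize_xattrs(path):
    """Return {attribute name: {"size": ..., "sha256": ...}} for path.

    Hypothetical helper. Returns an empty dict when Python does not expose
    the xattr functions (they are Linux-only in the standard library) or
    when the file system refuses the request.
    """
    if not hasattr(os, "listxattr"):
        return {}
    try:
        names = os.listxattr(path, follow_symlinks=False)
    except OSError:
        return {}
    summary = {}
    for name in names:
        try:
            value = os.getxattr(path, name, follow_symlinks=False)
        except OSError:
            # The attribute may be unreadable (e.g. permissions); skip it.
            continue
        summary[name] = {
            "size": len(value),
            "sha256": hashlib.sha256(value).hexdigest(),
        }
    return summary
```

Recording a size and hash per attribute keeps the output small even when a value is large; the raw value could be stored as well for attributes known to hold a download URL.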

Some operating systems also support alternate data streams (Windows) or file/resource forks (macOS, kind of, and certain file systems like ZFS for BSD/Linux, potentially Solaris). These can be entirely separate "hidden" files attached to a file, often with no limit on the maximum size of the data, and the file hash we capture doesn't include any of this information. We should check files for the presence of these alternate data streams and capture their hashes. The contents may also be interesting (e.g. Windows web browsers storing the URL a downloaded file came from).
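For the Windows download-URL case, a minimal sketch might look like the following. It assumes the stream is named `Zone.Identifier`, contains INI-style text with a `[ZoneTransfer]` section, and can be opened with the `file:stream` path syntax on NTFS. All function names here are hypothetical, and enumerating arbitrary streams (which would need the Win32 stream enumeration API) is not shown.

```python
import configparser
import hashlib


def parse_zone_identifier(text):
    """Parse Zone.Identifier text into a dict of its [ZoneTransfer] fields."""
    # interpolation=None because URLs routinely contain '%'.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key case (ZoneId, HostUrl, ...)
    try:
        parser.read_string("\n".join(text.splitlines()))
    except configparser.Error:
        return {}
    if not parser.has_section("ZoneTransfer"):
        return {}
    return dict(parser["ZoneTransfer"])


def read_stream(path, stream_name):
    """Read a named stream. Only expected to work on Windows with NTFS."""
    with open(f"{path}:{stream_name}", "rb") as handle:
        return handle.read()


def describe_zone_identifier(path):
    """Hash the Zone.Identifier stream of path and parse its fields."""
    data = read_stream(path, "Zone.Identifier")
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        # The encoding is an assumption; undecodable bytes are replaced.
        "fields": parse_zone_identifier(data.decode("utf-8", errors="replace")),
    }
```

Keeping the parser separate from the stream read means the parsing can be exercised on any OS, while `read_stream` stays a thin Windows-only wrapper.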

The trickiest bit is that this information often is not preserved when moving between file systems/OSes. Some detection logic to see if e.g. a tar file stores extended attributes would be useful, so we can warn the user creating an SBOM that they might be missing out on capturing some information. It would also be a good idea to test different archive/fs formats (tar, squashfs, 7z, zip, etc.) to see which can preserve these things in the archive, and which OSes the information can be extracted on.
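For tar specifically, one possible detection approach is to look at the pax extended headers of each member. The sketch below assumes xattrs show up under the `SCHILY.xattr.` or `LIBARCHIVE.xattr.` key prefixes (the conventions used by GNU tar and libarchive as far as I know); other tools or formats may store them differently, so an empty result does not prove the archive has none. The function name is hypothetical.

```python
import tarfile

# Assumed pax keyword prefixes for stored extended attributes.
XATTR_PAX_PREFIXES = ("SCHILY.xattr.", "LIBARCHIVE.xattr.")


def find_xattrs_in_tar(fileobj):
    """Return {member name: [pax keys that look like stored xattrs]}."""
    found = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive:
            keys = sorted(
                key for key in member.pax_headers
                if key.startswith(XATTR_PAX_PREFIXES)
            )
            if keys:
                found[member.name] = keys
    return found
```

A non-empty result could be used to tell the user that the archive carries extended attributes that may be lost when it is extracted on another OS or file system.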

@nightlark nightlark added the enhancement New feature or request label Apr 11, 2024
@nightlark
Collaborator Author

nightlark commented May 24, 2024

On the subject of "hidden" files, at Black Hat Asia 2024, there was this interesting talk: https://www.blackhat.com/asia-24/briefings/schedule/#magicdot-a-hackers-magic-show-of-disappearing-dots-and-spaces-36561

Essentially, Windows converts paths such as C:\Windows\etc to an NtPath of the form \??\C:\Windows\etc for file operations. However, to maintain backwards compatibility it has some interesting behavior: it removes trailing dots from file names and removes trailing spaces from the end of the last path element. This leads to weird things like deleting "a.txt." actually deleting a file named "a.txt", and by using shortnames (another backwards compatibility feature) it is possible to target files with a completely different name. Hidden files can also be created in zip files using this same file naming trick on Windows. Coupled with new-ish support for symlinks, things can get interesting.
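A simple check for the zip case could flag member names that contain a path element ending in a dot or a space. This is deliberately more conservative than the behavior described above (it flags trailing spaces in any path element, not only the last one), and the function name is made up for illustration.

```python
import zipfile


def suspicious_zip_names(fileobj):
    """Return zip member names with a path element ending in '.' or ' '."""
    flagged = []
    with zipfile.ZipFile(fileobj) as archive:
        for name in archive.namelist():
            # Treat both separators alike; ignore empty, "." and ".." elements.
            parts = [
                part for part in name.replace("\\", "/").split("/")
                if part not in ("", ".", "..")
            ]
            if any(part.endswith((".", " ")) for part in parts):
                flagged.append(name)
    return flagged
```

Names flagged this way may collide with, or hide behind, a differently named file once extracted on Windows, so they seem worth surfacing in a report.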
