Make mso detection work similar to what file/file does #587

Merged (2 commits) on Oct 8, 2024


gabriel-vasile
Owner

https://github.com/file/file/blob/7c62d696b06e53fc5be015c41a57513278ac6c54/magic/Magdir/msooxml
The algorithm used by file/file is not 100% reliable. For example, a zero-compression zip containing a docx will still sometimes be detected as docx instead of zip (it depends on how many files the zip contains and on their order). There is no way to make mso detection 100% reliable, because zip entries can only be traversed properly when the central directory is available. That's not the case with mimetype, because it is limited to reading only the header of the file.
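To illustrate the limitation described above, here is a minimal sketch of a header-only heuristic. It is not the code in `internal/magic/zip.go` and not a port of the file/file magic rules; the names `entryNames`, `guess`, and `storedZip` are hypothetical, and the sketch assumes the detector simply scans the first bytes for local file header signatures (`PK\x03\x04`) and looks at the entry names it finds.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/binary"
	"fmt"
	"strings"
)

var localHeaderSig = []byte("PK\x03\x04")

// entryNames scans raw for local file header signatures and returns up to
// max entry names. It never consults the central directory, so it cannot
// tell a real entry from a header embedded in uncompressed entry data.
func entryNames(raw []byte, max int) []string {
	var names []string
	for len(names) < max {
		i := bytes.Index(raw, localHeaderSig)
		if i < 0 || len(raw) < i+30 {
			break
		}
		raw = raw[i:]
		// The name length is a little-endian uint16 at offset 26 of the
		// 30-byte fixed header; the name follows immediately after it.
		nameLen := int(binary.LittleEndian.Uint16(raw[26:28]))
		if len(raw) < 30+nameLen {
			break
		}
		names = append(names, string(raw[30:30+nameLen]))
		raw = raw[30+nameLen:]
	}
	return names
}

// guess is a hypothetical classifier based only on the first few names.
func guess(raw []byte) string {
	if !bytes.HasPrefix(raw, localHeaderSig) {
		return "unknown"
	}
	for _, n := range entryNames(raw, 4) {
		switch {
		case strings.HasPrefix(n, "word/"):
			return "docx"
		case strings.HasPrefix(n, "xl/"):
			return "xlsx"
		case strings.HasPrefix(n, "ppt/"):
			return "pptx"
		}
	}
	return "zip"
}

type file struct {
	name string
	data []byte
}

// storedZip builds an in-memory zip whose entries are stored uncompressed.
func storedZip(files ...file) []byte {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, f := range files {
		fw, err := w.CreateHeader(&zip.FileHeader{Name: f.name, Method: zip.Store})
		if err != nil {
			panic(err)
		}
		if _, err := fw.Write(f.data); err != nil {
			panic(err)
		}
	}
	if err := w.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	inner := storedZip(
		file{"[Content_Types].xml", []byte("<Types/>")},
		file{"_rels/.rels", []byte("<Relationships/>")},
		file{"word/document.xml", []byte("<document/>")},
	)
	outer := storedZip(file{"inner.docx", inner})
	plain := storedZip(file{"notes.txt", []byte("hello")})

	fmt.Println(guess(inner)) // docx
	fmt.Println(guess(outer)) // docx, although outer is a plain zip
	fmt.Println(guess(plain)) // zip
}
```

Because the outer entry is stored without compression, the inner docx's local file headers appear verbatim in the first bytes of the outer archive, and a scan that cannot consult the central directory has no way to tell them apart from top-level entries. With more or larger files ahead of the nested docx, the same scan would stop before reaching it, which is why the outcome depends on the number and order of files.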

The second change in this PR is the removal of some test data fixtures. From now on, I'll try as much as possible to write regular unit tests without relying on test file fixtures. See #575 (comment); related: #550, #575.
closes #400

internal/magic/zip.go (outdated, resolved)
The check is already done in the parent function.
gabriel-vasile merged commit 295fa96 into master on Oct 8, 2024
5 checks passed
gabriel-vasile deleted the zip_mso branch on October 8, 2024, 14:22
Development

Successfully merging this pull request may close these issues.

.zip incorrectly detected as .xlsx possible regression