Prefer application/x-ole-storage instead of application/x-tika-msoffice #54

Merged on May 21, 2021 (1 commit)

Conversation

gmcgibbon
Member

@gmcgibbon gmcgibbon commented May 19, 2021

Closes #44

Prefer application/x-ole-storage as a fallback for ms-office documents. This was the behaviour of Marcel 0.3.3.

However, I'm resorting to this because I can't find a good magic matcher to identify *.olf files. The header of the binary file is already used to identify it as an OLE/Office file, and there doesn't appear to be (AFAICS) any byte sequence at a consistent offset that would denote application/vnd.ms-outlook.

The more specific matchers for Office subtypes appear to be very subtle, though. For example, the one we use for application/msword:

marcel/data/custom.xml

Lines 13 to 17 in 85c2559

<magic priority="50">
  <match value="0xd0cf11e0a1b11ae1" type="string" offset="0:8">
    <match value="jbjb" type="string" offset="546" />
    <match value="bjbj" type="string" offset="546" />
  </match>
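
To make the rule above and the proposed fallback concrete, here is a minimal Ruby sketch of what they check by hand. It is not Marcel's implementation: the helper names (`ole_container?`, `msword?`, `detect`) are hypothetical, and it assumes that `offset="0:8"` means the signature may start anywhere between byte offsets 0 and 8.

```ruby
# Illustrative only: hypothetical helpers, not part of Marcel.
# Signature bytes taken from the outer <match> value 0xd0cf11e0a1b11ae1.
OLE_SIGNATURE = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1".b
# Markers and offset taken from the two inner <match> elements.
WORD_MARKERS = ["jbjb".b, "bjbj".b].freeze
WORD_MARKER_OFFSET = 546

# Outer match: the signature may start at any offset in 0..8
# (assumed reading of offset="0:8").
def ole_container?(data)
  window = data.byteslice(0, 8 + OLE_SIGNATURE.bytesize)
  !window.nil? && window.include?(OLE_SIGNATURE)
end

# Inner matches: either marker must sit exactly at byte offset 546.
def msword?(data)
  ole_container?(data) &&
    WORD_MARKERS.include?(data.byteslice(WORD_MARKER_OFFSET, 4))
end

# Try the specific subtype first; fall back to the generic OLE type
# when only the container signature matches.
def detect(bytes)
  data = bytes.b
  return "application/msword" if msword?(data)
  return "application/x-ole-storage" if ole_container?(data)

  nil
end
```

With this shape, a file that carries the OLE signature but none of the subtype markers (as appears to be the case for the Outlook files discussed above) is reported as application/x-ole-storage rather than as an internal grouping type.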

Regardless, I think we can agree that x-tika-msoffice was supposed to be an internal type grouping that shouldn't be surfaced in mime type detection.

Prefer application/x-ole-storage as a fallback for ms-office documents.
@gmcgibbon gmcgibbon requested a review from pixeltrix May 19, 2021 23:30
@pixeltrix
Contributor

@gmcgibbon I'll leave these for you to merge after approval so you can manage any conflicts that may arise.

@gmcgibbon gmcgibbon merged commit 9f34b22 into rails:main May 21, 2021
@gmcgibbon gmcgibbon deleted the application/x-ole-storage branch May 21, 2021 19:13
thomasleese added a commit to ministryofjustice/hmpps-book-secure-move-api that referenced this pull request Oct 8, 2021
Following a recent change in Marcel:
rails/marcel#54
Development

Successfully merging this pull request may close these issues.

Various office files wrong mimetype
2 participants