-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add filename sanitizing to adapters #113
Conversation
Add failing ffmpeg tests Add filename sanitizing to ffmpeg and aws adapters Appease rubocop and adjust sanitizing Sanitzing adjustment Appease rubocop Adjust Sanitization Add additional sanitizing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about making ActiveEncode
be the path for calling the sanitization methods?
ActiveEncode.sanitize_filename(input_url)
This would avoid having to require the new module in all of the adapters and would make it easily available outside of ActiveEncode's codebase for use in apps that use ActiveEncode. It would make method calls in adapters longer but I think it would avoid some bloat since the methods are generic and the adapters don't really need to provide those methods.
I think this is possible by adding the require to lib/active_encode.rb
then declaring the module to extend the sanitization module:
module ActiveEncode
extend ActiveEncode::FilenameSanitizer
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look good to me, but I'm a little worried that there is the possibility that previously created encodes could be orphaned since this changes where ActiveEncode looks for files. Do you think this could possibly happen? If so, we could make this a major release due to the change in behavior.
I think the FileLocator is only really used by the engine adapters when doing the encoding and I don't think that previously created encodes reference it. But I may have overlooked something in the code or how external apps/prior encodes interact with it. So, I hope its not the case but it orphaning some previous encodes is a possibility. |
Co-authored-by: Chris Colvard <[email protected]> If a file is already successfully in an s3 bucket and is not being moved to a new bucket we do not need to do anything with the filename. If the file is being moved to a new bucket, we should sanitize it to the stricter set of allowed characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks for your effort on this!
This PR moves filename sanitizing from the ffmpeg and passthrough adapters into a separate module. Additional sanitizing is added and applied in the two listed adapters as well as the AWS adapters.
Sanitizing is strict, replacing any special characters outside the allowed set with
_
and removing all excess periods. The allowed set isA-Z
,a-z
,0-9
,.
,-
, and_
. This is a stricter version of the list provided by AWS for object-keys. Additionally,/
is not sanitized because of how Avalon filenames are formatted when passed to Active Encode.