-
-
Notifications
You must be signed in to change notification settings - Fork 404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved archive escaping of group and label in file/path names #1046
Conversation
Shouldn't the escaping be done using a whitelist (opt in), not a blacklist (opt out)? I.e., shouldn't it be sanitizing by allowing only certain characters through, rather than specifically removing os.sep? Maybe I'm confused about the purpose. |
I came to the same conclusion which is why I decided to sanitize in same style as the identifier sanitizer but only for specific characters that are problematic for filesystems. Here is the updated example: I think this PR is now ready to review. |
Looks fine to me, though I personally would rather have "," instead of "slash" and "backslash". To each his own. |
If you can suggest a similar replacement for |
Perhaps we should not worry about 'bad' filenames and just address the specific issue mentioned in #1021 i.e slashes causing problems with paths. Maybe we should just sanitize slash to , and backslash to ` and leave the filenames alone otherwise - there are a lot of characters that can cause problems in some situations (one site I've seen recommends avoiding all of |
I guess I can't think of any case where one would want to distinguish between slash and backslash, so I'm happy for those to map to the same thing. slash will come up for any mathematical expressions with division, which I think a lot of people will want to plot. It's hard to see how backslash would be useful in that context, and even harder to see how it would need to be distinguished from slash if it does somehow come up. So I'd favor being concise, which is very likely to be desirable, over distinguishing between cases I don't think would ever come up. In the same vein, I'm happy to map ":" to "=", as again I don't think those would need to be distinguished. But it's up to you; those are just my opinions and gut feelings, and you should go with your own on such ill-defined issues. |
In the end, I've opted to simply replace each of |
The tests have passed so I think this PR is now ready to merge. |
@@ -31,6 +31,15 @@ | |||
from .util import unique_iterator, group_sanitizer, label_sanitizer | |||
|
|||
|
|||
def sanitizer(name, replacements={':':'_', '/':'_', '\\':'_'}): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth sorting the replacement items so this is deterministic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I'll use a list of tuples to make sure the order is fixed.
I think this is now ready to merge once the tests pass. Note that I didn't test the last commit directly. It is a simple change and I'm confident it is correct but it never hurts to double check! |
May have to rebase or merge master to get the tests passing. |
The fact that the PR build passed but the push build didn't is confusing me. Seems to indicate that some interaction with master is making this misbehave. |
13c3974
to
210e09b
Compare
I've rebased against latest master. We'll see what happens now! |
Looks good and tests now passing! |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
The PR aims to address issue #1021. I now know where to implement the escaping fix as illustrated in the first commit. We should discuss how to make this prettier/more robust (I don't like the look of the escaped output yet):
I suppose that instead of using
'-sep-'
I should see if Python offers proper escaping of filenames/paths on Unix/Windows. As I said, this PR is still WIP!