Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix duplicate tags when decoding MPL2 #68

Merged
merged 5 commits into from
Nov 9, 2017

Conversation

MoshiMoshi0
Copy link
Contributor

MPL2:

//All things weird are normal
/_/_in this whore of cities.

Internal markup (before):

<i><i>All things weird are normal</i></i>
<i><u><i><u>in this whore of cities.</u></i></u></i>

Internal markup (after):

<i>/All things weird are normal</i>
<i><u>/_in this whore of cities.</u></i>

@otsaloma
Copy link
Owner

otsaloma commented Nov 8, 2017

Nice to see someone look at the code, those markup conversions are not from the simplest or clearest end.

Why do you suggest to leave the duplicate tag as part of the text? Wouldn't it make more sense to remove the duplicates, something like for tag in reversed(sorted(set(match.group(1)))): in the existing code?

@MoshiMoshi0
Copy link
Contributor Author

That was actually my first thought.

But then what if you want to display one of the tags as text in the beginning of the line? You would need to offset with space.
But I guess it might be fine because if you dont want any style on the line AND you want tag as text you need to offset with space too.

So its up to you, I can revert and just remove the duplicate tags. Dont know if there is any MPL2 docs as I cant find any, maybe its described there.

@otsaloma
Copy link
Owner

otsaloma commented Nov 8, 2017

I think those characters have been chosen for markup specifically because no meaningful text begins with "\", "/" or "_". I think we can assume those duplicates are markup errors, not in any part intended. So, yes, please change the code to remove duplicate tags, that should be a single line change to the existing code, and update the unit test.

For docs, see below, not that it helps much.

http://web.archive.org/web/20090328040233/http://napisy.ussbrowarek.org/mpl2-eng.html

@MoshiMoshi0
Copy link
Contributor Author

Done.
The other tests were failing because the order got changed. Figured it would be better to preserve order then to fix the tests. It would be very confusing for others anyway.
Hope you are fine with one import, could be done with no imports but the code is cleaner this way.

I tried digging more info on MPL2 but could not find anything more than in your link. I also checked out how some players like mphc parse MPL2 and could not find bold/underline tags mentioned anywhere, so I dont know how they ended up here.

@otsaloma
Copy link
Owner

otsaloma commented Nov 9, 2017

Thanks, looks good to me!

@otsaloma otsaloma merged commit d6a8049 into otsaloma:master Nov 9, 2017
@otsaloma
Copy link
Owner

otsaloma commented Nov 9, 2017

I also checked out how some players like mphc parse MPL2 and could not find bold/underline tags mentioned anywhere, so I dont know how they ended up here.

I can't find the source anymore, but my recollection is that there was a longer Polish-language documentation that included those bold and underline and I have followed that documentation. I don't remember how official that was. From there I have also taken the option to use MicroDVD markup with MPL2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants