Mixed RTL/LTR text does not follow dir="rtl" when the text starts with LTR characters #1724

Closed
akwizgran opened this issue Sep 22, 2022 · 1 comment

Comments

@akwizgran

Thank you for making this amazing piece of software available!

When trying to render an HTML page with dir="rtl" on the document body, WeasyPrint renders everything correctly except for text that starts with LTR characters.

When a heading starts with RTL characters followed by LTR characters, the LTR characters are correctly positioned to the left of the preceding RTL characters (i.e. at the visual end of the line). But when a heading starts with LTR characters, the LTR characters are wrongly positioned to the left of the succeeding RTL characters (i.e. still at the visual end of the line, although they come first in the text). In both cases, the whole text is correctly aligned to the right.
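A minimal reproduction of the two headings described above might look like the following sketch. The Persian sample word, the file name, and the function names are placeholders chosen for illustration; they are not taken from the Briar page. The rendering step assumes WeasyPrint is installed and uses its `HTML(string=...).write_pdf(...)` call.

```python
# Two headings inside a dir="rtl" body: one ends with the LTR word
# "Briar", the other starts with it.
REPRO_HTML = """<!DOCTYPE html>
<html lang="fa">
<head><meta charset="utf-8"></head>
<body dir="rtl">
<h1>{rtl} Briar</h1>
<h1>Briar {rtl}</h1>
</body>
</html>
"""


def build_repro(rtl_text="\u0633\u0644\u0627\u0645"):
    """Return the test page; rtl_text is a placeholder Persian word."""
    return REPRO_HTML.format(rtl=rtl_text)


def render(path="repro.pdf"):
    """Render the test page to a PDF (requires WeasyPrint)."""
    from weasyprint import HTML  # imported lazily so build_repro() works without it

    HTML(string=build_repro()).write_pdf(path)
```

Calling `render()` and opening the resulting file next to the same HTML in a browser should make the difference in the second heading visible, if the behaviour is as described.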

Example of correct rendering with Firefox:

[Image: Screenshot_2022-09-22_11-40-47]

In the first line, the LTR word "Briar", which comes at the end of the text, is correctly positioned at the left. In the second line, the LTR word "Briar", which comes at the start of the text, is correctly positioned at the right.

Example of incorrect rendering with WeasyPrint 56.1:

[Image: Screenshot_2022-09-22_11-41-17]

In the first line, the LTR word "Briar", which comes at the end of the text, is correctly positioned at the left. But in the second line, the LTR word "Briar", which comes at the start of the text, is wrongly positioned at the left.

I believe the issue is caused by WeasyPrint rather than by the PDF viewer (Evince), as printing the Firefox page to a PDF produces the correct result in the same PDF viewer.

Speculation: Some component (Pango?) is guessing whether each piece of text should be treated as LTR or RTL based on the initial characters. Perhaps this component can be persuaded to use the text direction that's currently in effect (which for this document is RTL throughout)?
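To make the speculation concrete, here is a small sketch of a "first strong character" heuristic, a simplified form of rules P2/P3 of the Unicode Bidirectional Algorithm that ignores isolates. It is only an illustration of the kind of guessing suspected here, not code from Pango or WeasyPrint; the function names and the Persian sample word are assumptions.

```python
import unicodedata


def first_strong_direction(text, default="ltr"):
    """Guess a base direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return default  # no strong character found (e.g. digits only)


def base_direction(text, declared=None):
    """Prefer a declared direction (e.g. from dir="rtl"); otherwise guess."""
    return declared or first_strong_direction(text)


RTL_WORD = "\u0633\u0644\u0627\u0645"  # placeholder Persian word

heading_1 = RTL_WORD + " Briar"  # starts with RTL characters
heading_2 = "Briar " + RTL_WORD  # starts with LTR characters
```

With this heuristic, `heading_1` is guessed as `"rtl"` and `heading_2` as `"ltr"`, which would match the observed symptom; passing `declared="rtl"` gives `"rtl"` for both.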

The page in question is here: https://briarproject.org/quick-start/fa

@liZe
Member

liZe commented Sep 23, 2022

Hi!

Thanks for this bug report and for the kind words.

> I believe the issue is caused by WeasyPrint rather than by the PDF viewer (Evince), as printing the Firefox page to a PDF produces the correct result in the same PDF viewer.

It is. Bidirectional text is known to be very limited in WeasyPrint.

> Speculation: Some component (Pango?) is guessing whether each piece of text should be treated as LTR or RTL based on the initial characters. Perhaps this component can be persuaded to use the text direction that's currently in effect (which for this document is RTL throughout)?

Pango works very well with RTL and bidi; the problem is in WeasyPrint. When we only have RTL or LTR text, things are often OK because the different glyph clusters have to be displayed in the same order. When we have bidirectional text, we have to take care of the order ourselves, and that’s where we don’t have the code (yet) to handle this.
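As a rough illustration of what "taking care of the order" involves, the sketch below reorders text runs following rule L2 of the Unicode Bidirectional Algorithm (UAX #9), with embedding levels assigned as in rules I1/I2 for strongly directional runs only. This is not WeasyPrint's code; the function names and the `"FA"` stand-in for a Persian run are assumptions made for the example.

```python
def assign_levels(runs, base_level):
    """runs: list of (text, "ltr" | "rtl") in logical order.

    Runs matching the base direction stay at base_level;
    runs in the opposite direction go one level up.
    """
    base_dir = "rtl" if base_level % 2 else "ltr"
    return [(text, base_level if d == base_dir else base_level + 1)
            for text, d in runs]


def visual_order(runs):
    """Reorder (text, level) runs into left-to-right visual order (rule L2)."""
    runs = list(runs)
    odd_levels = [level for _, level in runs if level % 2]
    if not odd_levels:
        return runs  # purely LTR: logical order is visual order
    highest = max(level for _, level in runs)
    for level in range(highest, min(odd_levels) - 1, -1):
        i = 0
        while i < len(runs):
            if runs[i][1] >= level:
                j = i
                while j < len(runs) and runs[j][1] >= level:
                    j += 1
                # Reverse each maximal sequence of runs at this level or higher.
                runs[i:j] = reversed(runs[i:j])
                i = j
            else:
                i += 1
    return runs


heading = [("Briar", "ltr"), ("FA", "rtl")]  # "FA" stands in for Persian text
as_rtl = [t for t, _ in visual_order(assign_levels(heading, 1))]
as_ltr = [t for t, _ in visual_order(assign_levels(heading, 0))]
```

Here `as_rtl` is `["FA", "Briar"]` (the LTR word ends up on the right, as in Firefox), while `as_ltr` is `["Briar", "FA"]` (the LTR word on the left, as in the incorrect rendering).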

The overall support for RTL and bidi improves regularly, but clean bidi support would require a lot of work. If you want it to happen faster, you can even answer our survey! RTL support wasn’t very popular last year, but it may be better in 2022.

I’ll close this issue as a duplicate of #106; we can continue the discussion over there.

liZe closed this as not planned on Sep 23, 2022