Unexpected edge case behavior when using text shaping #884

Closed
marcstober opened this issue Aug 11, 2023 · 2 comments

@marcstober

I am experiencing some unexpected behavior when using the new text shaping feature (#820) with some very specific code. In one case, question-mark-in-a-box characters appear (as if a glyph were missing); in another case, some text does not appear at all.

NOTE: I discovered these issues in testing bidi with text shaping (see #549). I realize bidi does not work with text shaping, for which there is #882. However, I am seeing some unexpected behavior beyond just the text shaping not working.
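
To make the mixed-direction problem concrete, here is a minimal sketch that splits a string into coarse directional runs using only the standard library. It is not the Unicode Bidirectional Algorithm and is not how fpdf2 or python-bidi work internally; the function name `directional_runs` is hypothetical and exists only for illustration. It may help when narrowing down which part of a string triggers the behavior described here.

```python
import unicodedata


def directional_runs(text):
    """Split text into coarse runs tagged 'R' (right-to-left), 'L' (left-to-right)
    or 'N' (neutral: spaces, punctuation, digits, ...).

    Simplification (assumption): combining marks (bidi class NSM), such as
    Hebrew vowel points, are kept with the run of the preceding character.
    """
    runs = []  # list of [direction, characters]
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "NSM" and runs:
            runs[-1][1] += ch
            continue
        if bidi in ("R", "AL"):
            direction = "R"
        elif bidi == "L":
            direction = "L"
        else:
            direction = "N"
        if runs and runs[-1][0] == direction:
            runs[-1][1] += ch
        else:
            runs.append([direction, ch])
    return [(direction, chars) for direction, chars in runs]


if __name__ == "__main__":
    # One of the strings from the script below: Hebrew, punctuation and Latin mixed.
    for direction, chars in directional_runs("אנגלית (באנגלית: English) ה"):
        print(direction, repr(chars))
```

Rendering each run in its own cell is one possible way to find the run after which the output breaks.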

Steps to Reproduce

Run this script:

import os.path

from fpdf import FPDF

from bidi.algorithm import get_display 

pdf = FPDF(unit="in", format="Letter")
pdf.add_font("SBL_Hbrw", fname="SBL_Hbrw.ttf")
pdf.set_font("SBL_Hbrw", "", 30)

pdf.set_text_shaping(True)

pdf.add_page()

some_text = "בְּרֵאשִׁ֖ית"

# Hebrew by itself with vowels/points works with text shaping turned on
pdf.set_xy(1, 1)
pdf.cell(6.5, .8, some_text)

# Trying to use get_display() with text shaping - it puts consonants in the right order, but the vowels/points are wrong 
# - but worse, it corrupts the following text with a bunch of question-mark-in-a-box characters.
pdf.set_xy(1, 3)
pdf.cell(6.5, .8, get_display("The first word of the Bible is " + some_text + "."))

# this text is corrupt with a bunch of question-mark-in-a-box characters
pdf.set_xy(1, 4)
pdf.cell(6.5, .8, 'אנגלית (באנגלית: English) ה') 

# this text is corrupt with a bunch of question-mark-in-a-box characters
pdf.set_xy(1, 5)
pdf.cell(6.5, .8, 'אנגלית (באנגלית: English) ') 


filename = os.path.splitext(__file__)[0] + ".pdf"
pdf.output(filename)

The output is:
hebrew-missing_glyphs.pdf
Image exported from Adobe Acrobat: [image: hebrew-missing_glyphs]
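
Since the boxes look like missing glyphs, one way to rule out plain font coverage is to compare the text against the font's character map. The sketch below is an assumption-laden diagnostic, not part of fpdf2: `missing_codepoints` is a hypothetical helper, and the commented usage assumes fontTools is installed and that `SBL_Hbrw.ttf` is in the working directory.

```python
def missing_codepoints(text, cmap):
    """Return the code points in `text` that have no entry in `cmap`.

    `cmap` is any mapping keyed by integer code point, for example the dict
    returned by fontTools' TTFont.getBestCmap(). Results keep first-seen
    order and contain no duplicates.
    """
    missing = []
    for ch in text:
        code_point = ord(ch)
        if code_point not in cmap and code_point not in missing:
            missing.append(code_point)
    return missing


# Usage sketch (assumes fontTools and the font file are available):
# from fontTools.ttLib import TTFont
# cmap = TTFont("SBL_Hbrw.ttf").getBestCmap()
# print([f"U+{cp:04X}" for cp in missing_codepoints("English", cmap)])
```

If this returns an empty list for every string in the script, the font itself maps all the characters, which would suggest the boxes come from somewhere else in the rendering path.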

Also run this script, which is almost the same but omits the call to get_display():

import os.path

from fpdf import FPDF

pdf = FPDF(unit="in", format="Letter")
pdf.add_font("SBL_Hbrw", fname="SBL_Hbrw.ttf")
pdf.set_font("SBL_Hbrw", "", 30)

pdf.set_text_shaping(True)

pdf.add_page()

some_text = "בְּרֵאשִׁ֖ית"

# Hebrew by itself with vowels/points works with text shaping turned on
pdf.set_xy(1, 1)
pdf.cell(6.5, .8, some_text)

# Mixed LTR/RTL text with text shaping but without get_display() - the consonants come out in the wrong order, which is expected for now;
# BUT the following text cells don't appear at all, which is unexpected.
pdf.set_xy(1, 3)
pdf.cell(6.5, .8, "The first word of the Bible is " + some_text + ".")

# this doesn't appear!
pdf.set_xy(1, 4)
pdf.cell(6.5, .8, 'אנגלית (באנגלית: English) ה') 

# this doesn't appear!
pdf.set_xy(1, 5)
pdf.cell(6.5, .8, 'אנגלית (באנגלית: English) ') 


filename = os.path.splitext(__file__)[0] + ".pdf"
pdf.output(filename)

The output is:
hebrew-missing_text.pdf
I tried to export an image from Acrobat, but got this error about "Unterminated string":
[image: Screenshot 2023-08-11 185918]

Environment

  • Operating System: Windows 11
  • Python version: 3.10.5
  • fpdf2 version used: 2.7.5 (editable install from github)
  • PDF viewer: Adobe Acrobat Standard, 64 bit, v. 2023.003.20269
marcstober added the bug label Aug 11, 2023

Lucas-C (Member) commented Aug 13, 2023

Thank you for the detailed bug report @marcstober!

I'm not an expert on text shaping.
Maybe @gmischler or @andersonhc would like to take a look at this?

Lucas-C (Member) commented Aug 15, 2023

This was solved by @andersonhc in PR #889.
