Punctuation wrongly affects character count for hyphenation #109
Comments
Yes, that's a problem. We must rewrite the way WeasyPrint handles text; that's on our TODO list, and there are some annoying bugs related to that limitation (#74, #100, #106). Of course, it may also be possible to write a quick fix for this bug, but rewriting the whole module will be necessary for the other bugs.
netbsd-srcmastr referenced this issue in NetBSD/pkgsrc on Feb 21, 2019
Version 45
----------

Released on 2019-02-20.

WeasyPrint now has a `code of conduct <https://github.com/Kozea/WeasyPrint/blob/master/CODE_OF_CONDUCT.rst>`_.

A new website has been launched, with beautiful and useful graphs about speed and memory use across versions: check `WeasyPerf <https://kozea.github.io/WeasyPerf/index.html>`_.

Dependencies:

* Python 3.5+ is now needed, Python 3.4 is not supported anymore

Bug fixes:

* `798 <https://github.com/Kozea/WeasyPrint/pull/798>`_: Prevent endless loop and index out of range in pagination
* `767 <https://github.com/Kozea/WeasyPrint/issues/767>`_: Add a ``--quiet`` CLI parameter
* `784 <https://github.com/Kozea/WeasyPrint/pull/784>`_: Fix library loading on Alpine
* `791 <https://github.com/Kozea/WeasyPrint/pull/791>`_: Use path2url in tests for Windows
* `789 <https://github.com/Kozea/WeasyPrint/pull/789>`_: Add LICENSE file to distributed sources
* `788 <https://github.com/Kozea/WeasyPrint/pull/788>`_: Fix pending references
* `780 <https://github.com/Kozea/WeasyPrint/issues/780>`_: Don't draw patterns for empty page backgrounds
* `774 <https://github.com/Kozea/WeasyPrint/issues/774>`_: Don't crash when links include quotes
* `637 <https://github.com/Kozea/WeasyPrint/issues/637>`_: Fix a problem with justified text
* `763 <https://github.com/Kozea/WeasyPrint/pull/763>`_: Launch tests with Python 3.7
* `704 <https://github.com/Kozea/WeasyPrint/issues/704>`_: Fix a corner case with tables
* `804 <https://github.com/Kozea/WeasyPrint/pull/804>`_: Don't logger handlers defined before importing WeasyPrint
* `109 <https://github.com/Kozea/WeasyPrint/issues/109>`_, `748 <https://github.com/Kozea/WeasyPrint/issues/748>`_: Don't include punctuation for hyphenation
* `770 <https://github.com/Kozea/WeasyPrint/issues/770>`_: Don't crash when people use uppercase words from old-fashioned Microsoft fonts in tables, especially when there's an 5th column
* Use a `separate logger <https://weasyprint.readthedocs.io/en/latest/tutorial.html#logging>`_ to report the rendering process
* Add a ``--debug`` CLI parameter and set debug level for unknown prefixed CSS properties
* Define minimal versions of Python and setuptools in setup.cfg

Documentation:

* `796 <https://github.com/Kozea/WeasyPrint/pull/796>`_: Fix a small typo in the tutorial
* `792 <https://github.com/Kozea/WeasyPrint/pull/792>`_: Document no alignement character support
* `773 <https://github.com/Kozea/WeasyPrint/pull/773>`_: Fix phrasing in Hacking section
* `402 <https://github.com/Kozea/WeasyPrint/issues/402>`_: Add a paragraph about fontconfig error
* `764 <https://github.com/Kozea/WeasyPrint/pull/764>`_: Fix list of dependencies for Alpine
* Fix API documentation of HTML and CSS classes
Normally short words aren't ever split across lines and hyphenated, but if they have adjacent punctuation then WeasyPrint (version 0.19.2) wrongly treats them as though they were a longer word.
We've found “(LST)” being split as “(L-” at the end of one line and “ST)” at the start of the next. Evidence that the parens are being counted as word characters: setting this property avoids “(LST)” being split:
(But obviously that could still hyphenate a 4-letter word with 2 adjacent punctuation marks. And in the general case requires setting hyphenate-limit-chars higher than you wish, thereby also disallowing hyphenating some words without punctuation which you'd wish to allow.)
CSS says to strip punctuation characters between words for counting their characters: http://dev.w3.org/csswg/css-text-4/#hyphenate-char-limits
Pyphen say that punctuation-stripping should be done outside of Pyphen: Kozea/Pyphen#4
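To make the expected behaviour concrete, here is a rough sketch of punctuation stripping done before the character count. It is not WeasyPrint's or Pyphen's code: the function names `strip_adjacent_punctuation` and `may_hyphenate` and the `min_total=5` limit are made up for illustration, and "punctuation" is assumed to mean any character whose Unicode general category starts with "P".

```python
import unicodedata


def strip_adjacent_punctuation(token):
    """Split a token into (leading punctuation, core word, trailing punctuation).

    Hypothetical helper: punctuation is assumed to be any character whose
    Unicode general category starts with "P" (Ps, Pe, Po, Pi, Pf, ...).
    """
    def is_punct(char):
        return unicodedata.category(char).startswith("P")

    # Skip punctuation at the start of the token.
    start = 0
    while start < len(token) and is_punct(token[start]):
        start += 1

    # Skip punctuation at the end, without crossing the start index.
    end = len(token)
    while end > start and is_punct(token[end - 1]):
        end -= 1

    return token[:start], token[start:end], token[end:]


def may_hyphenate(token, min_total=5):
    """Check the word-length limit against the core word only.

    min_total is an assumed value for the first component of
    hyphenate-limit-chars, chosen here only for illustration.
    """
    _, core, _ = strip_adjacent_punctuation(token)
    return len(core) >= min_total


# Counting the raw token reaches the assumed limit of 5 characters,
# while the core word "LST" has only 3 and should be left alone.
print(len("(LST)"), strip_adjacent_punctuation("(LST)"))
print(may_hyphenate("(LST)"), may_hyphenate("(hyphenation)"))
```

With this approach only the core word would be handed to the hyphenation dictionary, and the leading and trailing punctuation would be re-attached afterwards, so the limit in hyphenate-limit-chars would not need to be raised just to protect short words in parentheses.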
Let me know if you'd like a sample document showing this happening.