-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEV: Fix changelog for UTF-8 characters #2462
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2462 +/- ##
=======================================
Coverage 94.44% 94.44%
=======================================
Files 49 49
Lines 8027 8027
Branches 1618 1618
=======================================
Hits 7581 7581
Misses 276 276
Partials 170 170 ☔ View full report in Codecov by Sentry. |
I have added some basic test as well which covers the UTF-8 issue as well as the username issues from #2246. To simplify testing, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work Stefan! You're bringing our tooling on a whole new level 🎉
Approved - feel free to merge whenever you want :-) |
I just do not like when changelogs have easy to fix issues ;) And trying to test even small tooling is something I consider essential - especially as these tools should "just work" and are more likely to not receive much love from the outside. |
## What's new Generating name objects (`NameObject`) without a leading slash is considered deprecated now. Previously, just a plain warning would be logged, leading to possibly invalid PDF files. According to our deprecation policy, this will log a *DeprecationWarning* for now. ### New Features (ENH) - Add get_pages_from_field (#2494) by @pubpub-zz - Add reattach_fields function (#2480) by @pubpub-zz - Automatic access to pointed object for IndirectObject (#2464) by @pubpub-zz ### Bug Fixes (BUG) - Missing error on name without leading / (#2387) by @Rak424 - encode_pdfdocencoding() always returns bytes (#2440) by @sbourlon - BI in text content identified as image tag (#2459) by @pubpub-zz ### Robustness (ROB) - Missing basefont entry in type 3 font (#2469) by @pubpub-zz ### Documentation (DOC) - Improve lossless compression example (#2488) by @j-t-1 - Amend robustness documentation (#2479) by @j-t-1 ### Developer Experience (DEV) - Fix changelog for UTF-8 characters (#2462) by @stefan6419846 ### Maintenance (MAINT) - Add _get_page_number_from_indirect in writer (#2493) by @pubpub-zz - Remove user assignment for feature requests (#2483) by @stefan6419846 - Remove reference to old 2.0.0 branch (#2482) by @stefan6419846 ### Testing (TST) - Fix benchmark failures (#2481) by @stefan6419846 - Broken test due to expired test file URL (#2468) by @pubpub-zz - Resolve file naming conflict in test_iss1767 (#2445) by @sbourlon [Full Changelog](4.0.2...4.1.0)
## What's new Generating name objects (`NameObject`) without a leading slash is considered deprecated now. Previously, just a plain warning would be logged, leading to possibly invalid PDF files. According to our deprecation policy, this will log a *DeprecationWarning* for now. ### New Features (ENH) - Add get_pages_from_field (#2494) by @pubpub-zz - Add reattach_fields function (#2480) by @pubpub-zz - Automatic access to pointed object for IndirectObject (#2464) by @pubpub-zz ### Bug Fixes (BUG) - Missing error on name without leading / (#2387) by @Rak424 - encode_pdfdocencoding() always returns bytes (#2440) by @sbourlon - BI in text content identified as image tag (#2459) by @pubpub-zz ### Robustness (ROB) - Missing basefont entry in type 3 font (#2469) by @pubpub-zz ### Documentation (DOC) - Improve lossless compression example (#2488) by @j-t-1 - Amend robustness documentation (#2479) by @j-t-1 ### Developer Experience (DEV) - Fix changelog for UTF-8 characters (#2462) by @stefan6419846 ### Maintenance (MAINT) - Add _get_page_number_from_indirect in writer (#2493) by @pubpub-zz - Remove user assignment for feature requests (#2483) by @stefan6419846 - Remove reference to old 2.0.0 branch (#2482) by @stefan6419846 ### Testing (TST) - Fix benchmark failures (#2481) by @stefan6419846 - Broken test due to expired test file URL (#2468) by @pubpub-zz - Resolve file naming conflict in test_iss1767 (#2445) by @sbourlon [Full Changelog](4.0.2...4.1.0)
make_release.py
did not correctly handle UTF-8 characters before, for example→
, which was rendered as\xe2\x86\x92
: cc306ad and https://github.com/py-pdf/pypdf/releases/tag/4.0.2The reason is that
str(b"abc")
yields the string representation"b'abc'"
of the bytes type instead of the actual string value we want here. Properly decoding the bytes usingb"abc".decode()
does indeed yield the correct result"abc"
.The
lines
variable beforeand after:
Especially have a look at the entry for #2426, which now renders correctly.