DEV: Fix changelog for UTF-8 characters #2462

stefan6419846 · 2024-02-19T11:57:54Z

make_release.py did not previously handle UTF-8 characters correctly. For example, → was rendered as \xe2\x86\x92; see cc306ad and https://github.com/py-pdf/pypdf/releases/tag/4.0.2

The reason is that str(b"abc") yields the string representation "b'abc'" of the bytes type instead of the actual string value we want here. Properly decoding the bytes using b"abc".decode() does indeed yield the correct result "abc".
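A minimal sketch of the difference, using the commit subject from #2426 as input. The variable names are illustrative only and are not taken from make_release.py:

```python
# A commit subject as raw UTF-8 bytes, the way a subprocess would return it.
raw = "Typo `Polyline` → `PolyLine`".encode("utf-8")

# str() on a bytes object returns its repr: the b'' wrapper plus \x escapes.
as_repr = str(raw)

# decode() interprets the bytes as UTF-8 and returns the actual text.
as_text = raw.decode("utf-8")

print(as_repr)
print(as_text)
```

The first print shows the escaped form seen in the 4.0.2 changelog, the second shows the arrow as intended.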

The lines variable before the change:

['"cc306ad6abfb232f6922a7f0e939831d6611d0b7:::REL: 4.0.2:::Martin Thoma"', '"b7bfd0d7eddfd0865a94cc9e7027df6596242cf7:::BUG: Use NumberObject for /Border elements of annotations (#2451):::rsinger417"', '"8cacb0fc8fee9920b0515d1289e6ee8191eb3f21:::DOC: Document easier way to update metadata (#2454):::Stefan"', '"3fb63f7e3839ce39ac98978c996f3086ba230a20:::TST: Avoid catching not emitted warnings (#2429):::Stefan"', '"61b73d49778e8f0fb172d5323e67677c9974e420:::DOC: Typo `Polyline` \\xe2\\x86\\x92 `PolyLine` in adding-pdf-annotations.md (#2426):::CWKSC"', '"f851a532a5ec23b572d86bd7185b327a3fac6b58:::DEV: Bump codecov/codecov-action from 3 to 4 (#2430):::dependabot[bot]"']

and after:

['"cc306ad6abfb232f6922a7f0e939831d6611d0b7:::REL: 4.0.2:::Martin Thoma"', '"b7bfd0d7eddfd0865a94cc9e7027df6596242cf7:::BUG: Use NumberObject for /Border elements of annotations (#2451):::rsinger417"', '"8cacb0fc8fee9920b0515d1289e6ee8191eb3f21:::DOC: Document easier way to update metadata (#2454):::Stefan"', '"3fb63f7e3839ce39ac98978c996f3086ba230a20:::TST: Avoid catching not emitted warnings (#2429):::Stefan"', '"61b73d49778e8f0fb172d5323e67677c9974e420:::DOC: Typo `Polyline` → `PolyLine` in adding-pdf-annotations.md (#2426):::CWKSC"', '"f851a532a5ec23b572d86bd7185b327a3fac6b58:::DEV: Bump codecov/codecov-action from 3 to 4 (#2430):::dependabot[bot]"']

Note in particular the entry for #2426, which now renders correctly.
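As a rough illustration of how entries in this `hash:::message:::author` shape could be produced from decoded output, here is a sketch. The function `parse_log_output` is hypothetical and is not the actual code in make_release.py; it only assumes that the raw output is UTF-8 bytes with one quoted entry per line:

```python
from typing import List, Tuple


def parse_log_output(raw: bytes) -> List[Tuple[str, str, str]]:
    """Decode raw log bytes and split each line into (hash, message, author).

    Hypothetical helper for illustration; assumes ':::' separates the fields.
    """
    entries = []
    for line in raw.decode("utf-8").splitlines():
        # Drop surrounding whitespace and the literal double quotes.
        line = line.strip().strip('"')
        if not line:
            continue
        commit_hash, message, author = line.split(":::", 2)
        entries.append((commit_hash, message, author))
    return entries


# Two of the entries shown above, encoded the way a subprocess would return them.
sample = (
    '"61b73d49778e8f0fb172d5323e67677c9974e420:::DOC: Typo `Polyline` → `PolyLine` '
    'in adding-pdf-annotations.md (#2426):::CWKSC"\n'
    '"f851a532a5ec23b572d86bd7185b327a3fac6b58:::DEV: Bump codecov/codecov-action '
    'from 3 to 4 (#2430):::dependabot[bot]"'
).encode("utf-8")

entries = parse_log_output(sample)
for commit_hash, message, author in entries:
    print(commit_hash[:7], message, author)
```

Decoding once, before any splitting, keeps non-ASCII characters such as → intact in every field.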

codecov · 2024-02-19T12:04:13Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.44%. Comparing base (af36667) to head (14c1606).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2462   +/-   ##
=======================================
  Coverage   94.44%   94.44%           
=======================================
  Files          49       49           
  Lines        8027     8027           
  Branches     1618     1618           
=======================================
  Hits         7581     7581           
  Misses        276      276           
  Partials      170      170


stefan6419846 · 2024-02-26T15:56:20Z

I have also added a basic test that covers the UTF-8 issue as well as the username issues from #2246. To simplify testing, the rich import has been made internal, as I did not yet feel like updating the requirements files just for this ;)

MartinThoma

Very nice work, Stefan! You're bringing our tooling to a whole new level 🎉

MartinThoma · 2024-02-26T16:04:22Z

Approved - feel free to merge whenever you want :-)

stefan6419846 · 2024-02-26T16:08:39Z

I just do not like it when changelogs have easy-to-fix issues ;) And trying to test even small tooling is something I consider essential, especially as these tools should "just work" and are less likely to receive much love from the outside.

@pubpub-zz

## What's new

Generating name objects (`NameObject`) without a leading slash is considered deprecated now. Previously, just a plain warning would be logged, leading to possibly invalid PDF files. According to our deprecation policy, this will log a *DeprecationWarning* for now.

### New Features (ENH)
- Add get_pages_from_field (#2494) by @pubpub-zz
- Add reattach_fields function (#2480) by @pubpub-zz
- Automatic access to pointed object for IndirectObject (#2464) by @pubpub-zz

### Bug Fixes (BUG)
- Missing error on name without leading / (#2387) by @Rak424
- encode_pdfdocencoding() always returns bytes (#2440) by @sbourlon
- BI in text content identified as image tag (#2459) by @pubpub-zz

### Robustness (ROB)
- Missing basefont entry in type 3 font (#2469) by @pubpub-zz

### Documentation (DOC)
- Improve lossless compression example (#2488) by @j-t-1
- Amend robustness documentation (#2479) by @j-t-1

### Developer Experience (DEV)
- Fix changelog for UTF-8 characters (#2462) by @stefan6419846

### Maintenance (MAINT)
- Add _get_page_number_from_indirect in writer (#2493) by @pubpub-zz
- Remove user assignment for feature requests (#2483) by @stefan6419846
- Remove reference to old 2.0.0 branch (#2482) by @stefan6419846

### Testing (TST)
- Fix benchmark failures (#2481) by @stefan6419846
- Broken test due to expired test file URL (#2468) by @pubpub-zz
- Resolve file naming conflict in test_iss1767 (#2445) by @sbourlon

[Full Changelog](4.0.2...4.1.0)

DEV: Fix changelog for UTF-8 characters

dc772b8

stefan6419846 added the nf-packaging (Non-functional change: Packaging and distribution) label on Feb 19, 2024

stefan6419846 mentioned this pull request Feb 19, 2024

DEV: Testing for make_release.py #2463

Closed

stefan6419846 added 2 commits February 26, 2024 13:20

Merge branch 'main' into changelog

1da04d3

add tests

07ebde0

stefan6419846 linked an issue Feb 26, 2024 that may be closed by this pull request

DEV: Testing for make_release.py #2463

Closed

fix for old Python versions

14c1606

MartinThoma approved these changes Feb 26, 2024

stefan6419846 merged commit 2b3051b into py-pdf:main Feb 26, 2024
15 checks passed

stefan6419846 deleted the changelog branch February 26, 2024 16:08
