♻️ REFACTOR: Replace character codes with strings #270
Merged
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## master #270 +/- ##
==========================================
- Coverage 95.68% 95.25% -0.43%
==========================================
Files 62 62
Lines 3315 3333 +18
==========================================
+ Hits 3172 3175 +3
- Misses 143 158 +15
chrisjsewell added a commit to executablebooks/mdit-py-plugins that referenced this pull request Jun 1, 2023
executablebooks/markdown-it-py#270 deprecates `srcCharCode` and makes it immutable.
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Jun 6, 2023
## 3.0.0 - 2023-06-03

⚠️ This release contains some minor breaking changes in the internal API and improvements to the parsing strictness.

**Full Changelog**: <executablebooks/markdown-it-py@v2.2.0...v3.0.0>

### ⬆️ UPGRADE: Drop support for Python 3.7

Also add testing for Python 3.11

### ⬆️ UPGRADE: Update from upstream markdown-it `12.2.0` to `13.0.0`

A key change is the addition of a new `Token` type, `text_special`, which is used to represent HTML entities and backslash escaped characters.
This ensures that (core) typographic transformation rules are not incorrectly applied to these texts.
The final core rule is now the new `text_join` rule, which joins adjacent `text`/`text_special` tokens, so no `text_special` tokens should be present in the final token stream.
Any custom typographic rules should be inserted before `text_join`.

A new `linkify` rule has also been added to the inline chain, which will linkify full URLs (e.g. `https://example.com`), and fixes collision of emphasis and linkifier (so `http://example.org/foo._bar_-_baz` is now a single link, not emphasized).
Emails and fuzzy links are not affected by this.

* ♻️ Refactor backslash escape logic, add `text_special` [#276](executablebooks/markdown-it-py#276)
* ♻️ Parse entities to `text_special` token [#280](executablebooks/markdown-it-py#280)
* ♻️ Refactor: Add linkifier rule to inline chain for full links [#279](executablebooks/markdown-it-py#279)
* ‼️ Remove `(p)` => `§` replacement in typographer [#281](executablebooks/markdown-it-py#281)
* ‼️ Remove unused `silent` arg in `ParserBlock.tokenize` [#284](executablebooks/markdown-it-py#284)
* 🐛 FIX: numeric character reference passing [#272](executablebooks/markdown-it-py#272)
* 🐛 Fix: tab preventing paragraph continuation in lists [#274](executablebooks/markdown-it-py#274)
* 👌 Improve nested emphasis parsing [#273](executablebooks/markdown-it-py#273)
* 👌 fix possible ReDOS in newline rule [#275](executablebooks/markdown-it-py#275)
* 👌 Improve performance of `skipSpaces`/`skipChars` [#271](executablebooks/markdown-it-py#271)
* 👌 Show text of `text_special` in `tree.pretty` [#282](executablebooks/markdown-it-py#282)

### ♻️ REFACTOR: Replace most character code use with strings

The use of `StateBase.srcCharCode` is deprecated (with backward-compatibility), and all core uses are replaced by `StateBase.src`.

Conversion of source string characters to an integer representing the Unicode character is prevalent in the upstream JavaScript implementation, to improve performance.
However, it is unnecessary in Python and leads to harder-to-read code and performance degradation (during the conversion in the `StateBase` initialisation).

See [#270](executablebooks/markdown-it-py#270), thanks to [@hukkinj1](https://github.com/hukkinj1).

### ♻️ Centralise indented code block tests

For CommonMark, the presence of indented code blocks prevents any other block element from having an indent of 4 or more spaces.
Certain Markdown flavors and derivatives, such as mdx and djot, disable these code blocks though, since it is more common to use code fences and/or arbitrary indenting is desirable.
Previously, disabling code blocks did not remove the indent limitation, since most block elements had the 3 space limitation hard-coded.
This change centralises the logic of applying this limitation (in `StateBlock.is_code_block`), and only applies it when indented code blocks are enabled.

This allows for e.g.

```md
<div>
  <div>

    I can indent as much as I want here.

  <div>
<div>
```

See [#260](executablebooks/markdown-it-py#260)

### 🔧 Maintenance changes

Strict type annotation checking has been applied to the whole code base, [ruff](https://github.com/charliermarsh/ruff) is now used for linting, and fuzzing tests have been added to the CI, to integrate with Google [OSS-Fuzz](https://github.com/google/oss-fuzz/tree/master/projects/markdown-it-py) testing, thanks to [@DavidKorczynski](https://github.com/DavidKorczynski).

* 🔧 MAINTAIN: Make type checking strict [#](executablebooks/markdown-it-py#267)
* 🔧 Add typing of rule functions [#283](executablebooks/markdown-it-py#283)
* 🔧 Move linting from flake8 to ruff [#268](executablebooks/markdown-it-py#268)
* 🧪 CI: Add fuzzing workflow for PRs [#262](executablebooks/markdown-it-py#262)
* 🔧 Add tox env for fuzz testcase run [#263](executablebooks/markdown-it-py#263)
* 🧪 Add OSS-Fuzz set up by @DavidKorczynski in [#255](executablebooks/markdown-it-py#255)
* 🧪 Fix fuzzing test failures [#254](executablebooks/markdown-it-py#254)
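
The centralised indent check described in the changelog above can be illustrated with a minimal sketch. Only the method name `is_code_block` comes from the changelog; the class `SketchBlockState`, its fields `indents`, `block_indent` and `code_enabled`, and the method body are assumptions made for illustration, not the library's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchBlockState:
    """Hypothetical stand-in for a block parser state (not the library class)."""

    indents: List[int]          # indent, in columns, of each source line
    block_indent: int = 0       # indent required by the enclosing block
    code_enabled: bool = True   # whether indented code blocks are enabled

    def is_code_block(self, line: int) -> bool:
        """Return True if `line` is indented far enough to be indented code.

        The limit only applies while indented code blocks are enabled,
        so disabling them also lifts the indent restriction.
        """
        return self.code_enabled and (self.indents[line] - self.block_indent) >= 4


# With code blocks enabled, a line indented by 8 columns is treated as code.
strict = SketchBlockState(indents=[0, 2, 8])
print(strict.is_code_block(2))

# With code blocks disabled, the same line is left for other block rules.
relaxed = SketchBlockState(indents=[0, 2, 8], code_enabled=False)
print(relaxed.is_code_block(2))
```

Keeping the test in one method means each block rule can call it instead of hard-coding its own 3-space comparison, which is the motivation the changelog gives.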
The use of `StateBase.srcCharCode` is deprecated (with backward-compatibility), and all core uses are replaced by `StateBase.src`.

Conversion of source string characters to an integer representing the Unicode character is prevalent in the upstream JavaScript implementation, to improve performance.
However, it is unnecessary in Python and leads to harder-to-read code and performance degradation (during the conversion in the `StateBase` initialisation).

`StateBase.srcCharCode` is no longer populated on initialisation, but is left as an on-demand, cached property, to allow backward compatibility for plugins (deprecation warnings are emitted to identify where updates are required).

`isStrSpace` is supplied as a replacement for `isSpace`, and similarly `StateBlock.skipCharsStr`/`StateBlock.skipCharsStrBack` replace `StateBlock.skipChars`/`StateBlock.skipCharsBack`.

Co-authored-by: Taneli Hukkinen
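
As a rough illustration of the pattern described above, the following sketch keeps only the source string up front and computes the character codes lazily, emitting a deprecation warning on first access. The class `SketchState`, the helpers `is_str_space` and `skip_chars_str`, and the warning text are hypothetical names written for this example; they mirror the idea behind `srcCharCode`, `isStrSpace` and `skipCharsStr` but are not the library's code.

```python
import warnings
from functools import cached_property


def is_str_space(ch: str) -> bool:
    """String-based check: True for a tab or a space character."""
    return ch in ("\t", " ")


class SketchState:
    """Hypothetical stand-in for a parser state (not the library class)."""

    def __init__(self, src: str) -> None:
        # Only the string is stored; no per-character conversion happens here.
        self.src = src

    @cached_property
    def srcCharCode(self) -> tuple:
        # Computed on first access only, then cached on the instance.
        warnings.warn(
            "srcCharCode is deprecated, use src instead",
            DeprecationWarning,
        )
        return tuple(ord(c) for c in self.src)

    def skip_chars_str(self, pos: int, ch: str) -> int:
        """Return the first position at or after `pos` whose character is not `ch`."""
        while pos < len(self.src) and self.src[pos] == ch:
            pos += 1
        return pos


state = SketchState("  - item")
start = state.skip_chars_str(0, " ")      # compares characters, not integers
print(start, is_str_space(state.src[0]))
```

Code that never touches `srcCharCode` pays no conversion cost, while older plugin code that still reads it keeps working and is pointed at the replacement by the warning.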