Generalized the Codebraid support to MkDocs #154

Open: reenberg wants to merge 1 commit into base: main

@reenberg reenberg commented Dec 22, 2023

In #57, support was added for Codebraid syntax, which is essentially Pandoc attribute syntax with a specific class attribute added.

That support was implemented as an extra `identifier` in the list for each language that Codebraid supports, for example `\\{\\.python.+?\\}` for Python.

With that approach, the example below gets the scope "text.html.markdown markup.fenced_code.block.markdown fenced_code.block.language.markdown" applied to the entire line:

```{.python .cb.nb jupyter_kernel=python3}
```

However, the "language scope" should only be applied to the "python" part. The current support also doesn't allow spaces inside the curly braces, and it doesn't cover all languages.
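To make the problem concrete, here is a minimal sketch in JavaScript. It assumes a simplified version of the old `begin` pattern: the name `oldBegin` and the two-entry identifier list are illustrative, and the sketch runs on JavaScript's regex engine rather than on the Oniguruma engine the grammar actually targets.

```javascript
// Simplified stand-in for the old per-language identifier list (assumption:
// the real list for Python is longer; only the Codebraid entry matters here).
const identifiers = ['python', '\\{\\.python.+?\\}'];

// Simplified form of the old `begin` pattern: fence, optional whitespace,
// one identifier captured as the "language", then an optional tail.
const oldBegin = new RegExp(
  '^(\\s*)(`{3,}|~{3,})\\s*((?:' + identifiers.join('|') + '))((?:\\s+|:|,|\\{|\\?)[^`]*)?$',
  'i'
);

// Group 3 is the capture that would receive the language scope.
const match = oldBegin.exec('```{.python .cb.nb jupyter_kernel=python3}');
const languageCapture = match[3];
console.log(languageCapture); // {.python .cb.nb jupyter_kernel=python3}

// A space after the opening brace defeats the old pattern entirely.
const spaced = oldBegin.exec('``` { .python #id .class title="My Title"}');
console.log(spaced); // null
```

The language capture swallows the braces and every attribute, which is why the whole line ends up with the language scope.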

MkDocs allows a few ways to annotate fenced code blocks. If additional classes, an id, or key/value pairs are used, the curly braces are required and the language must be prefixed with a dot. In simple cases where only the language is specified, the curly braces and the dot may be omitted. Two quick examples:

``` { .python #id .class title="My Title"}
```

or

``` python
```

This change removes the Codebraid support from the per-language `identifier` lists and moves it into the regex, defined as two alternative cases: attributes wrapped in curly braces, or attributes following the language:

  1. The case where the entire line after the code fence is wrapped in curly braces. In this case the curly braces are not part of the language and attribute scopes.
  2. The case where the attributes follow the language specification in all sorts of ways (I'm specifically thinking of you, Gatsby: #62, "Adding attributes to fenced markdown code blocks breaks syntax highlighting"). In this case the curly braces are included in the attribute scope, since it is not trivial to handle all the various ways they may be used, and since this is the current behavior.
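
The two cases can be sketched as follows. This is a simplified illustration in plain JavaScript regex syntax, not the pattern from this PR: the grammar itself targets the Oniguruma engine, and the function name `parseFenceInfo`, the group names, the three-entry language list, and the last sample input are all assumptions made for the example.

```javascript
// Hypothetical short language list; the real grammar generates one per language.
const languages = ['python', 'rust', 'r'];
const lang = '(?:' + languages.join('|') + ')';

// Case 1: the whole info string is wrapped in curly braces. The braces sit
// outside the language and attribute captures.
const braced =
  '\\{\\s*\\.?(?<bracedLang>' + lang + ')(?:\\s+(?<bracedAttrs>[^`\\r\\n]*?))?\\s*\\}';

// Case 2: attributes follow the language. Whatever follows, braces included,
// lands in the attribute capture.
const trailing =
  '\\.?(?<lang>' + lang + ')(?<attrs>(?:\\s+|:|,|\\{|\\?)[^`\\r\\n]*?)?';

const fence = new RegExp(
  '^\\s*(?:`{3,}|~{3,})\\s*(?:' + braced + '|' + trailing + ')\\s*$',
  'i'
);

function parseFenceInfo(line) {
  const m = fence.exec(line);
  if (!m) return null;
  const g = m.groups;
  return g.bracedLang !== undefined
    ? { language: g.bracedLang, attributes: g.bracedAttrs ?? '' }
    : { language: g.lang, attributes: g.attrs ?? '' };
}

// Case 1: braces excluded from both captures.
const mkdocs = parseFenceInfo('``` { .python #id .class title="My Title"}');
// Simple form: language only.
const simple = parseFenceInfo('``` python');
// Case 2 (made-up input): braces stay inside the attribute capture.
const after = parseFenceInfo('```python{highlight: true}');
console.log(mkdocs, simple, after);
```

In the first case only `python` is reported as the language and the braces are dropped; in the last case the attribute capture keeps the braces, mirroring the behavior described in item 2.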

@microsoft-github-policy-service agree

Closes #153
Refs: https://github.com/Python-Markdown/markdown/blob/master/docs/extensions/fenced_code_blocks.md

@reenberg
Author

@microsoft-github-policy-service agree

@reenberg
Author

It seems this PR also has the side effect of fixing the broken Codebraid support for Rust, which was mistakenly matched as R code. That fix accounts for most of the changes in the file pr-57_md.json. The rest are basically updates for fenced blocks whose language and attributes were previously not scoped correctly.
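
A minimal sketch of how such a mismatch can arise, assuming R's old Codebraid identifier had the same shape as the Python one quoted in the description (`\\{\\.r.+?\\}`). That exact entry is an assumption, not taken from the grammar source, and the names `oldRIdentifier` and `strictR` are illustrative.

```javascript
// Assumed old-style Codebraid identifier for R, modelled on the Python one
// quoted in the PR description. Hypothetical, for illustration only.
const oldRIdentifier = new RegExp('^\\{\\.r.+?\\}$', 'i');

// The lazy `.+?` after `r` consumes "ust .cb.nb", so a Rust block matches as R.
const rustInfo = '{.rust .cb.nb}';
const matchesAsR = oldRIdentifier.test(rustInfo);
console.log(matchesAsR); // true

// Requiring whitespace or the closing brace right after the language name
// avoids the mix-up.
const strictR = new RegExp('^\\{\\s*\\.?r(?:\\s+[^`\\r\\n]*?)?\\s*\\}$', 'i');
const rustStillMatches = strictR.test(rustInfo);
const rMatches = strictR.test('{.r .cb.nb}');
console.log(rustStillMatches, rMatches); // false true
```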

@reenberg
Author

@mjbvz, @alexdima, any chance this can receive a review and be moved along?

@mjbvz mjbvz added this to the March 2024 milestone Feb 21, 2024
@rzhao271

Moving the milestone
Also, there are merge conflicts

@rzhao271 rzhao271 modified the milestones: March 2024, April 2024 Mar 28, 2024
@lramos15 lramos15 modified the milestones: April 2024, Backlog Apr 26, 2024
@@ -86,7 +86,7 @@ const fencedCodeBlockDefinition = (name, identifiers, sourceScope, language, add

   return `fenced_code_block_${name}:
   begin:
-    (^|\\G)(\\s*)(\`{3,}|~{3,})\\s*(?i:(${identifiers.join('|')})((\\s+|:|,|\\{|\\?)[^\`]*)?$)
+    (^|\\G)(\\s*)([\`~]{3,})\\s*(?i:(?:\\{\\s*\\.?(${identifiers.join('|')})(?:\\}|\\s+([^\`\\r\\n]*?)?\\s*\\}))|(?:\\.?(\\g<4>)((?:\\s+|:|,|\\{|\\?)[^\`\\r\\n]*?)?))$
Contributor

Can we break this up into a multiline regular expression? It was already a bit long, but now it's unreadable.

Author

I'll see if I can find time to set up another dev environment to properly test the changes this will involve.

@reenberg (Author) commented Jun 4, 2024

> Moving the milestone
> Also, there are merge conflicts

There would have been no conflicts if this PR had not been ignored until well after an MS employee merged other changes (#158), ignoring all other PRs!

I will try to find the time, but it has been about 5 months since I invested time in understanding this set of fairly complex expressions, not to mention setting up a dev environment to test this in order to implement the suggested changes.
