MD -> LaTeX: `gather` is not properly generated #974

LecrisUT · 2024-03-13T12:08:51Z

Description

Consider the following document:

Some introduction text

\begin{gather}
    E=mc^2
\end{gather}

The issue is that the LaTeX document that it generates becomes:

Some introduction text

\begin{equation}
\begin{gather}
    E=mc^2
\end{gather}
\end{equation}

This should not be surrounded by equation block

Proposed solution

Distinguish which environments need to be within a math environment, e.g. gathered vs gather
Distinguish which environments handle numbering gathered vs gather
Distinguish when we want something to be numbered and not gather vs gather*

Then conditionally add either equation or equation* or nothing to the LaTeX document

The text was updated successfully, but these errors were encountered:

agoose77 · 2024-03-13T12:59:24Z

The document that you are rendering is built in two phases:

Parsing the LaTeX into our MDAST
Writing the MDAST into LaTeX

Looking at the writer, we acknowledge that we don't yet handle AMS math. I did wonder whether we should flag the AST to indicate that the content is ASMMath, but that would probably require annotating the .data field, or adding a container.

So, my guess is we might want to just identify AMSMath from the content of the node. In the writer, this would probably be a simple as testing for \\\\\{\w+\*?\} at the start of the content. Would you be interested in adding this? I could walk you through it if you've not contributed before (but I do recognise your handle, so perhaps you have).

rowanc1 · 2024-03-13T13:07:59Z

There is a TODO in the writer here:
https://github.com/executablebooks/mystjs/blob/main/packages/myst-to-tex/src/math.ts#L50

Depending on how deep this goes, we might need to have an update to the tokenizer:
https://github.com/executablebooks/markdown-it-amsmath

LecrisUT · 2024-03-13T13:25:40Z

a simple as testing for \\\\\{\w+\*?\} It would be better to be more explicit with the \w+ and handle AMSMath specifics, unless we want to pass all \begin/\end environments as is? I am not sure how that would work on the site generation side.

Would you be interested in adding this? I could walk you through it

Sure, some help on this would be good, since this is a show-stopper for trying out mystmd

Depending on how deep this goes, we might need to have an update to the tokenizer:
https://github.com/executablebooks/markdown-it-amsmath

That does seem reasonable. I was thinking for this one, to add another MATH_HANDLER for ams math which just passes it down as is. It would be tricky to workout the label, but maybe we can regex the content and register any \label{*} command we see inside it?

I think the issue here is that \begin{amsmath_env} are implicitly wrapped by :::{math}

agoose77 · 2024-03-13T13:34:48Z

If you open a PR, we can collaborate on it.

It would be better to be more explicit

I think we can assume that the contents of the math node are determined by the tokenizer/parser phase, so if we see an environment of any kind it has already been "validated". As such, we just want to re-use the existing environment rather than wrap in a math environment.

The question will be what to do about the label and enumerated options that the user can set on the AST node via directive options (i.e. for MyST sources, not LaTeX). There may be some merit in parsing the environment and modifying it if the user has e.g. written

:::{math}
:enumerated: true

\\begin{gather*}
...
\\end{gather*}
:::

Because these options are only set if the user explicitly creates a math environment in MyST markdown, and specifies them.

We could alternatively ignore them, but I'm somewhat against that.

LecrisUT · 2024-03-13T13:46:17Z

:::{math}
:enumerated: true

\\begin{gather*}
...
\\end{gather*}
:::

I would say if the user combines math and amsmath it is ill-defined. Do they want the labeling to be as subequations, do they want one to have priority over the other, etc. We should throw an error in those situations and ask them to review it.

Caveat with gathered and friends that are meant to be used inside an already defined math environment. There the intent is well defined and we should just skip it. Should ensure that :::{math} remains a single line math for those though.

If you open a PR, we can collaborate on it.

As soon as I have any commit to open it with :D.

I think we can assume that the contents of the math node are determined by the tokenizer/parser phase

Thanks for the pointer. That would still be in the myst-to-tex? I went through a search of gather and couldn't find anything to tell me where that parsing occurs

agoose77 · 2024-03-13T13:50:54Z

That would still be in the myst-to-tex?

That should be tex-to-myst!

rowanc1 · 2024-03-13T14:00:57Z

For some background/info on a few of the code-paths at play:

When parsing a tex file (no markdown):

tex-to-myst --> AST --> transformations --> myst-to-tex

When parsing tex AMS environment in a markdown file:

markdown-it-amsmath --> myst-parser --> AST --> transformations --> myst-to-tex

I think that in this case you are likely writing in markdown? In which case tex-to-myst is never involved.

For the labels, we parse those out of the equations so that we can do cross-references, etc. in myst. There is also some handling for sub-equations, with each of their own labels. All of this is done in a transformation on the math node in the AST.

LecrisUT · 2024-03-13T14:20:02Z

Hmm, so if markdown-it-amsmath is used, then it is quite a red-herring. It seems that the main point is where does it choose to use the math handler/where does it inject the math block

agoose77 · 2024-03-13T14:43:01Z

@LecrisUT I just had a meeting with @rowanc1 where it became clear I think I misunderstood the document type that you're using; I'd thought you were using LaTeX -> LaTeX, but it seems like you're actually reading MyST -> LaTeX. I'm just working on some other PRs at the moment, but I'll circle back around to this shortly.

agoose77 · 2024-03-13T14:56:16Z

It seems to me that we want to include the environment in the math AST node. As such, we also want to support the environment in the math directive e.g.

:::{math}
\begin{gather}
...
\end{gather}
:::

We already strip out the "basic" environments in https://github.com/executablebooks/mystmd/blob/974dde179c40ff57908b55925baf9212455a16bf/packages/myst-transforms/src/math.ts#L251

I believe we should leave the AMSMath environments because there's no strong push factor to use an alternative. We could lift information about the environment to a new attribute of the math node, but I don't think there's a strong benefit, and presently it would probably complicate things more. It's really a question of "what is math".

My proposal is do the simple thing now, and we can "fix" it in future if we realise it was the wrong call.

LecrisUT · 2024-03-13T15:13:58Z

Hmm, what about for a temporary fix, inside
https://github.com/executablebooks/mystmd/blob/974dde179c40ff57908b55925baf9212455a16bf/packages/myst-to-tex/src/math.ts#L49-L54
We add a quick check for \\begin\{(gather\*?|align\*?)\} and if that passes, we simply print the content as it?

For a more proper call, the node should tell us that it is a asmmath type and what environment it's from, but I'm not sure where that would have to go

agoose77 · 2024-03-13T15:18:59Z

Yes, there are two questions here:

Short-term fix?
Long-term changes?

I think we should look to fix this as-is, and discuss longer-term AST changes in a new issue :)

I think we should try and strip the ASMMath environment so that we can add labels. This makes things more complicated, but is the "right" thing to do imo.

LecrisUT · 2024-03-13T15:21:48Z

I think we should try and strip the ASMMath environment so that we can add labels. This makes things more complicated, but is the "right" thing to do imo.

Wait, I thought the opposite, i.e. strip the equation wrapper when asmmath is detected, and give an error if label is added there just in case. The issue is equation is always a single-line math environment, while gather/align are multi-line

agoose77 · 2024-03-13T15:28:02Z

The issue is equation is always a single-line math environment, while gather/align are multi-line

That's a good point! Hmm. Maybe the proper solution is then to properly parse AMSMath environments into multiple math nodes. But, that's a complex fix. So, for now, let's just pass-through the environment as-is, as you suggest!

rowanc1 · 2024-03-13T17:15:29Z

This note is a bit off-topic (in that we should pursue the simple approach suggested in the PR), however, I want to explain some of the previous thinking/work we are trying to put in this direction.

There are some changes we made to parsing latex subequations and for align environments to allow these to be picked up and translated to both JATS (used for archiving scientific documents) and the HTML view with basic rendering and cross-references. The challenge in representing this in MyST (and in JATS) is that those environments need to be in charge of the equation labelling and numbering (with specific nodes/attributes for the cross references), and this fights with latex (or at least needs to share understanding). There is actually no way to do a multi-line, labeled subequations in JATS for example, and the way people do it with an disp-formula-group completely destroys the alignment that is the goal of the align and gather groups in the first place. So we are stuck somewhere between this:

and this:

The way that I am thinking that we get around that is actually by holding both of these variants in the tree, and we introduced a mathGroup (similar to a citeGroup) that has the raw latex to be rendered as well as the children nodes. Right now that mathGroup only renders the children/numbers, but we should update the UI to show the original with the labels they will need to be generated by katex rather than myst (then updating the enumerators with a post html transform or css perhaps to match the numbering). A start in that direction which works from tex-->jats is shown in a test here. I think the combination of these with the fixes suggested above should allow us to fully support both the semantics and the visual-alignment.

LecrisUT · 2024-03-13T17:55:05Z

Could mathGroup behave like admonitions in that on the markdown side it's a container like the \begin{} environments that we can pass specific class: align? Something like

:::{mathGroup}
:class: align

x + y & \geq 0

x - y & \leq 4

x \times y & \geq 1

x^y + y^x &\geq 1
:::

would generate

\begin{align}
x + y & \geq 0 \\
x - y & \leq 4 \\
x \times y & \geq 1 \\
x^y + y^x &\geq 1
\end{align}

Not sure how to deal with label in that case either unless we just allow them to be defined in there as well and just parse the contents of it from TeX and back.

rowanc1 · 2024-03-15T16:44:14Z

I moved my comment to #992 to continue the conversation as #977 will close this issue.

LecrisUT added the bug Something isn't working label Mar 13, 2024

LecrisUT mentioned this issue Mar 13, 2024

Cannot pip install editable #975

Closed

LecrisUT mentioned this issue Mar 13, 2024

🧮 Fix AMSMath environment for LaTeX #977

Merged

LecrisUT changed the title ~~LaTeX: gather is not properly generated~~ MD -> LaTeX: gather is not properly generated Mar 13, 2024

rowanc1 closed this as completed in #977 Mar 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MD -> LaTeX: `gather` is not properly generated #974

MD -> LaTeX: `gather` is not properly generated #974

LecrisUT commented Mar 13, 2024

agoose77 commented Mar 13, 2024

rowanc1 commented Mar 13, 2024

LecrisUT commented Mar 13, 2024 •

edited

Loading

agoose77 commented Mar 13, 2024 •

edited

Loading

LecrisUT commented Mar 13, 2024

agoose77 commented Mar 13, 2024

rowanc1 commented Mar 13, 2024

LecrisUT commented Mar 13, 2024

agoose77 commented Mar 13, 2024

agoose77 commented Mar 13, 2024

LecrisUT commented Mar 13, 2024 •

edited

Loading

agoose77 commented Mar 13, 2024 •

edited

Loading

LecrisUT commented Mar 13, 2024

agoose77 commented Mar 13, 2024

rowanc1 commented Mar 13, 2024

LecrisUT commented Mar 13, 2024

rowanc1 commented Mar 15, 2024

MD -> LaTeX: gather is not properly generated #974

MD -> LaTeX: gather is not properly generated #974

Comments

LecrisUT commented Mar 13, 2024

Description

Proposed solution

agoose77 commented Mar 13, 2024

rowanc1 commented Mar 13, 2024

LecrisUT commented Mar 13, 2024 • edited Loading

agoose77 commented Mar 13, 2024 • edited Loading

LecrisUT commented Mar 13, 2024

agoose77 commented Mar 13, 2024

rowanc1 commented Mar 13, 2024

LecrisUT commented Mar 13, 2024

agoose77 commented Mar 13, 2024

agoose77 commented Mar 13, 2024

LecrisUT commented Mar 13, 2024 • edited Loading

agoose77 commented Mar 13, 2024 • edited Loading

LecrisUT commented Mar 13, 2024

agoose77 commented Mar 13, 2024

rowanc1 commented Mar 13, 2024

LecrisUT commented Mar 13, 2024

rowanc1 commented Mar 15, 2024

MD -> LaTeX: `gather` is not properly generated #974

MD -> LaTeX: `gather` is not properly generated #974

LecrisUT commented Mar 13, 2024 •

edited

Loading

agoose77 commented Mar 13, 2024 •

edited

Loading

LecrisUT commented Mar 13, 2024 •

edited

Loading

agoose77 commented Mar 13, 2024 •

edited

Loading