Too much space around / #260

Open
physikerwelt opened this issue Sep 30, 2024 · 15 comments
Labels
bug Something isn't working MathML Core

Comments

@physikerwelt
Member

As feedback on the MathML rendering in Wikipedia, it was suggested to replace <mo>/</mo> with <mo lspace="0" rspace="0">/</mo>. I wonder whether this is specific to Wikipedia or a general problem.

Generalization: I think it should be better explained why the operator spacing was chosen.

Related: #141

physikerwelt added the bug and MathML Core labels on Sep 30, 2024
physikerwelt self-assigned this on Sep 30, 2024
@davidcarlisle
Collaborator

In MathML 1, 2 and 3, / had lspace = rspace = 1:

https://www.w3.org/TR/MathML3/appendixc.html#oper-dict.entries-table

In 2020 this was changed to 4 in the drafts for MathML 4, in the following commit:

w3c/xml-entities@13d9f9f#diff-a0f0456c0fa161e1a8551ae929924a4c29b155b709947cae6dc079b07d377b0eL953

That value is also in the draft mathml-core and is implemented in current browsers.

TeX assigns 0 spacing to /, and all versions of MathML assign lspace=rspace=0 to the less used but symmetric \ operator (U+005C).

The assignment of spacing 4 to / was part of a general sweep correcting and normalising spacing, and 4 is the default for most infix operators. However, / is traditionally set close: 1/2 as opposed to 1 + 2. As far as we can tell, / was not considered in detail, and its assignment here may not have been fully intended.

In many ways the operator dictionary values are always somewhat arbitrary defaults and generators need to use the lspace and rspace attributes when generating if they need specific spacing, so just accepting this in this case and closing with no action is a viable option, but...

@fred-wang would it be at all feasible to change this to lspace=rspace=1 to match previous releases (and possibly do the same for \)? Or is changing the spacing deployed in browsers too problematic at this stage?
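
For generators that want the tight spacing now, the explicit-attribute override mentioned above could be applied as a post-processing step. What follows is only a minimal Python sketch, not code from any existing converter: the function name `tighten_solidus` is hypothetical, and it assumes the input is well-formed XML in the MathML namespace.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
ET.register_namespace("", MATHML_NS)  # serialize without an ns0: prefix


def tighten_solidus(mathml: str) -> str:
    """Add lspace="0" rspace="0" to every <mo>/</mo> lacking explicit spacing."""
    root = ET.fromstring(mathml)
    for mo in root.iter(f"{{{MATHML_NS}}}mo"):
        if (mo.text or "").strip() != "/":
            continue
        # Respect any spacing the author already set explicitly.
        mo.attrib.setdefault("lspace", "0")
        mo.attrib.setdefault("rspace", "0")
    return ET.tostring(root, encoding="unicode")


source = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mn>1</mn><mo>/</mo><mn>2</mn><mo>+</mo><mn>3</mn>"
    "</math>"
)
result = tighten_solidus(source)
print(result)
```

Only the solidus is touched; the `+` operator keeps whatever the operator dictionary gives it.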

@NSoiffer
Contributor

@fred-wang certainly knows best, but it looks like there is currently no category for lspace = rspace = 1/18em, so that would be a more complicated change. However, category K is infix/0em. Since TeX uses that for "/", it would seem that moving "/" to that category is what is desired, and I would think it would be easy to do.
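
The thread switches between integer spacing values (1, 4) and em lengths. As a quick unit check, here is a small Python sketch, assuming (as in the MathML 3 operator dictionary) that the integers count steps of 1/18em; the helper name `dictionary_units_to_em` is made up for illustration.

```python
from fractions import Fraction

UNIT = Fraction(1, 18)  # one operator-dictionary spacing step, in em


def dictionary_units_to_em(units: int) -> float:
    """Convert an integer operator-dictionary spacing value to em."""
    return float(units * UNIT)


# 0 is the TeX-like value, 1 the MathML 1-3 value, 4 the current value.
for units in (0, 1, 4):
    print(units, dictionary_units_to_em(units))
```

Under that assumption, the value 4 is 4/18em, roughly 0.2222em, four times the old MathML 3 spacing of 1/18em.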

@davidcarlisle
Collaborator

@NSoiffer @fred-wang 0 would be OK for me as well. As you say, it's what TeX does, what MathML specifies for \, and also what Unicode's mathclass-15 says of /:

002F | B | / | sol | ISONUM | No extra spacing, stretchy | SOLIDUS

@physikerwelt
Member Author

> In many ways the operator dictionary values are always somewhat arbitrary defaults and generators need to use the lspace and rspace attributes when generating if they need specific spacing

Is there a way to change that with CSS, so that the spacing for that character would be changed on the entire page? I would like to get rid of "magic" numbers within the MathML output as much as possible, especially since the generator would otherwise always emit <mo lspace="0" rspace="0">/</mo>.

@davidcarlisle
Collaborator

davidcarlisle commented Oct 23, 2024

> Is there a way to change that with CSS?

Essentially no. CSS cannot select on an element's text content, so you cannot style <mo>/</mo> differently from <mo>+</mo>. You could of course style <mo class="noextraspace">/</mo>, but if you can generate that, you can generate <mo rspace="0" lspace="0">/</mo>.

@NSoiffer
Contributor

At our last meeting, Bert said that you can now do that and linked to https://developer.mozilla.org/en-US/docs/Web/CSS/:has. I looked at that page and the examples all show looking at tag names in content, but not text content. @bert-github can you clarify?

@davidcarlisle
Collaborator

@NSoiffer Bert clarified that the newish :has selector lets you select based on child elements, but you cannot select based on text content, so it doesn't help here.

@NSoiffer
Contributor

Glad to see that I wasn't missing something in the MDN page. Make sure to mention this at the meeting as it was an action item for you.

@fred-wang: still waiting for more info from you...

@physikerwelt
Member Author

To summarize my understanding of the status.

The next step is to update the unicode.xml file. This is something @davidcarlisle can do. This will be picked up by the browsers' MathML implementations eventually. In the meantime, generators can override lspace and rspace. Here I think a custom CSS class would make the HTML easier to understand if a reasonable name were chosen.

@davidcarlisle
Collaborator

@physikerwelt no, there is no automatic path from a change in the unicode.xml file to a change in browser implementations. In an HTML5/mathml-core world, browser implementations take priority and we should not specify something that will not be implemented. Hence the question above on the feasibility of getting changes at that level. If that isn't an option, we shouldn't change unicode.xml.

@physikerwelt
Member Author

Because of time-zone issues, it seems I missed the discussion. I was hoping that changes to unicode.xml would trickle down to the browser implementations. Otherwise it seems like a chicken-and-egg problem. This is not a request to implement a new feature, only to change a value in a table. As per the discussion above, the rationale seems sensible, and if not us mathematicians, who could judge what the "right" spacing is? If there is any argument why a larger spacing might be better, we could discuss that, but keeping it as it is because browser vendors might not like the change does not seem like an ideal solution to me.

@fred-wang
Contributor

Moving to category K should be OK (although I'm not super happy we have to do yet another change everywhere...).

The problem is not about who decides, mathematicians vs. implementers. Actually, the values were suggested by people involved in the Math WG who are not browser implementers. But as I explained elsewhere, in order to implement things properly, we need to have something that is stable, interoperable, well tested and efficient, which is what the MathML Core dictionary tries to achieve. If we continue to change values in the dictionary each time there is a report by MathML users, then that's not only a pain for implementers: web developers also can't rely on the dictionary and should then just use explicit attributes everywhere, defeating the purpose of the dictionary.

#218 is about mapping more mo attributes to style, which I believe we should eventually do to make MathML more CSS-compatible. Obviously this won't work for tools that are not MathML-aware.

@physikerwelt
Member Author

physikerwelt commented Oct 29, 2024

> But as I explained elsewhere, in order to implement things properly, we need to have something that is stable, interoperable, well tested and efficient, which is what the MathML Core dictionary tries to achieve.

^^ I could not agree more. However, as the TeXZilla creator, you are likely aware of people wanting MathML to look like TeX, and often this is not a bad idea. Therefore, I expect that things will continue changing for a while. It's a long route from TeX to MathML, and I think an implementation guide would help to discover the right abstractions and identify the issues faster.

Too much space around / is not the right abstraction and is too detailed from my point of view. However, a process should be defined to update tables with values like spacings. Whether that's via PRs or release cycles can be defined, but I think we need static code paths but editable tables with values.

I created this issue because I thought it would be better to solve the problem in the spec rather than on the converter side. If there were an agreement that all converters use certain data tables, e.g., that the TeX symbol '/' corresponds to <mo rspace="0" lspace="0">/</mo>, and others that write MathML from other sources (e.g., Word-like software) have good reasons to use more space, that would also be fine. However, working mathematicians seem to be annoyed if the same TeX formula is displayed differently on arXiv, MathOverflow, and Wikipedia.
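
To make the "shared data table" idea concrete, a converter-side table could look roughly like the Python sketch below. Everything here is hypothetical: the table `TEX_TO_MO`, its two entries, and the helper `emit_mo` are illustrative names, not part of any existing converter or agreed specification.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical converter table: TeX symbol -> (character, explicit <mo> attributes).
TEX_TO_MO = {
    "/": ("/", {"lspace": "0", "rspace": "0"}),  # set close, as TeX does
    "+": ("+", {}),  # no override: rely on the operator-dictionary default
}


def emit_mo(tex_symbol: str) -> str:
    """Serialize one operator, adding only the attributes listed in the table."""
    char, attrs = TEX_TO_MO[tex_symbol]
    attr_text = "".join(
        f" {name}={quoteattr(value)}" for name, value in attrs.items()
    )
    return f"<mo{attr_text}>{escape(char)}</mo>"


print(emit_mo("/"))
print(emit_mo("+"))
```

Keeping the values in one table means a later change to the agreed spacing is a data edit rather than a change to the conversion code.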

@davidcarlisle
Collaborator

@fred-wang

> Moving to category K should be OK

As someone pointed out in the meeting, K is in fact not ideal, as it would remove the default stretchy=true. There isn't an existing category for stretchy with zero space, so an ideal outcome would be a new category for that, containing just \ and /.

> although I'm not super happy we have to do yet another change everywhere...).

Yes, agreed. The intention is that these are stable, but this is essentially a bug fix for a bug introduced since MathML3.

In general the operator dictionary entries were greatly improved and standardised at the start of the mathml4/mathml-core round of work, but \ got missed as stretchy and / got over-normalised as an infix operator and given infix space like +.

So putting / in K would make it match \ and fix the horizontal space, and so be an improvement. Making them both stretchy would be better still, but would need a new category.

I'll make a branch with the change in the XML source and re-run the python scripts to generate the mathml-core versions of the data. Hopefully you could see how feasible either of these changes would be. Either way I think on principle we have to defer to implementations here. mathml-core should standardise what's implemented not what we should have specified in an ideal world.....

For comparison, given TeX

$$ /\backslash \rightarrow \left/ \frac{a}{b} \right\backslash $$

MathJax, as running here on GitHub, stretches / but not \:

$$ /\backslash \rightarrow \left/ \frac{a}{b} \right\backslash$$

the default TeX computer modern stretches them both:

[image: the same formula set in TeX Computer Modern, with both / and \ stretched]

@physikerwelt

> However, a process should be defined to update tables with values like spacings. Whether that's via PRs or release cycles can be defined, but I think we need static code paths but editable tables with values.

In the MathML3 cycle that would have been the model, but (following the lead of HTML(5)) for mathml-core some things are less declarative and less under the control of the working group. It is similar to the situation with entity names (which are derived from the same source file). In an SGML/XML world, changing an entity definition has no implementation costs (just some compatibility costs), as the definitions are DTD declarations read at run time by each system. Getting the entities into HTML is on the whole a good thing, but it means giving up on a declarative table that may be modified. The names are part of the HTML parser definition, and changing them now would be prohibitively expensive.

@NSoiffer
Contributor

The fact that the operator dictionary has / with spacing=4 is a bug, very likely introduced by me during the big cleanup. As @davidcarlisle said, it has anomalous spacing with respect to other binary operators, and I missed that. It's a huge table (almost 1200 entries), and unfortunately that means bugs are likely. In this case, it is a relatively common character, so the fact that it is wrong has higher impact than if ⊛ U+229B were wrong.

There have been and are likely to be parts besides the operator dictionary in the spec that need changing. Those changes will likely affect some or all implementations. That's just an unfortunate part of the process.

Having just looked at the spec, I think category K is fine. The spec currently says this for /:

/ U+002F | block | infix | 0.2222222222222222em | 0.2222222222222222em | N/A

That "N/A" means it does not stretch by default. So K is compatible with the current definition in terms of being default not stretchy. It does not mean that one can't add stretchy='true' though.

There are instances where the default of "stretchy" would not be good, so default of stretchy='false' is somewhat defensible. The default in TeX is not to stretch unless the user specifies stretching. It's not a great comparison though because the default in TeX is not to stretch anything unless the user says to. MathML differs from TeX in that respect. But at least that means many users will not be surprised to see that / doesn't stretch. They would be surprised to see the extra spacing though.
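
The interplay between dictionary defaults and explicit attributes can be illustrated with a deliberately simplified Python model. It is a sketch under stated assumptions, not how any browser implements the lookup: the two entries hold only the values quoted in this thread (the current entry for / and a category K style entry with 0em spacing), the real lookup also depends on the operator's form, and the names `CURRENT_ENTRY`, `CATEGORY_K_ENTRY` and `effective_attributes` are made up.

```python
# Simplified, hypothetical stand-ins for the two dictionary entries discussed above.
CURRENT_ENTRY = {
    "lspace": "0.2222222222222222em",
    "rspace": "0.2222222222222222em",
    "stretchy": "false",
}
CATEGORY_K_ENTRY = {"lspace": "0em", "rspace": "0em", "stretchy": "false"}


def effective_attributes(defaults: dict, explicit: dict) -> dict:
    """Explicit <mo> attributes win; the dictionary only fills in the rest."""
    merged = dict(defaults)
    merged.update(explicit)
    return merged


plain = effective_attributes(CATEGORY_K_ENTRY, {})
stretched = effective_attributes(CATEGORY_K_ENTRY, {"stretchy": "true"})
print(plain)
print(stretched)
```

In this model, moving / to a zero-space entry changes only the default spacing, and an author who wants a stretched solidus can still ask for it with stretchy="true".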


4 participants