Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

display automatic translations to informal math #168

Open
wants to merge 22 commits into
base: master
Choose a base branch
from

Conversation

robertylewis
Copy link
Member

@robertylewis robertylewis commented Jul 21, 2022

Thanks to work by @zhangir-azerbayev , we have a dataset of every declaration in (a recent version of) mathlib translated to natural language. People from OpenAI are interested in improving this; in principle we can regenerate these translations periodically, and update the model with user responses from the displayed translations.

There is a static json file containing the translations. It also stores the theorem name and statement that were used to generate the text. doc-gen checks that the statement hasn't changed before printing the translation, so even if we go months between updates, the worst we'll see is a lot of theorems with no translation.

All of this is WIP:

  • We need a doc page on the community website explaining what's going on here.
  • I did not try hard at all on the display stylilng. Obviously this needs to be improved. Right now I put it in a <details> menu, expanded by default to make browsing this demo easier. I think they should be collapsed by default in production, at least until the translations improve.
  • The response buttons don't go anywhere. We should have them pop up a window that allows the user, optionally, to correct the statement and submit. They can target a Google form or any other database.

@robertylewis
Copy link
Member Author

#deploy

@zhangir-azerbayev
Copy link

@EdAyers Would you be interested in helping us with UI and databases?

@github-actions
Copy link

This PR has been successfully deployed at http://leanprover-community.github.io/mathlib_docs_demo!

style.css Outdated Show resolved Hide resolved
style.css Outdated Show resolved Hide resolved
style.css Outdated Show resolved Hide resolved
@robertylewis
Copy link
Member Author

When the informal output includes a < in math mode, it seems to be picked up as HTML before being picked up by MathJax: https://leanprover-community.github.io/mathlib_docs_demo/data/nat/basic.html#nat.one_add_le_iff (notice that the full statement is in the source code).

I thought MathJax was smart enough to do this, it handles < in doc strings perfectly fine for instance. Do we need to manually escape this character?

@robertylewis
Copy link
Member Author

@EdAyers if you're interested in helping with the UI, I have an approximate setup for the "data collection" part that needs to be hooked into the interface. We can chat.

templates/decl.j2 Outdated Show resolved Hide resolved
Co-authored-by: Bryan Gin-ge Chen <[email protected]>
@robertylewis
Copy link
Member Author

#deploy

@github-actions
Copy link

This PR has been successfully deployed at http://leanprover-community.github.io/mathlib_docs_demo!

@eric-wieser
Copy link
Member

eric-wieser commented Jul 24, 2022

I thought MathJax was smart enough to do this, it handles < in doc strings perfectly fine for instance.

Do you have a docstring in mind that proves this?

@robertylewis
Copy link
Member Author

I thought MathJax was smart enough to do this, it handles < in doc strings perfectly fine for instance.

Do you have a docstring in mind that proves this?

https://leanprover-community.github.io/mathlib_docs_demo/model_theory/order.html#first_order.language.sentence.densely_ordered is an example.

@robertylewis
Copy link
Member Author

Even weirder, https://leanprover-community.github.io/mathlib_docs_demo/model_theory/order.html#first_order.language.term.realize_lt has both a working and broken occurence of <!

@robertylewis
Copy link
Member Author

I thought I just posted this (eaten by GitHub somehow?), but, the problem is < followed by a non-space token. In principle this is an issue in normal doc strings as well as the automatic ones.

This is nontrivial but possible to replace during processing. This must have come up in relation to mathjax before, but I'm not turning up anything quickly on Google somehow.

@eric-wieser
Copy link
Member

This is not a mathjax problem, but a problem in our markdown parser. Mathjax cannot be reasonably expected to parse <, because by the time it runs the browser has already tried to parse it and made a mess.

@eric-wieser
Copy link
Member

eric-wieser commented Jul 25, 2022

I was planning to switch out our parser for markdown-it, but am waiting on an upstream patch (executablebooks/mdit-py-plugins#46)

@robertylewis
Copy link
Member Author

#deploy

@github-actions
Copy link

github-actions bot commented Aug 2, 2022

This PR has been successfully deployed! You can find it at http://leanprover-community.github.io/mathlib_docs_demo in around 10 minutes, or watch the deployment progress by going to http://github.com/leanprover-community/mathlib_docs_demo

@robertylewis
Copy link
Member Author

#deploy

Comment on lines +41 to +43
<div class="statement">
{{ decl.informal_statement | convert_markdown }}
</div>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried the mathjax rendering is going to have quite a large impact on page load times. Could we maybe:

  • Defer mathjax rendering of .informal_statement until the details is expanded
  • Put a "expand informal math" setting in the sidebar next to the color scheme setting, so that users can opt to collapse everything by default and get the associated performance boost.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the details should be collapsed by default, not for performance but because the translations are misleadingly sketchy:

Right now I put it in a <details> menu, expanded by default to make browsing this demo easier. I think they should be collapsed by default in production, at least until the translations improve.

But a toggle also sounds like a good idea.

Do you know how to defer the rendering?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a (possibly outdated) question about deferring here: https://stackoverflow.com/a/27952213/102441

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This page in the documentation also seems relevant.

Enabling Lazy Typesetting would also be a reasonable option.

@github-actions
Copy link

github-actions bot commented Aug 3, 2022

This PR has been successfully deployed! You can find it at http://leanprover-community.github.io/mathlib_docs_demo in around 10 minutes, or watch the deployment progress by going to http://github.com/leanprover-community/mathlib_docs_demo

@EdAyers
Copy link
Member

EdAyers commented Aug 4, 2022

I reckon this is ready to merge once @zhangir-azerbayev has fixed truncated translations?

@robertylewis
Copy link
Member Author

I reckon this is ready to merge once @zhangir-azerbayev has fixed truncated translations?

I've seen very few truncated translations, and regenerating the translations is costly. I think it's fine to wait to fix this until we're ready for a "next round." But there are still a few more to-dos:

  • Eric's concerns above about delaying mathjax processing
  • A toggle to collapse/expand all translation boxes
  • A landing page on the website describing this project -- it's a broken link right now

nav.js Outdated Show resolved Hide resolved
nav.js Outdated Show resolved Hide resolved
nav.js Outdated Show resolved Hide resolved
nav.js Outdated Show resolved Hide resolved
nav.js Outdated Show resolved Hide resolved
nav.js Outdated Show resolved Hide resolved
nav.js Outdated Show resolved Hide resolved
@robertylewis
Copy link
Member Author

#deploy

@robertylewis
Copy link
Member Author

#deploy

@github-actions
Copy link

This PR has been successfully deployed! You can find it at http://leanprover-community.github.io/mathlib_docs_demo in around 10 minutes, or watch the deployment progress by going to http://github.com/leanprover-community/mathlib_docs_demo

@robertylewis
Copy link
Member Author

I think all of the to-dos are taken care of now (thanks @EdAyers !). I enabled lazy typesetting in mathjax, and confirmed it only processes when the display is unrolled.

@PatrickMassot
Copy link
Member

I think the warning should be more visible than using a light-grey color.

@eric-wieser
Copy link
Member

eric-wieser commented Nov 3, 2022

#deploy

(just so that we have something interesting at the docs_demo page)

Edit: Oops, did not mean to close the PR

@eric-wieser eric-wieser closed this Nov 3, 2022
@eric-wieser eric-wieser reopened this Nov 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants