LaTeX processing is not being done on ingestion #644
Comments
Ah, yes, this must be it! I do not call Can I use |
Sorry, I didn't understand the last question... |
It looks like I should change this line, passing "latex" instead of "xml". Is that correct? [Edit: added the link] |
Right, with the caveat that LaTeX processing should not be done more than once. |
Ingest is called just once, so this is the perfect place for it. |
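To illustrate the caveat about not running LaTeX processing more than once, here is a minimal sketch. It assumes a toy converter (`toy_latex_to_text` and its rule table are hypothetical and are not the project's actual normalization code) that unescapes a few LaTeX specials and drops bare grouping braces. A second pass treats the literal braces produced by the first pass as grouping braces and removes them.

```python
import re

# Hypothetical toy rules for illustration only; this is NOT the Anthology's
# real normalization code, just a stand-in with similar behavior.
_ESCAPES = {r"\&": "&", r"\%": "%", r"\_": "_", r"\{": "{", r"\}": "}"}
_TOKEN = re.compile(r"\\[&%_{}]|[{}]")


def toy_latex_to_text(s: str) -> str:
    """Turn escaped specials into literals and drop bare grouping braces."""
    def repl(match: re.Match) -> str:
        token = match.group(0)
        # Escaped specials map to their literal; bare { or } is removed.
        return _ESCAPES.get(token, "")
    return _TOKEN.sub(repl, s)


raw = r"Wide \& Deep: the set \{a, b\} in {BERT}"
once = toy_latex_to_text(raw)    # intended result
twice = toy_latex_to_text(once)  # a second pass strips the literal braces

print(once)
print(twice)
```

Because the conversion is not idempotent in this sketch, it makes sense to run it at a single point such as ingestion.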
OK, and do you want to do anything about EMNLP 2019? |
Yes, will add to #645 (don't merge yet). |
Did abstracts; titles and others are trickier at this point and will require a custom script. Is it worth it for me to do that? |
Maybe it's easier to eyeball all the titles. |
Just titles, though? Or anything else? (author names?) |
I think START sends us author names in UTF-8. |
I took a quick look at the D19 index page and only found:
A Label Informative Wide & Deep Classifier for Patents and Papers
uniblock -> uniblock
Answering Naturally : Factoid to Full length Answer Generation (not sure if this needs to be corrected)
Cherry Colin | Durrett Greg | Foster George | Haffari Reza | Khadivi Shahram | Peng Nanyun | Ren Xiang | Swayamdipta Swabha (names are all backwards)
I saw tons of author capitalization problems (#643). |
call normalize correctly (closes #644)
This came up in #628, and I think it appeared for the first time for EMNLP 2019 because we pushed some simplifying changes to START that turned off LaTeX on their end.
Is it because normalize_anth.py is not being run with the -t option?
The problem is that currently, normalize_anth.py cannot be rerun with the -t option; all kinds of errors come up. It ought to be fixable but might not be an easy fix.