Nested volumes and explicit <url> tags #324
Another issue has come up. We now need to generate explicit volume |
Can you do a squash-merge to keep the changes more local? I am a big fan of rewriting history (in git at least ...) so that it is easier to see where a feature was introduced. That is just my personal opinion of course.
Yes, will do a squash merge. This is a good idea. @acl-org/anthology, this is ready for review. Here's a summary of changes:
Enabled, but yet to do:
I want to build this locally and have a closer look at it later today.
@mbollmann: To have a look at the results (rather than the source), build master, |
There is now a benoit_sagot and benoit_sagot1:
Maybe it was there before and now fixed in master but not yet in this PR?
@akoehn this kind of diffing is what I did. I did bulk diffs on I have merged in |
Good catch. I think this was fixed in master, though. The issue is that this uses a dotless i and Unicode normalization doesn't know how to deal with that. I don't think there's anywhere in the code that fixes this automatically, so I'll create a new issue for this.
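The dotless-i behaviour can be reproduced with the standard library alone. This is a minimal sketch, not the Anthology's code: the sample string is made up for illustration (the exact spelling in the XML is not shown in this thread), and `fold_dotless_i` is a hypothetical workaround, not an existing helper.

```python
import unicodedata

DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I

def survives_normalization(text: str) -> bool:
    """Return True if no standard normalization form changes `text`."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return all(unicodedata.normalize(form, text) == text for form in forms)

def fold_dotless_i(name: str) -> str:
    """Hypothetical workaround: map dotless i to a plain ASCII i."""
    return name.replace(DOTLESS_I, "i")

# Illustrative sample only: a name typed with a dotless i instead of "i".
sample = "Beno" + DOTLESS_I + "t Sagot"

# U+0131 has no decomposition mapping, so normalization leaves it in place
# and the string never compares equal to the plain-ASCII spelling.
unchanged = survives_normalization(sample)
repaired = fold_dotless_i(sample)
```

Because the two spellings never compare equal after normalization, they can end up as two distinct author entries, which would be consistent with the `benoit_sagot` / `benoit_sagot1` pair reported above.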
This is pretty great overall, but also quite hard to diff :)
I focused on the code & templates now rather than the XML, and have some (minor?) comments.
Congratulations on this huge merge!
A summary of changes:
- Introduces a nested format (closes acl-org#317)
- URLs are stored using a relative format for internal links (closes acl-org#156), which facilitates mirroring (acl-org#295)
- URLs are only displayed if they are found in the XML. I manually crawled to validate and create entries for PDFs for all frontmatter entries (closes acl-org#181 closes acl-org#180), including journal frontmatter (acl-org#264) and volume PDFs (closes #31)
- Added missing entries and removed ones whose PDFs were missing, including LREC 2014 (closes #31)
- It punts on C69 reformatting (closes acl-org#147)

Relevant, but not completed:
- Creating PDF volumes by pasting together individual papers (acl-org#226)
- This makes it much easier to add non-paper entries such as talks (acl-org#298), to add a volume-level "publication date" (acl-org#319), and to create an RSS feed of updates (acl-org#358)
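The "relative format" and "only displayed if found in the XML" rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the site generator's code: the function name `display_url`, the `<paper>` wrapper element, and the rule that anything containing `://` counts as an external link are all hypothetical; only the `<url>` tag and the `https://www.aclweb.org/anthology/` prefix come from this thread.

```python
from typing import Optional
import xml.etree.ElementTree as ET

# Prefix as quoted in this thread; a mirror would swap in its own.
ANTHOLOGY_PREFIX = "https://www.aclweb.org/anthology/"

def display_url(paper: ET.Element) -> Optional[str]:
    """Return the URL to show for an entry, or None if it has no <url>."""
    url = paper.findtext("url")
    if url is None or not url.strip():
        # No <url> in the XML: do not invent a link.
        return None
    url = url.strip()
    # Assumption: a value with a scheme is an external link, kept as-is.
    if "://" in url:
        return url
    # Otherwise it is a relative, internal identifier such as P18-1014.
    return ANTHOLOGY_PREFIX + url

internal = ET.fromstring("<paper><url>P18-1014</url></paper>")
missing = ET.fromstring("<paper><title>No PDF available</title></paper>")
```

Keeping the prefix in one constant is what makes mirroring simpler: the XML itself carries no host name for internal links.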
Started consolidating URLs per #156. The goal is to:
- convert `href` attributes to `<url>` tags
- convert `<href>` tags to `<url>` tags
- store internal links in relative form (`P18-1014` instead of `https://www.aclweb.org/anthology/P18-1014`)

Adding explicit `<url>` tags is easier with nested volumes (#317), so I am doing that, too:
- add `<url>` tags for volumes that are present on the server

Additionally I think we should:
- remove `<bibtype>` and `<bibkey>` tags, which are just clutter

This will help with mirroring (#295 #28 #22).
Finally, I'd like to update the code not to generate URLs for papers without a URL tag, so by cross-checking against #264, we can solve a lot of problems (#226 #181 #180).
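
The per-entry cleanup described above could look something like the sketch below. It is a rough illustration, not the conversion script used in this PR: the function names, the assumption that the `href` attribute sits on the `<paper>` element, and the sample XML are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Prefix as quoted in this thread.
ANTHOLOGY_PREFIX = "https://www.aclweb.org/anthology/"

def to_relative(link: str) -> str:
    """Strip the Anthology prefix from internal links; leave others alone."""
    link = link.strip()
    if link.startswith(ANTHOLOGY_PREFIX):
        return link[len(ANTHOLOGY_PREFIX):]
    return link

def consolidate_urls(paper: ET.Element) -> None:
    """Normalize one entry in place (hypothetical helper)."""
    # <href> child tags become <url> tags.
    for child in paper.findall("href"):
        child.tag = "url"
    # An href attribute becomes a <url> child, unless one already exists.
    href_attr = paper.attrib.pop("href", None)
    if href_attr is not None and paper.find("url") is None:
        ET.SubElement(paper, "url").text = href_attr
    # Internal links are stored in relative form.
    for url in paper.findall("url"):
        if url.text:
            url.text = to_relative(url.text)
    # <bibtype> and <bibkey> are dropped as clutter.
    for tag in ("bibtype", "bibkey"):
        for child in paper.findall(tag):
            paper.remove(child)

# Made-up entry for illustration only.
entry = ET.fromstring(
    '<paper id="1014" href="https://www.aclweb.org/anthology/P18-1014">'
    "<title>Example</title><bibtype>inproceedings</bibtype>"
    "<bibkey>example-key</bibkey></paper>"
)
consolidate_urls(entry)
```

Entries that have neither an `href` nor a `<url>` are left without a `<url>` tag, which matches the plan of not generating URLs for papers that lack one.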