Handling DOIs #158

Closed
mjpost opened this issue Feb 28, 2019 · 3 comments
Labels
help wanted Interesting but beyond current volunteer bandwidth

Comments

@mjpost
Member

mjpost commented Feb 28, 2019

We need to move the DOI process away from the Rails setup. This requires writing:

  • a script to submit volumes and papers for DOI assignment
  • a script to take the assigned DOIs and add them to the XML
@mjpost mjpost added this to the Static Rewrite milestone Feb 28, 2019
@mjpost mjpost removed this from the Static Rewrite milestone Mar 23, 2019
@mjpost mjpost added the help wanted Interesting but beyond current volunteer bandwidth label Apr 12, 2019
@mjpost mjpost pinned this issue May 9, 2019
@knmnyn
Collaborator

knmnyn commented Jun 29, 2019

@mjpost, how do you want the Anthology XML to be modified? I can probably write a script that reads in the XML using Python 3 and writes it back out again with the DOI-related children in the appropriate place.

Can you point me to the proper outputting/formatting code for the XML output? I'm hoping whatever I write will retain the exact formatting of the current Anthology XML, so that lines don't change purely because of vacuous reformatting.

@mjpost
Member Author

mjpost commented Jun 29, 2019

Hi @knmnyn, we just use a <doi> tag containing the DOI, e.g.,

<doi>10.18653/v1/P17-2091</doi>

An example script for reading and writing the XML is add_attachment.py. It takes an XML file to read and another XML file to write (they can be the same path). You can see how it iterates through the papers in the file. If you call indent() on the root node at the end, it will create the correct indentation (alternatively, just make sure you set node.tail when you create the XML node: it should be a newline followed by four spaces).
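For illustration, the node.tail approach could look roughly like the sketch below. It is not the code in add_attachment.py. The sample XML layout, the `add_doi` helper, the indentation widths, and the way the DOI is built from the volume and paper ids are all assumptions made for the example; only the `<doi>` tag and the example DOI come from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the real Anthology XML layout may differ.
SAMPLE = """<volume id="P17">
  <paper id="2091">
    <title>An example title</title>
  </paper>
</volume>"""


def add_doi(paper, doi, child_indent="\n    ", closing_indent="\n  "):
    """Append a <doi> child to `paper` unless one is already present.

    The indentation strings are assumptions matching SAMPLE above.
    Returns True if a node was added, False otherwise.
    """
    if paper.find("doi") is not None:
        return False
    children = list(paper)
    if children:
        # The former last child is now followed by another child,
        # so its tail needs child-level indentation.
        children[-1].tail = child_indent
    else:
        paper.text = child_indent
    node = ET.SubElement(paper, "doi")
    node.text = doi
    # The new last child is followed by the closing </paper> tag.
    node.tail = closing_indent
    return True


root = ET.fromstring(SAMPLE)
for paper in root.iter("paper"):
    # Assumed DOI scheme for the example: prefix + volume id + paper id.
    add_doi(paper, f"10.18653/v1/{root.get('id')}-{paper.get('id')}")

output = ET.tostring(root, encoding="unicode")
print(output)
```

Setting the tails by hand keeps every untouched line byte-identical, so a diff of the file shows only the added `<doi>` lines.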

@mjpost
Member Author

mjpost commented Oct 1, 2019

This is done and documented here: https://github.com/acl-org/acl-anthology/wiki/Adding-DOIs

@mjpost mjpost closed this as completed Oct 1, 2019