Handling retractions #987

Merged
mjpost merged 36 commits into master from retractions
Aug 25, 2021

Conversation

mjpost
Member

@mjpost mjpost commented Sep 1, 2020

Following up on the discussion here, this PR implements retractions. It:

  • Adds a <retracted/> tag to mark the paper status
  • Processes the PDF explanation as a revision, so that it gets returned by default when accessing the PDF
  • Strikes the paper in the volume and event listings, and suppresses it entirely from the author listing
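
The three behaviors listed above can be illustrated with a minimal sketch. It assumes a simplified, hypothetical volume XML (the real Anthology schema has more fields), and the helper names `is_retracted`, `volume_listing`, and `author_listing` are invented for illustration; they are not the PR's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified volume XML; not the real Anthology schema.
VOLUME_XML = """
<volume>
  <paper id="1"><title>A Sound Paper</title><author>Ann Lee</author></paper>
  <paper id="2"><title>A Retracted Paper</title><author>Ann Lee</author><retracted/></paper>
</volume>
"""


def is_retracted(paper):
    """A paper counts as retracted if it has a <retracted> child element."""
    return paper.find("retracted") is not None


def volume_listing(root):
    """Keep every paper in the volume listing; strike out retracted titles."""
    lines = []
    for paper in root.iter("paper"):
        title = paper.findtext("title")
        lines.append(f"<s>{title}</s>" if is_retracted(paper) else title)
    return lines


def author_listing(root, author):
    """Drop retracted papers entirely from an author's listing."""
    return [
        paper.findtext("title")
        for paper in root.iter("paper")
        if not is_retracted(paper)
        and author in [a.text for a in paper.findall("author")]
    ]


root = ET.fromstring(VOLUME_XML)
print(volume_listing(root))
print(author_listing(root, "Ann Lee"))
```

Keeping the status as a marker element means the listing code only needs a presence check, which is why the volume and author views can diverge without duplicating paper data.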

Comments welcome!

@mjpost mjpost requested a review from a team September 1, 2020 21:59
@mjpost
Member Author

mjpost commented Sep 2, 2020

Here are a few screenshots.

Volume listing:
image

Paper page:
image

Author page (absent):
image

@aryamccarthy
Member

I like it. Will we still serve BibTeX (and MODS, etc.) for retracted papers?

@mjpost
Member Author

mjpost commented Sep 2, 2020

Hmm, good thought. The ACM policy doesn't have anything to say about this directly. At the very least, perhaps we should add "[RETRACTED]" to the title. But since the purpose is to effectively remove the paper from the body of scientific literature, it seems fitting to me to remove citation formats, too.

@aryamccarthy
Member

My thinking was similar. The one argument against is if someone wants to cite X article as retracted, e.g. in some meta-analysis of retracted papers.

@mjpost
Member Author

mjpost commented Sep 2, 2020

In that situation, they'd just have to refer to it by title, or URL. They'd need a custom bib format anyway, since it is no longer part of its original proceedings.

@akoehn
Member

akoehn commented Sep 2, 2020

Looks good. The only thing that is bothering me a bit is that the retraction notice is linked as "PDF (v2)" and not as "Retraction notice". The second text is a bit too long OTOH, so I am okay with "PDF (v2)".

Maybe also change "can be found [here]" to "can be found [in this retraction notice]" because otherwise people could assume that "here" just links to a page about retractions in general.

@davidweichiang
Collaborator

I feel that this whole issue is something that should be run by the ACL Exec.

@mjpost
Member Author

mjpost commented Sep 2, 2020

This is a good idea. I will write up a policy and incorporate our plans here for Exec approval.

@akoehn, on the technical side, it is tricky to change the name of just the last revision in the template. An alternative:

  • Modify the <retracted> tag to work more like <revision>, taking a date and the file name, e.g., "2020.acl-main.563.retracted.pdf"
  • This file will also be copied over the main file, as is done for revisions
  • Then I can add a button labeled "Retraction Notice" or something to this, independent of the revisions
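
As a rough sketch of the alternative above, a `<retracted>` element could carry a date and the notice file name in the same spirit as `<revision>`. The attribute name `date`, the placement of the file name as element text, the example date, and the helper `add_retraction` are all assumptions made for illustration, not the schema that was adopted.

```python
import xml.etree.ElementTree as ET


def add_retraction(paper, date, filename):
    """Attach a <retracted> element carrying a date and the notice file name.

    The attribute name and layout are assumptions for illustration only.
    """
    if paper.find("retracted") is not None:
        raise ValueError("paper is already marked as retracted")
    tag = ET.SubElement(paper, "retracted", {"date": date})
    tag.text = filename
    return tag


# Hypothetical paper entry; the date is a placeholder.
paper = ET.fromstring('<paper id="563"><title>Example</title></paper>')
add_retraction(paper, "2020-09-02", "2020.acl-main.563.retracted.pdf")
print(ET.tostring(paper, encoding="unicode"))
```

Because the notice file is recorded on its own element rather than as the last revision, a template could render a separate "Retraction Notice" button without special-casing revision labels.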

@mbollmann
Member

Is it desired that the retracted paper itself is still accessible? In your approach it would be (as "v1"), no? That is not currently the case with existing (yet unlinked) retractions on the server.

@mjpost
Member Author

mjpost commented Sep 3, 2020

Under the ACM policy, which I think makes sense, the original retracted paper remains accessible.

I think handling the back-catalog of (suspected or confirmed) retractions is a separate issue. I don't know that we'd go through the effort of bringing them into line with the new policy we adopt.

@mjpost mjpost marked this pull request as draft September 11, 2020 17:05
@mjpost
Member Author

mjpost commented Nov 13, 2020

This has come up again with EMNLP. I am making this one of my agenda items for the next ACL meeting, but in the meantime, this needs to be dealt with, and I don't think we're doing any permanent harm while we wait for a formal policy. Everything here can be altered.

After consulting the arXiv withdrawal policy, I propose to do the following:

  • Display the paper as above with strikeouts
  • Strike out in all paper listings
  • Remove from author listing
  • Add a <withdrawn> tag marking the date
  • Process the paper as a revision. The revision will be the previous version with a retraction notice stamped on the top
  • No longer serve citation materials (revised from "Continue to serve citation materials")
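
Assuming the final choice in the last bullet is to stop serving citation formats, the export step could simply decline to produce an entry for a withdrawn paper. The `Paper` dataclass and `bibtex_for` function below are hypothetical and only illustrate that gate; they do not reflect the Anthology's real export code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Paper:
    # Hypothetical minimal record; real entries carry many more fields.
    anthology_id: str
    title: str
    year: str
    withdrawn: Optional[str] = None  # ISO date of withdrawal, if any


def bibtex_for(paper: Paper) -> Optional[str]:
    """Return a BibTeX entry, or None when the paper has been withdrawn."""
    if paper.withdrawn is not None:
        return None
    return (
        f"@inproceedings{{{paper.anthology_id},\n"
        f"  title = {{{paper.title}}},\n"
        f"  year = {{{paper.year}}},\n"
        f"}}"
    )


print(bibtex_for(Paper("2020.acl-main.1", "An Example Title", "2020")))
print(bibtex_for(Paper("2020.acl-main.2", "A Withdrawn Title", "2020", withdrawn="2020-11-13")))
```

Returning `None` rather than an entry with "[RETRACTED]" in the title leaves it to the caller to decide whether to hide the citation buttons altogether.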

@mjpost
Member Author

mjpost commented Nov 13, 2020

Updated to use a dummy example

I've written some code to produce a watermarked version of the file. It creates a revision of just the first page, watermarked, and a link at the top to the paper page.

image

@mjpost mjpost marked this pull request as ready for review November 13, 2020 19:18
@github-actions

Build successful. You can preview it here: https://preview.aclanthology.org/retractions
This preview will be removed when the branch is merged.

Member

@mbollmann mbollmann left a comment

This LGTM now, but I noticed that on the preview branch, only the 2020.acl-main.563v2.pdf carried the watermark, while 2020.acl-main.563.pdf (without versioning) was still identical to the v1. Not sure if this is something you may need to handle explicitly in your retraction script @mjpost.
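
One way to address the mismatch described above is to copy the newest versioned file over the unversioned one whenever a revision is created. The sketch below assumes the `<id>v<N>.pdf` / `<id>.pdf` naming visible in this comment; the function `publish_revision` is hypothetical, and the demonstration uses placeholder bytes rather than real PDFs.

```python
import shutil
import tempfile
from pathlib import Path


def publish_revision(pdf_dir: Path, anthology_id: str, version: int) -> Path:
    """Copy <id>v<version>.pdf over <id>.pdf so the default URL serves it."""
    versioned = pdf_dir / f"{anthology_id}v{version}.pdf"
    canonical = pdf_dir / f"{anthology_id}.pdf"
    if not versioned.exists():
        raise FileNotFoundError(versioned)
    shutil.copyfile(versioned, canonical)
    return canonical


# Demonstration in a throwaway directory with placeholder file contents.
with tempfile.TemporaryDirectory() as tmp:
    pdf_dir = Path(tmp)
    (pdf_dir / "2020.acl-main.563.pdf").write_bytes(b"original")
    (pdf_dir / "2020.acl-main.563v1.pdf").write_bytes(b"original")
    (pdf_dir / "2020.acl-main.563v2.pdf").write_bytes(b"watermarked")
    canonical = publish_revision(pdf_dir, "2020.acl-main.563", 2)
    served = canonical.read_bytes()
    print(served)
```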

@mbollmann
Member

mbollmann commented Jul 5, 2021

@akoehn
Member

akoehn commented Jul 6, 2021

A bit late, but this looks very good indeed.

@mjpost as you probably want to update the retraction script anyway: currently it assumes that there is only one version and will create a "v2" even when a v2 is already there. This will probably be a very rare occurrence, but this also means the person using that script will probably not have this peculiarity in mind.
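
A minimal sketch of avoiding the hard-coded "v2": derive the next revision number from the `<revision>` elements already present. It assumes each revision carries an integer `id` attribute and that a paper without any `<revision>` element is implicitly at version 1; `next_revision_id` is a hypothetical helper, not the script's actual function.

```python
import xml.etree.ElementTree as ET


def next_revision_id(paper):
    """Return the id to use for a new revision of this paper.

    With no <revision> elements, the original file is treated as v1,
    so the next id is 2; otherwise it is one past the highest existing id.
    """
    existing = [int(rev.get("id")) for rev in paper.findall("revision")]
    return max(existing, default=1) + 1


fresh = ET.fromstring("<paper><title>No revisions yet</title></paper>")
revised = ET.fromstring(
    '<paper><title>Already revised</title>'
    '<revision id="1"/><revision id="2"/></paper>'
)
print(next_revision_id(fresh))
print(next_revision_id(revised))
```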

@mjpost
Member Author

mjpost commented Jul 6, 2021

Thanks! I'm on vacation but will update the script when I return and then merge this.

@mjpost mjpost requested a review from xinru1414 August 25, 2021 16:20
@mjpost
Member Author

mjpost commented Aug 25, 2021

This has been approved by the ACL Exec. I plan to update the documentation today and would then like to merge in short order, if anyone has time to review.

@mjpost mjpost marked this pull request as ready for review August 25, 2021 16:21
* add_revision is now better factored, and is used by retract_paper
* also added some utility scripts for getting the XML file or
  PDF directory for an Anthology ID
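
For illustration, utilities of the kind this commit message mentions might look like the sketch below. It handles only IDs shaped like "2020.acl-main.563", and the directory layout (`data/xml/<collection>.xml`, `pdf/<venue>/`) and all function names are assumptions, not the repository's confirmed structure.

```python
from pathlib import Path


def split_id(anthology_id: str):
    """Split '2020.acl-main.563' into ('2020.acl', 'main', '563')."""
    collection_volume, paper = anthology_id.rsplit(".", 1)
    collection, volume = collection_volume.split("-", 1)
    return collection, volume, paper


def xml_file_for(anthology_id: str, xml_root: Path = Path("data/xml")) -> Path:
    """Assumed layout: one XML file per collection, e.g. 2020.acl.xml."""
    collection, _, _ = split_id(anthology_id)
    return xml_root / f"{collection}.xml"


def pdf_dir_for(anthology_id: str, pdf_root: Path = Path("pdf")) -> Path:
    """Assumed layout: PDFs grouped in a directory named after the venue."""
    collection, _, _ = split_id(anthology_id)
    venue = collection.split(".", 1)[1]
    return pdf_root / venue


print(split_id("2020.acl-main.563"))
print(xml_file_for("2020.acl-main.563"))
print(pdf_dir_for("2020.acl-main.563"))
```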
@mjpost
Member Author

mjpost commented Aug 25, 2021

@akoehn my latest push addresses your concerns (the revision piece is now automatically handled).

@mjpost mjpost merged commit 913637e into master Aug 25, 2021
@mjpost mjpost deleted the retractions branch August 25, 2021 22:57
5 participants