Feedback on the new Anthology website #170
Comments
I really like it, especially the speed! There is a As a minor comment: Could you specify the hardware requirements for building the anthology a bit? How much time & memory does building take? "a considerable amount of memory" could be 8GB or 512, depending on whom you ask :-) |
It looks great! On Safari, when you click on a pdf/bib link and then click the browser's back button, the little callout ("Open PDF" or "Export BibTeX") stays visible. |
awesome!!!!!!!!! |
I think it would look better if the header had the same width as the content. I.e., the ACL logo would move to the left and the search box to the right, in order to align with the content. |
Looks awesome! Great work! 👏 |
What's the reason for inserting newlines in the bib field values? (for example, in booktitle here, and titles elsewhere). |
Disclaimer: This is about search, but is not about weird search behavior as such. Is Google Custom Search the long-term search solution for the new version of the Anthology? It is inherently waaaaay less functional than the existing search system on the current Anthology- for example, the current search page has really great result faceting, etc. |
And I just saw #165 - glad to see that something more flexible is on the roadmap/radar. In the meantime, we could also link to the DFKI "ACL Anthology Searchbench". |
On mobile, the magnifying glass of the search bar gets forced to the next row for me. |
Is the BibTeX generation handling special characters properly? This entry has weird quotation marks in the abstract. http://aclweb.org/anthology/papers/C/C18/C18-1137.bib |
When there is just one paper in a conference, the noun after the number should be singular "paper" and not "papers". |
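The singular/plural rule requested above can be sketched in a few lines. This is only an illustration: `paper_count_label` is a hypothetical helper, and the site's actual templates may handle the label differently.

```python
def paper_count_label(count: int) -> str:
    """Return a count followed by the correctly inflected noun.

    Hypothetical helper for illustration; not taken from the Anthology code.
    """
    # English uses the singular only for exactly one item.
    noun = "paper" if count == 1 else "papers"
    return f"{count} {noun}"


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(paper_count_label(n))
```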
Awesome work! One small issue I saw is that when I am browsing through papers in pages like this, there is no way for me to scroll back to the top instantly. The |
Fixed by [6bbc5a1] |
Re: #170 (comment), when I view in Chrome or iOS Safari, I see mojibake, but on macOS Safari, it looks fine. Although @danielgildea's fix puts the .bib file into ASCII (as it should be), I wonder if, as a failsafe, can the server put |
|
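The browser-dependent mojibake described above is what typically happens when UTF-8 bytes are decoded with a single-byte codec. The sketch below reproduces the effect in Python. The comment above is cut off, so the `Content-Type` value shown is an assumption about the kind of failsafe meant, not something stated in the thread.

```python
# UTF-8 bytes decoded with the wrong single-byte codec produce mojibake.
title = "CoNLL\u2013SIGMORPHON"  # en dash (U+2013) between the two names

utf8_bytes = title.encode("utf-8")

# What a client shows if it guesses Windows-1252 instead of UTF-8.
garbled = utf8_bytes.decode("cp1252")

# What a client shows if it is told (or guesses) the right charset.
correct = utf8_bytes.decode("utf-8")

# Assumed example of a header a server could send to remove the guesswork.
content_type = "text/plain; charset=utf-8"

if __name__ == "__main__":
    print(garbled)
    print(correct)
```

Writing the .bib file in pure ASCII, as the fix mentioned above does, sidesteps the problem entirely, because ASCII bytes decode identically under both codecs.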
I'm seeing "CoNLL–SIGMORPHON" in macOS Safari, instead of "CoNLL–SIGMORPHON". Does the build script need to be re-run to show the fix? |
Absolutely. Fixes are not reflected on the live website until @mjpost rebuilds it and pushes it there. |
I agree the one-line-per-author variant is more readable and is fine with me, as long as we make sure to use spaces and not tabs (per #16). I'll rebuild soon, by tonight at the latest. Once we have continuous integration checks built (#102) and other checks against commits to the master branch, we can have it automated. |
Thanks for all the feedback so far! I've implemented a bunch of minor layout fixes based on the comments here (with the same caveat as above: will not be live until Matt rebuilds).
I believe Google Custom Search is much more powerful than people give it credit for, and it offers customization options that should allow for similar result faceting and features as before. However, that requires some more work on my part, and it wasn't really possible to implement and test this earlier as, by its very nature, it requires the new site to be live and getting indexed by Google first. I'd really like to advocate for some more patience here over the coming weeks as I'm hoping to improve this. Maintaining a custom-made search solution is a huge liability IMO, and I would really like for people to give the Google version a fair chance first. |
@mbollmann That's totally fair, and thank you for the reply. I certainly see the value of using an off-the-shelf/hosted search platform in general, and also of using Google Custom Search in particular as a "getting things up and running" solution. For the sake of clarity, my concerns are less about the search behavior of GCS- if anybody can build a decent text search engine, it'd be Google! My concerns are more about search UI/UX- result faceting, etc. I'm happy to give GCS more of a chance, and am looking forward to seeing what we're able to do with GCS in terms of customization. Thank you (all of you!) for your efforts on this project; I do very much like the redesign overall and am excited to see it evolve! |
Okay, rebuilt. I also merged in master which had some corrections. |
Unclear whether this is a parsing error or a data error: this BibTeX has no article title. |
Thanks! The title appears in the HTML: (http://aclweb.org/anthology/D13-1088/) and is in the XML, so I'm not sure what's going on here. |
Pretty sure it's related somehow to the title starting with |
Ah, I was looking at the master branch. I’ll rebuild tonight. |
Done, and the problem is indeed fixed. Thanks! |
I can't read this PDF https://www.aclweb.org/anthology/W19-3604 |
I can't read this PDF https://www.aclweb.org/anthology/W19-3604
There is nothing in the page.
There is a PDF, but the PDF is empty. Maybe an error in the ingestion
process?
|
I open this PDF from previous page https://www.aclweb.org/anthology/papers/W/W19/W19-3604/ |
That is the PDF that the Widening NLP workshop gave us. There are many other empty (W19-3601 W19-3604 W19-3607 W19-3618 W19-3629 W19-3644 W19-3648) and improperly-formatted papers. |
Thanks. I am trying to find the WMT19 papers in the set, but I cannot find them. |
They have not been ingested yet. Please see statmt.org/wmt19 where they are available. |
It seems that ACL 2019 tutorial abstracts have ACL 2017 publication information on their footer (see https://www.aclweb.org/anthology/P19-4#page=11 for example); is it intended? |
@jihunchoi Forgotten rsync, fixed, thank you! |
I want to commend you and thank you for putting so much thought into a naming system for papers and their related information, like BibTeX files. This is by no means self-evident, and in fact I have seen it only at ACL. It is incredible, but you are the only people on the planet who name related resources with the same basename! This puts you light years ahead of your time! Let me explain:
Suppose I look at the aclweb.org site and say to myself: "WOW! What an incredible treasure. I would love to download it all and have it in my local paper library for my reading pleasure!". Well, that's easy. I remember looking at it 10 years ago - or even further back in the past. It has always been easy to "get them all". But that's only part of the story.
Having PDFs named like "pennington-etal-2014-glove.pdf" does not help at all - you must rename them to some naming scheme amenable to searching, e.g.:
Venue Volume Issue Year DOI Authors Title
For example, for the above paper:
Proceedings 2014 Conference on Empirical Methods in Natural Language Processing EMNLP 2014 [doi 10.3115%2Fv1%2FD14-1162] Pennington, Jeffrey; Socher, Richard; Manning, Christopher -- Glove - Global Vectors for Word Representation.pdf
Notice that you can reconstruct a basic BibTeX entry from the above name, knowing that semicolons delimit author names, the title comes after ' -- ', the DOI is the URL-encoded string XXX in '[doi XXX]', the year is the 4-digit string before the DOI part and "Venue" is before that.
For this to work, you need bibliographic information for each paper, say in the form of a .bib file. You have that - everybody has that. But what you have - and everybody else is still missing - is this: The paper and its associated bibliographic information have the same basename!
That is, if the above paper has a URL https://www.aclweb.org/anthology/D14-1162 then I know that the paper is at https://www.aclweb.org/anthology/D14-1162.pdf and its associated BibTeX at https://www.aclweb.org/anthology/D14-1162.bib. I can get those two just by looking at the 'url={...}' lines of the 'cumulative' BibTeX file at https://www.aclweb.org/anthology/anthology.bib.gz, and as soon as I have two files, one PDF and one BIB, with the same basename D14-1162.pdf I know they are connected!
It's so simple, but its impact is immense. Imagine you had a PDF D14-1162.pdf but its BibTeX file had a different basename, say pennington-etal-2014-glove.bib. How on earth would you know they belong together? You would have to resort to web scraping: parse each proceedings HTML page and, for each PDF link on it, find the '.bib' HTML link that is visually 'nearest' to it. This is programming hell.
Having a local paper collection, with papers renamed as above, makes searching (an issue that has been the subject of quite a few postings above) a dream: just list your local papers and pipe the list to a text file. Now use that text file as a "poor man's index" using, say, grep. You can grep it with any regular expression you like: grep -E 'your regexp' index.txt
If you rename your local papers as above, you will be amazed at what you can find by such a simple method!
So thank you for making local collections possible with such ingenious ideas as providing a cumulative .bib file and using consistent names across the whole site for both papers and their bibliographic information. Forget OAI-PMH, federated repository aggregators and all that! All a truly open access paper repository needs is those two simple things! |
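The two ideas in the comment above (shared basenames, and a searchable file name that encodes the metadata) can be sketched as follows. This is a rough illustration only: the function names, the regular expression, and the exact field layout are assumptions derived from the single example given, and Python's `re` syntax differs in places from `grep -E`.

```python
import re
from urllib.parse import unquote

BASE = "https://www.aclweb.org/anthology/"  # base URL as quoted in the comment


def related_urls(anthology_id: str) -> dict:
    """Landing page, PDF and BibTeX share one basename (the Anthology ID)."""
    return {
        "page": BASE + anthology_id,
        "pdf": BASE + anthology_id + ".pdf",
        "bib": BASE + anthology_id + ".bib",
    }


# Assumed layout: "<venue> <year> [doi <url-encoded DOI>] <authors> -- <title>.pdf"
NAME_RE = re.compile(
    r"^(?P<venue>.*)\s(?P<year>\d{4})\s\[doi (?P<doi>[^\]]+)\]\s"
    r"(?P<authors>.*?) -- (?P<title>.*)\.pdf$"
)


def parse_name(filename: str) -> dict:
    """Recover basic bibliographic fields from a renamed PDF file name."""
    m = NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"file name does not follow the scheme: {filename!r}")
    return {
        "venue": m["venue"],
        "year": int(m["year"]),
        "doi": unquote(m["doi"]),  # "%2F" becomes "/"
        "authors": [a.strip() for a in m["authors"].split(";")],
        "title": m["title"],
    }


def search_index(lines, pattern: str) -> list:
    """Poor man's index: keep the lines matching a regular expression."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]
```

For example, `related_urls("D14-1162")["bib"]` gives the .bib URL quoted in the comment, and `parse_name` applied to the long file name above yields the year 2014 and the DOI `10.3115/v1/D14-1162`.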
It's possible my brain is pudding right now, but is there a way to navigate to EMNLP Findings papers from the homepage of the ACL anthology? I see they're posted here: https://www.aclweb.org/anthology/volumes/2020.findings-emnlp/, but ctrl-F for "Findings" on the main page or the EMNLP page doesn't lead to any results. |
Hi @lucy3—it's not currently linked from the front page, but will be soon. |
A paper I have not authored was wrongly assigned to my ACL Anthology page because I have the exact same name as the first author on that paper. How do I get it removed from my profile? |
@Pranav-Goel Please open a new issue for that, and make sure to include the Anthology ID(s) of the paper(s) in question. We can disambiguate authors in the metadata then. If you have an academic website and/or an ORCID ID, feel free to include a link to it too, as it might help us with the disambiguation process. |
When search results come up for a keyword search, it would be helpful to see the date of publication and the list of authors. Some of the PDFs don't have any dates in the footer. Also, would there be a way to subscribe to a certain search result and get email updates when new papers are posted that match? |
The form that is linked to in the side bar "The Anthology can archive your poster or presentation! Please submit them in PDF format by filling out this form." is not accessible anymore. |
Not sure if this is the right place to ask these things; please redirect me if it is not:
|
ORCID: We do not have ORCID data for authors, so currently not. See e.g. #1179 for WIP. |
Hi, first of all, thank you for your work on the ACL Anthology website --- it's amazing! My question is about the (lack of) Scopus indexing of my paper (https://aclanthology.org/2021.clpsych-1.22/), accepted to the Seventh Workshop on Computational Linguistics and Clinical Psychology, co-located with NAACL 2021. Right now, the paper is NOT indexed on Scopus, and upon further digging, I have found that neither the workshop itself (2021 occurrence) nor NAACL 2021 is Scopus-indexed, despite the fact that previous proceedings of both are indexed. I was wondering if you could help to make sure that my paper and the proceedings of the workshop and NAACL 2021 are indexed on Scopus? Without the indexing, my paper will not count towards my PhD degree. I have tried to contact [email protected] about this multiple times, but I have not heard from them. Thank you! Best regards, Zixiu Wu |
@zixiu-alex-wu Have you tried asking Scopus about this? I'm not aware of anything we do on our side related to indexing papers in other databases, and would be surprised if we had any control over this. |
We have been manually submitting proceedings here and there, as we find motivated volunteers to do so. I’m in the process of working with a volunteer to be more systematic about this, but it’s currently something we are not handling well. It is on our 2022 roadmap, however! |
Hi Marcel, thank you for your response! I have actually asked Scopus already, and they have an investigation underway, so I thought I'd ask the ACL anthology people about this as well. In fact, the workshop's organiser referred me to the chairs of NAACL 2021, who in turn referred me to "the ACL Anthology folks", because, as he put it, "they are maybe the only ones who would know the most recent year of NAACL is not indexed yet in Scopus". Once Scopus informs me of the results of their investigation, I will put an update here. Thank you again for your response! |
Hi Matt, Thank you for your response! So, if I am not mistaken, the proceedings of ACL conferences such as NAACL, as well as the proceedings of the co-located workshops, have mostly been submitted to Scopus manually, which has resulted in the indexing of the proceedings of the conferences in previous years. In that case, I was wondering if you could perhaps arrange for the proceedings of the workshop in question (https://aclanthology.org/volumes/2021.clpsych-1/) as well as the proceedings of NAACL 2021 to be submitted to Scopus, so that both of them as well as my workshop paper (https://aclanthology.org/2021.clpsych-1.22/) would be indexed? Thank you! Best regards, Zixiu Wu |
Hi, does this apply only to the main ACL conferences or also to non-ACL events, such as COLING? |
The ACL can only assume responsibility for ACL events, unless some arrangement is made. |
Thank you! Since these other events are present in the ACL anthology as well I was not sure if they were managed independently or not. |
Closing this as the “new” website is now over 4 years old. Feedback is still welcome, but can go into more specific issues. |
This thread is intended to collect all feedback, suggestions, bug reports, etc. for the new Anthology website in the static-rewrite branch. (Edit: live demo here at http://aclweb.org/anthology)
If you do not have a GitHub account, you're also welcome to send me feedback via e-mail ([email protected]) or Twitter (@mmbollmann)!
Known Issues