Metadata change on Anthology ID D19-5108 #628

Closed
muonsei opened this issue Nov 6, 2019 · 5 comments

Comments

@muonsei

muonsei commented Nov 6, 2019

Anthology ID D19-5108

Please change the title to: Annotation Process for the Dialog Act Classification of a Taglish E-commerce Q&A Corpus

The backslash before the ampersand should be removed from the title.

@mjpost
Member

mjpost commented Nov 6, 2019

Can you please find the line in data/xml/D19.xml and submit a PR?
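
For anyone preparing that PR, here is a minimal sketch of locating the offending line. It assumes the title is stored in the XML with the ampersand written as the entity `&amp;`, so the stray escape appears as a backslash followed by `&amp;`. The sample snippet and the function names are hypothetical, and the real layout of data/xml/D19.xml may differ.

```python
import re

# Hypothetical excerpt for illustration only; not copied from data/xml/D19.xml.
SAMPLE_XML = """<paper>
  <title>Annotation Process for the Dialog Act Classification of a Taglish E-commerce Q\\&amp;A Corpus</title>
</paper>
"""

# Assumption: a TeX-escaped ampersand shows up in XML as a backslash plus "&amp;".
ESCAPED_AMP = re.compile(r"\\&amp;")


def find_escaped_ampersands(xml_text):
    """Return (line_number, stripped_line) pairs that still contain the escape."""
    return [
        (number, line.strip())
        for number, line in enumerate(xml_text.splitlines(), start=1)
        if ESCAPED_AMP.search(line)
    ]


def remove_escape(line):
    """Drop the backslash but keep the XML entity intact."""
    return ESCAPED_AMP.sub("&amp;", line)


hits = find_escaped_ampersands(SAMPLE_XML)
fixed = [remove_escape(line) for _, line in hits]
```

To run it against the real file, read the file contents into a string and pass that to `find_escaped_ampersands` instead of `SAMPLE_XML`.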

@davidweichiang
Collaborator

Why didn't latex_to_unicode.py get this? Was it double-backslashed in the .bib file?
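
To illustrate why a doubled backslash would matter, here is a rough sketch of a one-pass unescape for TeX special characters. This is not the actual logic of latex_to_unicode.py. The regex and the function name are assumptions made for illustration.

```python
import re

# Assumption: a simplistic rule that strips one backslash before a TeX special character.
TEX_ESCAPE = re.compile(r"\\([&%$#_{}])")


def unescape_tex_specials(text):
    """Replace each backslash-escaped special character with the bare character."""
    return TEX_ESCAPE.sub(r"\1", text)


single = unescape_tex_specials(r"Q\&A")    # one backslash in the input
double = unescape_tex_specials(r"Q\\&A")   # two backslashes in the input
```

Under this rule a single-backslashed `Q\&A` comes out clean, while a double-backslashed input keeps one backslash, which is the kind of leftover seen in the title.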

@mjpost
Member

mjpost commented Nov 7, 2019

Does ACLPUB produce the XML from bib files? (I still don't know ACLPUB as well as I should) Here is the bib that was in the same directory:

@InProceedings{rivera-EtAl:2019:D19-51,
  author    = {Rivera, Jared  and  Pensica, Jan Caleb Oliver  and  Valenzuela, Jolene  and  Secuya, Alfonso  and  Cheng, Charibeth},
  title     = {Annotation Process for the Dialog Act Classification of a Taglish E-commerce Q\&A Corpus},
  booktitle = {Proceedings of the Second Workshop on Economics and Natural Language Processing},
  month     = {November},
  year      = {2019},
  address   = {Hong Kong},
  publisher = {Association for Computational Linguistics},
  pages     = {61--68},
  abstract  = {With conversational agents or chatbots making up in quantity of replies rather than quality, the need to identify user intent has become a main concern to improve these agents. Dialog act (DA) classification tackles this concern, and while existing studies have already addressed DA classification in general contexts, no training corpora in the context of e-commerce is available to the public. This research addressed the said insufficiency by building a text-based corpus of 7,265 posts from the question and answer section of products on Lazada Philippines. The SWBD-DAMSL tagset for DA classification was modified to 28 tags fitting the categories applicable to e-commerce conversations. The posts were annotated manually by three (3) human annotators and preprocessing techniques decreased the vocabulary size from 6,340 to 1,134. After analysis, the corpus was composed dominantly of single-label posts, with 34\% of the corpus having multiple intent tags. The annotated corpus allowed insights toward the structure of posts created with single to multiple intents.},
  url       = {https://www.aclweb.org/anthology/D19-5108}
}

@mjpost
Member

mjpost commented Nov 7, 2019

It was the same under econlp/proceedings/cdrom/bib—just a single backslash.

Maybe softconf isn't using the latest?

@davidweichiang
Collaborator

I think ACLPUB/START doesn't try to do any TeX processing. The XML comes to the Anthology with TeX embedded in it, and the Anthology code is supposed to finish.

START did merge the commit that removed TeX processing very recently (just before exporting D19).

mjpost closed this as completed in 5c5f0f1 Nov 8, 2019
najtin pushed a commit to ir-anthology/ir-anthology that referenced this issue Jun 9, 2021
* Disambiguate Fei Liu (closes acl-org#614)
* Correct and add variant for Li Lucy (closes acl-org#630)
* Add alias for John S. Y. Lee (closes acl-org#613)
* correct Gonzalez-Agirre (closes acl-org#634)
* Fixed title (closes acl-org#626)
* removed backslash on D19-5108 (closes acl-org#628)