SemEval_2019_Task6

Identifying and Categorizing Offensive Language in Social Media

Team Name : JU_ETCE_17_21

System Description Paper : https://www.aclweb.org/anthology/S19-2118

Sub-tasks :

  1. Sub-task A: Offensive language identification
  2. Sub-task B: Automatic categorization of offense types
  3. Sub-task C: Offense target identification
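The three sub-tasks form a hierarchy: sub-task B applies only to posts labeled offensive in sub-task A, and sub-task C applies only to offensive posts that are targeted. A minimal sketch of that dependency follows. It assumes the label names of the OLID dataset used in OffensEval 2019 (`OFF`/`NOT`, `TIN`/`UNT`, `IND`/`GRP`/`OTH`), which this README does not list. The `SUBTASK_LABELS` table and the `is_valid_annotation` helper are hypothetical names chosen for illustration and are not part of this repository's code.

```python
from typing import Optional

# Assumed label sets, taken from the OLID annotation scheme (not stated in this README).
SUBTASK_LABELS = {
    "A": {"OFF", "NOT"},         # offensive / not offensive
    "B": {"TIN", "UNT"},         # targeted insult or threat / untargeted
    "C": {"IND", "GRP", "OTH"},  # individual / group / other
}


def is_valid_annotation(a: str, b: Optional[str] = None, c: Optional[str] = None) -> bool:
    """Check that a label triple respects the A -> B -> C hierarchy."""
    if a not in SUBTASK_LABELS["A"]:
        return False
    if a == "NOT":
        # Non-offensive posts carry no sub-task B or C label.
        return b is None and c is None
    if b not in SUBTASK_LABELS["B"]:
        return False
    if b == "UNT":
        # Untargeted offensive posts carry no sub-task C label.
        return c is None
    # Targeted offensive posts need a target type.
    return c in SUBTASK_LABELS["C"]


if __name__ == "__main__":
    print(is_valid_annotation("OFF", "TIN", "GRP"))
    print(is_valid_annotation("NOT", "TIN", "IND"))
```

A check like this can be useful when loading the training data, since a row that violates the hierarchy usually points to a parsing error rather than a real annotation.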

BibTeX

@inproceedings{mukherjee-etal-2019-ju,
    title = "{JU}{\_}{ETCE}{\_}17{\_}21 at {S}em{E}val-2019 Task 6: Efficient Machine Learning and Neural Network Approaches for Identifying and Categorizing Offensive Language in Tweets",
    author = "Mukherjee, Preeti  and
      Pal, Mainak  and
      Banerjee, Somnath  and
      Naskar, Sudip Kumar",
    booktitle = "Proceedings of the 13th International Workshop on Semantic Evaluation",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/S19-2118",
    doi = "10.18653/v1/S19-2118",
    pages = "662--667",
    abstract = "This paper describes our system submissions as part of our participation (team name: JU{\_}ETCE{\_}17{\_}21) in the SemEval 2019 shared task 6: {``}OffensEval: Identifying and Categorizing Offensive Language in Social Media{''}. We participated in all the three sub-tasks: i) Sub-task A: offensive language identification, ii) Sub-task B: automatic categorization of offense types, and iii) Sub-task C: offense target identification. We employed machine learning as well as deep learning approaches for the sub-tasks. We employed Convolutional Neural Network (CNN) and Recursive Neural Network (RNN) Long Short-Term Memory (LSTM) with pre-trained word embeddings. We used both word2vec and Glove pre-trained word embeddings. We obtained the best F1-score using CNN based model for sub-task A, LSTM based model for sub-task B and Logistic Regression based model for sub-task C. Our best submissions achieved 0.7844, 0.5459 and 0.48 F1-scores for sub-task A, sub-task B and sub-task C respectively.",
}

Authors

Preeti Mukherjee, Mainak Pal