NonCompSST

Repository to accompany the EMNLP 2023 findings short paper that proposes NonCompSST:

@inproceedings{dankers2023noncomp,
    title = "Non-Compositionality in Sentiment: New Data and Analyses",
    author = "Dankers, Verna and Lucas, Christopher G",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    year = "2023",
}

Data

The sst7 folder contains train, validation and test data for the 7-point scale version of the Stanford Sentiment Treebank, but with sentences that match our stimuli removed.
The noncomposst folder contains the stimuli used for Study 1 and 2 (under stimuli_studyX.tsv), and the maxabs_ranking.tsv, the variant of the ranking that we recommend future work uses when working with our non-compositionality ratings (descriptions of other variants can be found in the paper).

Model training

The sentiment_training folder contains the implementation used to train models on SST-7, and, afterwards, test on the same stimuli presented to humans in Study 2.
Run sentiment_training/scripts/train.sh <model_name> to train a sentiment model and to use it to predict the sentiment for our NonCompSST stimuli.
Afterwards, use python create_ranking.py --folder <folder_name> to generate a non-compositionality ranking from the models' predictions, where the folder name is the parent folder in which the test set predictions for multiple seeds are located.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
noncompsst		noncompsst
sentiment_training		sentiment_training
sst7		sst7
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NonCompSST

Data

Model training

About

Releases

Packages

Languages

vernadankers/NonCompSST

Folders and files

Latest commit

History

Repository files navigation

NonCompSST

Data

Model training

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages