# Transformer and pointer-generator transformer models for the morphological inflection task

Submission of NYU-CUBoulder for Task 0 and Task 2.
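
For background: a pointer-generator transformer extends a standard transformer with a copy mechanism, mixing the decoder's vocabulary distribution with the encoder attention distribution over source characters, so that stem characters can be copied verbatim. Below is a minimal PyTorch sketch of that mixing step only; the names and shapes are illustrative, not this repository's API.

```python
import torch

def pointer_generator_mix(vocab_logits, attn_weights, src_ids, p_gen):
    """Mix generating from the vocabulary with copying from the source.

    vocab_logits: (batch, vocab)   decoder scores over the vocabulary
    attn_weights: (batch, src_len) encoder attention, rows sum to 1
    src_ids:      (batch, src_len) vocabulary ids of the source characters
    p_gen:        (batch, 1)       probability of generating vs. copying
    """
    gen_dist = torch.softmax(vocab_logits, dim=-1)
    # Scatter the attention mass onto the vocabulary slots of the
    # source characters, so attended characters become copyable.
    copy_dist = torch.zeros_like(gen_dist).scatter_add(-1, src_ids, attn_weights)
    return p_gen * gen_dist + (1.0 - p_gen) * copy_dist
```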

First, download the Task 0 data and build the dataset:

```bash
git clone https://github.com/sigmorphon2020/task0-data.git
python src/data/task0-build-dataset.py
```
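
Task 0 files are tab-separated, one example per line: lemma, inflected form, and a UniMorph tag bundle. A quick way to inspect a training file (the path is illustrative; the cloned repo organizes files by language family):

```python
# Each line of a .trn file is "lemma<TAB>inflected form<TAB>tags",
# e.g. a past-tense verb tagged V;PST. The path below is illustrative.
with open("task0-data/DEVELOPMENT-LANGUAGES/germanic/eng.trn", encoding="utf-8") as f:
    for line in [next(f) for _ in range(3)]:
        lemma, form, tags = line.rstrip("\n").split("\t")
        print(lemma, form, tags, sep="  |  ")
```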

Apply multitask training augmentation for all languages, and the data hallucination augmentation of Anastasopoulos and Neubig (2019) for all low-resource languages:

```bash
python src/data/multitask-augment.py
bash src/data/hallucinate.sh
```
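
The hallucination augmentation of Anastasopoulos and Neubig (2019) roughly works by treating the longest substring shared by a lemma and its inflected form as the stem, and replacing it in both strings with a random string over the language's alphabet, yielding synthetic pairs that keep the real affixes. A rough, self-contained sketch of the idea; this is not the actual `hallucinate.sh` implementation:

```python
import random

def hallucinate(lemma, form, alphabet, min_stem=3):
    """Make one synthetic (lemma, form) pair by swapping out the shared stem."""
    # Longest common substring of lemma and form (naive search).
    stem = ""
    for i in range(len(lemma)):
        for j in range(i + len(stem) + 1, len(lemma) + 1):
            if lemma[i:j] in form:
                stem = lemma[i:j]
    if len(stem) < min_stem:
        return None  # too little shared material to swap safely
    fake = "".join(random.choice(alphabet) for _ in stem)
    return lemma.replace(stem, fake, 1), form.replace(stem, fake, 1)

# e.g. ('qzvx', 'qzvxed') for ('walk', 'walked')
print(hallucinate("walk", "walked", "abcdefghijklmnopqrstuvwxyz"))
```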

Sample down the training sets of the low-resource languages, for use in the low-resource experiments:

```bash
python src/data/downsample.py
```
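
Downsampling here just draws a fixed-size random subset of a training file to simulate a low-resource setting; `src/data/downsample.py` is the actual script. A minimal sketch of the idea, with illustrative file names and sample size:

```python
import random

random.seed(0)  # fix the seed so the sampled subset is reproducible
with open("eng.trn", encoding="utf-8") as f:           # illustrative input
    lines = f.readlines()
subset = random.sample(lines, k=min(100, len(lines)))  # e.g. keep 100 examples
with open("eng.low.trn", "w", encoding="utf-8") as f:  # illustrative output
    f.writelines(subset)
```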

Run the pointer-generator transformer on the original dataset and on the multitask-augmented training set (for Task 0):

```bash
bash task0-launch-pg-trn.sh
bash task0-launch-pg-aug.sh
```

Run the transformer (Vaswani et al., 2017) on the original dataset and on the multitask-augmented training set (for Task 0):

```bash
bash task0-launch-trm-trn.sh
bash task0-launch-trm-aug.sh
```

Pretrain the pointer-generator transformer on the hallucinated training set (for Task 0):

```bash
bash task0-launch-pg-pretrain_hall.sh
```

Pretrain the transformer (Vaswani et al., 2017) on the hallucinated training set (for Task 0):

```bash
bash task0-launch-trm-pretrain_hall.sh
```

The code is built on top of the baseline code for Task 0 of the SIGMORPHON 2020 Shared Tasks (Vylomova et al., 2020). The data hallucination augmentation follows Anastasopoulos and Neubig (2019). You can also run a hard monotonic attention model (Wu and Cotterell, 2019).

## Dependencies

- python 3
- pytorch==1.4
- numpy
- tqdm
- fire

## Install

```bash
make
```
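
If `make` is not convenient, the listed dependencies can presumably be installed directly, e.g. with `pip install torch==1.4.0 numpy tqdm fire`; this is an assumption, and the Makefile may perform additional setup.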