Systematicity

This folder contains the data for the systematicity tests, which are described in section 4.1 of the paper.

This folder also contains a script for evaluation (evaluate.py), a notebook to visualise results (visualise.ipynb), and a pickled file with the systematicity results reported in the paper (results.pickle).

Systematicity setups

The systematicity test has two different setups:

S --> S conj S

The S --> S conj S setup considers the systematic recombination of two sentences into a new sentence. This setup has two conditions:

Systematic recombinations of sentences that are minimally different. In this condition, the translation of S2 is compared between the recombination of S1 with S2 and the recombination of S1' with S2, where S1 and S1' are synthetic sentences that differ in only one noun. The translation of S2 is consistent if it is the same in both cases.
Systematic recombinations of sentences that are very different. In this condition, the translation of S2 is compared between the recombination of S1 with S2 and the recombination of S3 with S2, where S1 and S3 are different synthetic sentences.
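
As an illustration of the comparison described in the two conditions above, the following is a minimal sketch of a consistency score. It is not the logic of evaluate.py. It assumes that the two translation lists are line-aligned, that the translation of S2 can be isolated by splitting on the first occurrence of a target-language conjunction token, and that consistency means an exact string match. The function names and the `conj` parameter are hypothetical.

```python
from typing import Iterable


def second_conjunct(translation: str, conj: str) -> str:
    """Return the tokens after the first occurrence of the conjunction token.

    Assumption: the conjunction appears as a separate whitespace-delimited
    token. If it is absent, the full translation is returned unchanged.
    """
    tokens = translation.split()
    if conj not in tokens:
        return translation
    return " ".join(tokens[tokens.index(conj) + 1:])


def consistency(translations_a: Iterable[str],
                translations_b: Iterable[str],
                conj: str) -> float:
    """Fraction of line-aligned pairs whose second conjuncts match exactly."""
    pairs = list(zip(translations_a, translations_b))
    if not pairs:
        return 0.0
    matches = sum(
        second_conjunct(a, conj) == second_conjunct(b, conj)
        for a, b in pairs
    )
    return matches / len(pairs)


# Toy example with made-up translations; "en" is the assumed conjunction token.
with_s1 = ["de man loopt en de hond blaft", "de vrouw leest en het regent"]
with_s1_prime = ["de jongen loopt en de hond blaft", "de jongen leest en het regent hard"]
score = consistency(with_s1, with_s1_prime, conj="en")
```

In the toy example, the second conjunct matches in the first pair but not in the second, so `score` is 0.5. Exact matching is a strict criterion; a token-level comparison could be substituted if a softer measure is wanted.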

For this setup, three data sources are available, which can be found in the respective subfolders. Each file contains a concatenation of a synthetic sentence template and a sentence from the indicated data source:

synthetic contains three files per template (1 - 10), one with the original sentence (S1), one with the minimally different first sentence (S1') and one with a different first sentence (S3). There are no target translations.
semi_natural follows the same pattern, but for semi-natural data.
natural follows the same pattern, but for natural data.

S --> NP VP

The S --> NP VP setup considers the systematic recombination of noun phrases and verb phrases. Because this test requires control over the sentence structure and properties to ensure that the recombination is correct, it cannot be conducted with natural data. Two data sources are available, which can be found in the respective subfolders:

synthetic contains three files per template (1 - 10), one with the original sentence, one in which a noun in the NP is adapted and one in which a noun in the VP is adapted.
semi_natural follows the same pattern, but for semi-natural data and with adaptations in the NP only.

Usage

To run the test for a specific setup and data source (synthetic, semi_natural or natural), use your model to translate all files in the respective folder. You can then use the evaluation script (evaluate.py) to compare the translations and compute consistency scores, and the visualisation notebook (visualise.ipynb) to visualise your results. Run as is, the notebook visualises the systematicity results from the paper.
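
The translation step could be scripted roughly as follows. This is a sketch under stated assumptions, not part of this repository: it assumes each data file holds one source sentence per line, and `translate_folder`, its `translate` argument (any callable that maps a list of sentences to a list of translations) and the `.out` suffix are hypothetical. The output naming and format that evaluate.py expects are not specified here, so adapt them as needed.

```python
from pathlib import Path
from typing import Callable, List


def translate_folder(src_dir: str,
                     out_dir: str,
                     translate: Callable[[List[str]], List[str]],
                     suffix: str = ".out") -> List[Path]:
    """Translate every file in src_dir and write line-aligned outputs to out_dir."""
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue  # skip nested folders
        sentences = path.read_text(encoding="utf-8").splitlines()
        translations = translate(sentences)
        if len(translations) != len(sentences):
            raise ValueError(f"{path.name}: output is not line-aligned with input")
        out_path = out / (path.name + suffix)
        out_path.write_text("\n".join(translations) + "\n", encoding="utf-8")
        written.append(out_path)
    return written
```

Keeping the outputs line-aligned with the inputs makes it straightforward to pair up translations of the original and adapted sentences afterwards.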
