
StatFSMPaperSite

cteichmann edited this page Jul 11, 2016 · 15 revisions

This page explains how we obtained the data presented in our StatFSM paper and how to replicate our experiments.

Random Automata

The random automata used in our evaluations can be found here. Each folder in the archive is named 'x_y', where 'x' is the value of the 'l' parameter from the paper and '0.y' is the value of the gamma parameter. Every folder contains five automata, written in the standard Alto format for tree automata. We computed the evaluation statistics on each of them, and the general trends in the data were the same for all of them. The shell script and configuration files for generating the automata can be found here. The configuration files have the following fields:

folder - where to put the random automata once they have been generated

fileNamePrefix - the prefix for the output file names (each final file name consists of this prefix, the number of the generated automaton, and the file ending .auto; existing files are always overwritten)

size - the l parameter from the paper

toGenerate - how many automata the program should generate

seed - the random number seed to use. Results should be the same whenever the program is run, as long as the seed is not changed, but the behavior of the random number generator might vary depending on the platform.

alpha - the gamma parameter from the paper
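The naming rule for the generated files (folder, then fileNamePrefix, then the automaton's number, then .auto) can be illustrated with a minimal Python sketch. This is not part of the Alto code; the function name, the example folder and prefix, and the `first_index` parameter are hypothetical. In particular, this page does not say whether numbering starts at 0 or 1, so the starting index is left configurable.

```python
from pathlib import PurePosixPath


def expected_file_names(folder, file_name_prefix, to_generate, first_index=0):
    """List the output paths implied by the folder, fileNamePrefix and toGenerate fields.

    first_index is an assumption: the starting number is not documented here.
    """
    return [
        # prefix + number of the generated automaton + ".auto", inside the target folder
        str(PurePosixPath(folder) / f"{file_name_prefix}{number}.auto")
        for number in range(first_index, first_index + to_generate)
    ]


if __name__ == "__main__":
    # Hypothetical values; substitute the ones from your own configuration file.
    for path in expected_file_names("random_automata/10_5", "automaton_", 5):
        print(path)
```

Because existing files are overwritten, listing the expected paths before a run is a simple way to check that nothing you want to keep would be replaced.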

The original automata were built with the jar with dependencies from this version of the Alto code. Put the jar in the same folder as the script and the config files, then execute the script. The Java main class for generating random automata is de.up.ling.irtg.script.CreateRandomAutomata.
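When working with the archive of random automata, the 'x_y' folder names described above can be decoded back into parameter values. The following is a minimal sketch, assuming that 'l' is an integer and that 'y' holds exactly the digits after the decimal point of gamma; the function name and the example folder name are hypothetical.

```python
def parse_folder_name(name):
    """Decode a folder name of the form 'x_y' into (l, gamma).

    Assumes 'x' is an integer value for l and '0.y' is the gamma value.
    """
    x, y = name.split("_")
    return int(x), float("0." + y)


if __name__ == "__main__":
    # Hypothetical folder name, not necessarily one that exists in the archive.
    l_value, gamma = parse_folder_name("10_5")
    print(l_value, gamma)
```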

Convergence Experiments
