Skip to content

TreeAutomata

alexanderkoller edited this page May 23, 2019 · 4 revisions

Working with Tree Automata and RTGs

Alto supports working with weighted finite tree automata. In particular, these can be read from and written to files, and their languages explored from the GUI.

In addition, Alto implements a variety of algorithms for tree automata, including intersection and inverse homomorphism. These are available through the Java API.

File format

Alto can directly read top-down tree automata from files. An example of the file format looks as follows:

S! -> f(A,A) [1.0]
A -> g(A)    [0.5]
A -> a       [0.5]

This defines a top-down tree automaton with states S and A, over a signature that contains the symbol f (of arity 2), the symbol g (of arity 1), and the symbol a (of arity 0). The state S is marked with an exclamation point (!), and therefore is an initial state. Each rule is equipped with a weight. Thus, for instance, the tree f(g(a),a) is in the language of the automaton, and has weight 1/8.

Every rule of the automaton must contain precisely one terminal symbol, such as f or a; epsilon transitions are not allowed. Furthermore, terminal symbols must be used consistently with respect to their arities. For instance, there could not be a rule A -> f(A) in the above automaton, because f must have arity either one or two, but not both.

States or terminal symbols which contain special characters can be enclosed in single or double quotes. Weights may be left away; in this case the weight of each rule defaults to 1.

Everything from the symbol // until the end of the line, and everything that is enclosed between /* ... */ is treated as a comment.

Save your tree automata in files with the filename extension .auto. This will allow Alto to recognize these files as descriptions of tree automata.

Bottom-up automata

Bottom-up tree automata can be easily encoded in the above format: Just read each rule from right to left. Thus, the example automaton above can be read as a bottom-up automaton with rules f(A,A) -> S etc. One nontrivial difference between bottom-up and top-down automata is that bottom-up automata can have multiple final states, whereas a top-down automaton can (strictly speaking) have only one initial state. The implementation of tree automata in Alto supports multiple initial/final states; simply put exclamation points on more than one state.

Regular tree grammars (RTGs)

Regular tree grammars (RTGs) can be brought into a normal form in which each rule has exactly one terminal symbol. Then the rules of the RTG can be written exactly as above; S and A are read as nonterminal symbols and not as states. The exclamation point (S!) denotes the start symbol.

Working with tree automata in Alto

The Alto GUI has a menu item File -> Load Tree Automaton. Choose your automaton file (with extension .auto) and click "Open". This will open a window showing your tree automaton, which looks as follows:

Screen Shot 2017-05-29 at 21.22.33.png

The tree automaton window has a menu item Tools -> Show Language. This will open a language viewer, in which you can scroll through the different trees in the language of the automaton:

Screen Shot 2017-05-29 at 21.24.32.png

Notice that the status line shows both the number of trees in the language of your automaton, and the weight of the tree that is currently being displayed.

Note: If the menu item Tools -> Show Language is grayed out, double-check two things: (1) Is the tree automaton your active window? Some windows do not allow the "Show Language" function. (2) Does your tree automaton have an initial state (marked with exclamation point)? If not, then the language of your automaton is (trivially) empty, and there is no language to show.