A quick tutorial on semantic parsing with graphs

alexanderkoller edited this page May 23, 2021 · 1 revision

Now let's try to parse with a more complicated grammar: an IRTG with a string interpretation and a graph interpretation, also known as a "synchronous hyperedge replacement grammar".

First, download Alto and take some first steps.

Loading a grammar

Load an example grammar that can map between strings and graphs. For instance, you can use the grammar hrg-iwcs15.irtg from the Alto examples directory. Go to File -> Load IRTG and select the grammar file. This should open a window that looks as follows:

Screen Shot 2015-07-26 at 14.03.31.png

The first four columns specify a weighted regular tree grammar (RTG) describing the grammatically correct derivation trees. The fifth and sixth columns specify the homomorphisms into the string algebra and the graph algebra, respectively.

This is exactly the example grammar from Koller, IWCS 2015; see that paper for a more detailed explanation of the grammar itself.

Parsing strings

You can now parse a sentence and have it translated into a graph by selecting Tools -> Parse and typing a sentence in the "string" field. Try, for instance, the string "the boy wants to go". This will open a parse chart window, which looks like this:

Screen Shot 2015-07-26 at 14.04.07.png

The parse chart is another weighted RTG which describes only those derivation trees that are grammatically correct and also consistent with your input. You can look at these derivation trees by selecting Tools -> Show Language. In the language window which opens up, select View -> Add View twice so you can look at the string interpretation and the graph interpretation along with the derivation tree. The result should look like this:

Screen Shot 2015-07-26 at 14.04.27.png

This tells you that your sentence has a single derivation with this grammar. The window shows the derivation tree, the terms over the string algebra and the graph algebra to which the homomorphisms map the derivation tree, and the values of these terms in the two algebras. Note in particular that the control behavior of "wants" is represented correctly, in that the "boy" node is the ARG0 of two different predicates.
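To make the step from derivation tree to term to value more concrete, here is a minimal Python sketch of how a string homomorphism can be applied to a derivation tree. It is not Alto code, and the rule labels (r_want, r_boy, r_go) and their images are hypothetical, not copied from hrg-iwcs15.irtg. Terms use "*" for concatenation and "?1", "?2" for the values of the first and second child.

```python
# Derivation tree as nested tuples: (rule_label, child, child, ...).
# The rule labels are hypothetical and only serve as an illustration.
derivation = ("r_want", ("r_boy",), ("r_go",))

# Hypothetical string homomorphism. A term is either
#   ("*", left, right)  concatenation of two subterms,
#   "?i"                the value of the i-th child, or
#   any other string    a single-word constant.
string_hom = {
    "r_want": ("*", "?1", ("*", "wants", "?2")),
    "r_boy": ("*", "the", "boy"),
    "r_go": ("*", "to", "go"),
}


def evaluate(term, child_values):
    """Evaluate a term in the string algebra; values are lists of words."""
    if isinstance(term, tuple):
        _, left, right = term
        return evaluate(left, child_values) + evaluate(right, child_values)
    if term.startswith("?"):
        return child_values[int(term[1:]) - 1]
    return [term]


def interpret(tree, hom):
    """Map a derivation tree to its value, bottom-up."""
    label, *children = tree
    child_values = [interpret(child, hom) for child in children]
    return evaluate(hom[label], child_values)


sentence = " ".join(interpret(derivation, string_hom))
print(sentence)
```

The graph interpretation works the same way in principle: a second homomorphism maps the same derivation tree to a term over the graph algebra, which is then evaluated to a graph.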

Parsing graphs

Instead of parsing a string, you can also parse a graph. This will produce a parse chart representing all derivations of the grammar that are compatible with the graph. Just as before, you can then enumerate the different derivations and read off strings from them.

To try this, go back to the original window that shows hrg-iwcs15.irtg and select Tools -> Parse again. You parse a graph by typing its string representation into the "graph" field. With the example grammar, you can parse the following graph:

 (w / want  :ARG0 (b / boy)  :ARG1 (g / go :ARG0 b))

As before, this will open a parse chart window. This time, it looks like this:

Screen Shot 2015-07-26 at 14.04.53.png

Observe that this chart contains nonterminals like NP,[b<root> {b_b}], while the earlier chart for string parsing had nonterminals like NP,0-2. This is because we now need to represent parts of graphs in parsing, instead of parts of strings (= substrings).
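As a small illustration of what a string-chart nonterminal like NP,0-2 refers to, the following Python sketch reads a span off the example sentence. It assumes that span positions count the gaps between words, so that the end position is exclusive; the helper name span_text is made up for this example.

```python
tokens = "the boy wants to go".split()


def span_text(tokens, start, end):
    """Return the words covered by the span start-end (end exclusive)."""
    return " ".join(tokens[start:end])


# The span part of a nonterminal like NP,0-2 under the assumption above.
np_span = span_text(tokens, 0, 2)
print(np_span)

# All non-empty spans that a string chart could in principle refer to.
n = len(tokens)
all_spans = [(i, j) for i in range(n) for j in range(i + 1, n + 1)]
print(len(all_spans))
```

Graph parsing has no such linear positions, which is why its nonterminals carry a description of a subgraph and its source nodes instead.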

Click on Tools -> Show Language to walk through the different derivations for this input graph, and select the string interpretation view to see how these map to strings.

Where to go from here?

You are now ready to write your own IRTGs. Have a look at hrg-iwcs15.irtg (it's a text file) to familiarize yourself with the syntax. Notice that each RTG rule, say A -> f(B,C), comes with a single terminal symbol f, and that no two rules use the same terminal symbol. The lines just below the rule, which start with [string] and [graph], specify the value of f under the two homomorphisms. The symbol * in the string interpretation means string concatenation. The symbols r_xxx, f_xxx, and merge evaluate to the rename, forget, and merge operations of the s-graph algebra, respectively. Constants in the graph algebra must be enclosed in quotes, and their syntax roughly follows the syntax for AMR graphs. Annotations in angle brackets, such as <root>, after a node u indicate that u is a source node (in this case, a root-source).
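To get a feeling for the rename, forget, and merge operations mentioned above, here is a toy Python model of s-graphs. It is a sketch under simplifying assumptions, not Alto's implementation: node ids are assumed to be distinct across the operands of a merge, and the source names subj and vcomp as well as the three graph fragments are hypothetical choices for this example rather than the constants of hrg-iwcs15.irtg.

```python
from dataclasses import dataclass


@dataclass
class SGraph:
    labels: dict   # node id -> node label (None if unlabeled)
    edges: set     # (from_node, edge_label, to_node)
    sources: dict  # source name -> node id


def rename(g, old, new):
    """Move the source name `old` to `new`; nodes and edges are unchanged."""
    assert old in g.sources and new not in g.sources
    sources = {(new if s == old else s): n for s, n in g.sources.items()}
    return SGraph(dict(g.labels), set(g.edges), sources)


def forget(g, name):
    """Drop the source name `name`; the node itself stays in the graph."""
    sources = {s: n for s, n in g.sources.items() if s != name}
    return SGraph(dict(g.labels), set(g.edges), sources)


def merge(g1, g2):
    """Union of g1 and g2, fusing nodes that carry the same source name."""
    fuse = {n: g1.sources[s] for s, n in g2.sources.items() if s in g1.sources}

    def image(node):
        return fuse.get(node, node)

    labels = dict(g1.labels)
    for node, label in g2.labels.items():
        target = image(node)
        labels[target] = labels.get(target) or label
    edges = set(g1.edges) | {(image(a), l, image(b)) for a, l, b in g2.edges}
    sources = {s: image(n) for s, n in g2.sources.items()}
    sources.update(g1.sources)
    return SGraph(labels, edges, sources)


# Hypothetical fragments with unlabeled placeholder nodes x, y, z.
want = SGraph({"w": "want", "x": None, "y": None},
              {("w", "ARG0", "x"), ("w", "ARG1", "y")},
              {"root": "w", "subj": "x", "vcomp": "y"})
boy = SGraph({"b": "boy"}, set(), {"root": "b"})
go = SGraph({"g": "go", "z": None}, {("g", "ARG0", "z")},
            {"root": "g", "subj": "z"})

combined = merge(merge(want, rename(boy, "root", "subj")),
                 rename(go, "root", "vcomp"))
amr = forget(forget(combined, "subj"), "vcomp")

boy_node = next(n for n, lab in amr.labels.items() if lab == "boy")
arg0_heads = sorted(amr.labels[a] for a, l, b in amr.edges
                    if l == "ARG0" and b == boy_node)
print(arg0_heads)
```

In this toy model the subj source is what lets the "boy" node be shared: both the want fragment and the go fragment attach an ARG0 edge to their subj source, and merge fuses the two into one node, which gives the reentrancy seen in the example graph.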