Incorrect derivations on "larger" grammars #62
Comments
Does the grammar have two rules with the same terminal symbol?
No. And as I said, I can comment out two of the new rules and everything works or I keep all four of the new rules and comment out two of the old rules (which worked fine before) and everything works -- except derivations with the two commented-out rules of course.
And we should really have a check for the same terminal symbol case while reading an irtg -- this is a bad trap people can fall into.
Is this problem still there? Could you post the grammar so I can reproduce the behavior?
Yes, steps to reproduce:
It seems that merely adding two rules that are not even part of the resulting tree prevents the correct (shortest) tree from being found. Edit, in case this cannot be reproduced later on: the minecraft-nlg commit I tested this with is 4da39a42cdfbf84f1c835e074fb0d99facbf67ec
I checked the derivation: the original tree is still derivable with the extended irtg, it still has the correct interpretations, and it has the better score. So it seems like something weird is going on in the Viterbi decoding.
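For reference, the sanity check described above can be sketched independently of Alto. The snippet below is only an illustration under stated assumptions: the toy rules, the weights, and the function names `all_derivations` and `viterbi` are hypothetical and are not Alto's API. It compares a max-product Viterbi search against brute-force enumeration on a small acyclic weighted tree automaton, and checks that adding rules unreachable from the start state leaves the best derivation unchanged.

```python
from itertools import product

# Hypothetical toy automaton: (parent_state, label, child_states, weight).
RULES = [
    ("S", "f", ("A", "B"), 0.5),
    ("S", "g", ("A",), 0.4),
    ("A", "a", (), 1.0),
    ("A", "a2", (), 0.3),
    ("B", "b", (), 0.9),
]

# Rules that no derivation from "S" can use (assumed stand-ins for the "new" rules).
EXTRA_RULES = [
    ("X", "x", (), 0.99),
    ("Y", "y", ("X",), 0.99),
]


def all_derivations(rules, state):
    """Yield every (weight, tree) derivable from `state`; acyclic automata only."""
    for parent, label, children, weight in rules:
        if parent != state:
            continue
        child_options = [list(all_derivations(rules, c)) for c in children]
        for combo in product(*child_options):
            w = weight
            for child_weight, _ in combo:
                w *= child_weight
            yield w, (label, tuple(tree for _, tree in combo))


def viterbi(rules, state):
    """Return the best (weight, tree) for `state`, or None if nothing is derivable."""
    cache = {}

    def best(s):
        if s not in cache:
            candidates = []
            for parent, label, children, weight in rules:
                if parent != s:
                    continue
                subs = [best(c) for c in children]
                if any(sub is None for sub in subs):
                    continue  # some child state has no derivation
                w = weight
                for child_weight, _ in subs:
                    w *= child_weight
                candidates.append((w, (label, tuple(tree for _, tree in subs))))
            cache[s] = max(candidates, key=lambda c: c[0]) if candidates else None
        return cache[s]

    return best(state)


if __name__ == "__main__":
    reference = max(all_derivations(RULES, "S"), key=lambda d: d[0])
    print(viterbi(RULES, "S") == reference)
    print(viterbi(RULES + EXTRA_RULES, "S") == reference)
```

If both comparisons hold for the toy case but the analogous check fails on the real grammar, that points at the decoder or the rule storage rather than at the grammar itself.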
This is the relevant snippet from my code:
i.e. viterbi does not find the correct tree. I spoke with @jgroschwitz and he asked me whether a concrete ta works. And indeed, adding |
I think there's something funky with the way rules are stored in lazy automata. On the one hand, the RuleStore class doesn't always work as I expect it to (unfortunately, I have more of a general feeling than a concrete example). On the other hand, whenever one implements a new TreeAutomaton class, I think one has to make sure that getRulesBottomUp and getRulesTopDown store the rules properly; but I am not aware of this being documented, or of what the best practice should be.
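To make the concern about rule storage concrete, here is a minimal sketch of the invariant I would expect, written as a toy model rather than against Alto's code. Everything in it is an assumption for illustration: `IndexedRuleStore`, `LazyAutomaton`, and the snake_case query methods are hypothetical and do not mirror the real RuleStore or TreeAutomaton classes. The idea is that every rule discovered lazily by a bottom-up query is written to one store that is indexed in both directions, so a later top-down query sees the same rule.

```python
from collections import defaultdict


class IndexedRuleStore:
    """Hypothetical store that indexes each rule both bottom-up and top-down."""

    def __init__(self):
        self._bottom_up = defaultdict(set)  # (label, children) -> rules
        self._top_down = defaultdict(set)   # (parent, label) -> rules

    def add(self, rule):
        parent, label, children = rule
        # Writing both indices in one place keeps the two views consistent.
        self._bottom_up[(label, children)].add(rule)
        self._top_down[(parent, label)].add(rule)

    def bottom_up(self, label, children):
        return set(self._bottom_up.get((label, children), ()))

    def top_down(self, parent, label):
        return set(self._top_down.get((parent, label), ()))


class LazyAutomaton:
    """Toy lazy automaton: rules are computed on the first bottom-up query."""

    def __init__(self, compute_bottom_up):
        self._compute = compute_bottom_up  # (label, children) -> iterable of rules
        self._store = IndexedRuleStore()
        self._expanded = set()

    def get_rules_bottom_up(self, label, children):
        key = (label, tuple(children))
        if key not in self._expanded:
            for rule in self._compute(*key):
                self._store.add(rule)
            self._expanded.add(key)
        return self._store.bottom_up(*key)

    def get_rules_top_down(self, parent, label):
        # Only sees rules that were already discovered bottom-up.
        return self._store.top_down(parent, label)


if __name__ == "__main__":
    def compute(label, children):
        # Hypothetical rule source: a single binary rule labelled "f".
        return [("S", label, children)] if label == "f" else []

    automaton = LazyAutomaton(compute)
    print(automaton.get_rules_bottom_up("f", ("A", "B")))
    print(automaton.get_rules_top_down("S", "f"))
```

In this model a top-down query issued before the matching bottom-up query returns nothing. That is exactly the kind of ordering dependency that would need to be documented for anyone implementing a new automaton class.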
Hi, I can't reproduce this; the test succeeds for me even after removing the comment lines, with both the current Git revision and with 4da3. Could you write a self-contained unit test for Alto that exhibits the problem? (The grammar can live in its own file, we'll put it in the resources directory.)
@alexanderkoller test is now in a new branch:
Great, thanks. Now I can reproduce it.
Some preliminary observations:
I have a curious case where I grew my grammar from 46 rules to 50, and derivations needing only the initial 46 rules suddenly changed (i.e., the derivation produced by Viterbi is correct but not optimal).
The derivations I am looking at (both the correct one and the incorrect one) do not use any of the four new rules.
Commenting out any(!) two unrelated rules (two of the new ones or two of the old ones) fixes the derivation. Therefore, this seems to be an Alto bug to me.
This is just an initial issue description; I will update it with more info once I have had some time to look into this myself.