code inspection: parser kernel doesn't hand over correctly from error recovery phase #25

Closed
GerHobbelt opened this issue Oct 29, 2017 · 1 comment

Comments

@GerHobbelt
Owner

Following up on #21: while inspecting my own code (copy-edit, git compare, diff via Beyond Compare), I noticed that the parser kernel has a few very subtle bugs in the error recovery parse loop. The parse loop is duplicated in the error recovery section, so that I can optimize the main parse loop for regular operation without worrying about the special handling that error recovery requires. That duplicated loop does not hand over to (drop out into) the outer parse loop correctly:

  • when parseError() produces a 'parser return value', that one is DESTROYED in the ACCEPT phase of the outer loop.

  • the outer loop can be further optimized when it doesn't have to worry about a still-active 'recovery phase', i.e. recovering === 0 should be a precondition in the outer parse loop.

  • when (edge case) the lexer is also in the habit of producing TERROR tokens (some of my grammars do this), we will lose their yyval and yylloc! Hence we must differentiate between a TERROR set up as a replacement token in the parser kernel's error recovery section and a TERROR token produced by the lexer: the latter is an error token too, but should only indirectly trigger error recovery by the parser.

    Hint To Self: this means that an error term in a grammar production has an associated value which is either a parser error recovery info object or a lexer-produced yyvalue, depending on whether the lexer TERROR-or-other token triggered parser error recovery or not! ... Talk about complex internals... 🤡
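
The distinction in the last bullet and the hint could be sketched roughly as follows. This is only an illustration under stated assumptions: `makeErrorTermValue`, the `fromLexer` flag, the token object shape, and the numeric value of `TERROR` are all hypothetical and not taken from the parser kernel.

```javascript
// Hypothetical sketch: none of these names or shapes come from the actual kernel.
const TERROR = 2; // assumed numeric id of the error token

// Build the value that an `error` term in a grammar production would carry.
function makeErrorTermValue(token, recoveryInfo) {
  if (token.id !== TERROR) {
    throw new Error('not an error token');
  }
  if (token.fromLexer) {
    // Lexer-produced TERROR: keep the lexer's own yyval and yylloc.
    return { origin: 'lexer', value: token.yyval, loc: token.yylloc };
  }
  // Replacement TERROR set up by the parser's error recovery section:
  // the value is the parser's error recovery info object.
  return { origin: 'parser', value: recoveryInfo, loc: recoveryInfo.loc };
}
```

Tagging the origin explicitly is one possible design; the point is only that the two kinds of TERROR must not overwrite each other's value and location.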

GerHobbelt added a commit that referenced this issue Oct 30, 2017
…n cross-compared and adjusted to suit our original intent as described in the issue #25: see the comments (edited) for the important parts of this work.
GerHobbelt added a commit that referenced this issue Oct 30, 2017
…erved that `retval` isn't always treated properly when its *potential value* is produced by `parseError()`: only when `parseError()` produces a sensible value (i.e. *not* `undefined`!) should that value be produced by the parser. Otherwise a parse *error* should produce the value `false` to signal parse/match failure on the given input.
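
The rule stated in this commit message can be written as a small helper. This is a minimal sketch: `resolveErrorRetval` is a hypothetical name, and the kernel presumably applies the rule inline rather than through such a function.

```javascript
// Hypothetical helper: decides what the parser returns after a parse error.
function resolveErrorRetval(parseErrorResult) {
  // Only a defined value from parseError() becomes the parser's result.
  if (typeof parseErrorResult !== 'undefined') {
    return parseErrorResult;
  }
  // Otherwise signal parse/match failure on the given input.
  return false;
}
```

Note that under this reading any value other than `undefined` (including `null` or `0`) counts as a value produced by `parseError()`.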
GerHobbelt added a commit that referenced this issue Oct 30, 2017
… the preceding commits: `action === 0` is the error parse state and that one, when it is discovered during error **recovery** in the inner slow parse loop, is handed back to the outer loop to prevent undue code duplication. Handing back means the outer loop will have to process that state, not exit on it immediately!
GerHobbelt added a commit that referenced this issue Oct 30, 2017
…reset/cleanup the `recoveringErrorInfo` object as one may invoke `yyerrok` while still inside the error recovery phase of the parser, thus *potentially* causing trouble down the line for subsequent parse states. (This is another edge case that's hard to produce: better-safe-than-sorry coding style applies.)
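
A rough sketch of that cleanup, assuming the truncated commit message means that `yyerrok` should also reset `recoveringErrorInfo`. The state object and `createRecoveryState` are hypothetical; only the names `recovering`, `recoveringErrorInfo`, and `yyerrok` appear in this issue.

```javascript
// Hypothetical state object; the field names mirror the issue text, the shape is assumed.
function createRecoveryState() {
  return { recovering: 0, recoveringErrorInfo: null };
}

// Leave the error recovery phase and drop any stale recovery info,
// so later parse states cannot pick up leftovers from an earlier error.
function yyerrok(state) {
  state.recovering = 0;
  state.recoveringErrorInfo = null;
}
```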
@GerHobbelt
Owner Author

Finished work on this issue.
