Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Confusing error message in case of a syntax error #19

Open
ethindp opened this issue Aug 23, 2023 · 4 comments
Open

Confusing error message in case of a syntax error #19

ethindp opened this issue Aug 23, 2023 · 4 comments

Comments

@ethindp
Copy link

ethindp commented Aug 23, 2023

I'm very confused. The peginator-cli binary is giving me this error:

Error: expected end of input
--> ada.peg:6801:1
 |  
 |  
 |  ^

But my grammar certainly isn't done on this line (the number of lines are primarily because I generated the unicode character classes that I needed to match.... Probably not the "best" way I could do it, but it (should) work.) Anyway, it's getting stuck on this:

@string
@check(crate::checks::check_identifier)
Identifier =
IdentifierStart {IdentifierStart | IdentifierExtend} !(Abort | Abs | Abstract | Accept | Access | Aliased | All | And | Array | At | Begin | Body | Case | Constant | Declare | Delay | Delta | Digits | Do | Else | Elsif | End | Entry | Exception | Exit | For | Function | Generic | Goto | If | In | Interface | Is | Limited | Loop | Mod | New | Not | Null | Of | Or | Others | Out | Overriding | Package | Parallel | Pragma | Private | Procedure | Protected | Raise | Range | Record | Rem | Renames | Requeue | Return | Reverse | Select | Separate | Some | Subtype | Synchronized | Tagged | Task | Terminate | Then | Type | Until | Use | When | While | With | Xor);

@char
IdentifierStart =UppercaseLetter
| LowercaseLetter
| TitlecaseLetter
| ModifierLetter
| OtherLetter
| LetterNumber;

@char
IdentifierExtend =NonspacingMark
| SpacingMark
| DecimalNumber
| ConnectorPunctuation;

NumericLiteral = @:DecimalLiteral | @:BasedLiteral;
# Dies where this comment is
@skip_no_ws
@check(crate::checks::check_decimal_literal)
DecimalLiteral = integer_part:Numeral ["." fractional_part:Numeral] [exponent:Exponent];

@string
@skip_no_ws
Numeral = Digit {["_"] Digit};

@string
@skip_no_ws
Exponent = i'E' [Add] Numeral | i'E' Subtract Numeral;

@char
Digit = '0' .. '9';

@check(crate::checks::check_based_literal)
BasedLiteral =
base:Base "#" integer_part:BasedNumeral ["." fractional_part:BasedNumeral] "#" [exponent:Exponent];

@string
@skip_no_ws
Base = Numeral;

@string
@skip_no_ws
BasedNumeral =
ExtendedDigit {["_"] ExtendedDigit};

@char
ExtendedDigit = Digit | i'A' .. i'F';

At a glance I can't see any kind of syntax error, so I'm quite confused. Does order of the rules matter? (I hope not...)

@badicsalex
Copy link
Owner

The order of the rules don't matter. The error is very misleading though, the problem is that the directive is called @no_skip_ws, and not @skip_no_ws

@ethindp
Copy link
Author

ethindp commented Aug 23, 2023

@badicsalex Post-processing on directives and rejecting unrecognized directives and saying "Unknown directive x" would be better.

@ethindp
Copy link
Author

ethindp commented Aug 23, 2023

Also, the error message when misusing @ overrides doesn't give you any line information. E.g.: I get Error: Enum '@:' fields have to be used exactly once in all choice branches, and must not be used in closures or optional parts, but this tells me nothing about where the error is, just that it exists. Would it be possible to include error information with these?

@badicsalex badicsalex changed the title CLI claiming "end of input expected" but that's an empty line? Confusing error message in case of a syntax error Aug 24, 2023
@badicsalex
Copy link
Owner

badicsalex commented Aug 24, 2023

Notes to myself (or anyone who's willing to fix this):

This error is caused by the structure of the peginator grammar, namely the first rule:

Grammar = {(rules:Rule | rules:CharRule | rules:ExternRule) ";"} $ ;

Parsing the rule list stops at the first weird directive, but is successful otherwise. Then comes the EOI check, and that throws an error. Unfortunately the EOI check happens at the same place as the first parsing failure (the position of the "@", since the directive check strings contain the @ already).

If there are multiple furthest errors, all of them should be stored.
But this should be opt-in, as this will mean a lot of allocations, and keep in mind that ParseState is cloned a lot, so this would severely affect parsing speed. Maybe a Cow<Vec<>> (or an optimized small vec crate with cow semantics) would be appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants