Extracting terminals in Transformer? #1469

Open
alexthomasv opened this issue Sep 15, 2024 · 5 comments
@alexthomasv

I am trying to extract terminals in a Transformer method. The rule is set up as follows:

rel_op: "==" | "<" | ">" | "<=" | ">=" | "!=" | "<:" | "≠" | "≤" | "≥"
I have a function:

def rel_op(self, items):
    print("rel_op", items)
    return items

But it always prints an empty list, [].

Whenever I try to make this rule a terminal by capitalizing it to REL_OP and replacing all old occurrences of rel_op with it, I get the following:

lark.exceptions.UnexpectedToken: Unexpected token Token('LESSTHAN', '<') at line 155, column 16.

The rule:
REL_OP: "==" | "<" | ">" | "<=" | ">=" | "!=" | "<:" | "≠" | "≤" | "≥"
Why does changing the rule to a terminal now cause this issue?

Thank you!

@erezsh
Member

erezsh commented Sep 16, 2024

> Why does changing the rule to a terminal now cause this issue?

Hard to say without seeing the whole grammar.

But, you can just use the ! modifier like this:

!rel_op: "==" | "<" | ">" | "<=" | ">=" | "!=" | "<:" | "≠" | "≤" | "≥"

That will keep the terminals in the tree.

@KnightChaser

I also had the same question and found the solution to this issue thread by chance. Thanks!

@erezsh
Member

erezsh commented Sep 24, 2024

If you have a suggestion for how the docs could be improved (for this case or in general), let me know.

@KnightChaser

Doesn't the official documentation describe this case? If it doesn't, I guess you could add a simple standalone example for it.

In my case, I used Lark to develop a custom language, and this comes up occasionally because I sometimes need to know which operators or keywords were used while parsing a certain text block or Lark rule. I think developers using Lark in similar circumstances or for similar purposes would eventually run into this.

// Variable declarations and assignments (example)
let_statement: "let" name "=" expression
assignment: name assign_operator expression
!assign_operator: "=" | "+=" | "-=" | "*=" | "/=" | "%=" | "**=" | "//="
name: CNAME

@EmperorArthur

@KnightChaser What I found worked was adding terminals like:

MINUS_EQUALS: "-="

This would then be passed to a transformer / visitor function of the same name. It also changes parse errors from "Expected one of __ANON_0, ..." to "Expected one of MINUS_EQUALS, ..."

You don't even need to change your "!assign_operator" to use the new terminals, but you could and drop the exclamation point.
