Is it possible to parse parts of the input? #1424
Comments
See #1371. It's not exactly easy, and especially for C-like syntax, which is already tricky to parse with Lark, it might be quite annoying. |
Wow, thanks for a quick response.
This gives the following error:
I'm using lark 1.1.9. |
As it says in the error message, the terminal can't be zero width. Use |
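For readers unfamiliar with the term: a "zero-width" terminal is one whose regular expression can match the empty string. A minimal sketch using only Python's `re` module follows; the helper name `can_match_empty` is hypothetical and is not part of Lark. For reference, the grammar posted later in this thread uses `/.+/` for its catch-all terminal.

```python
import re


def can_match_empty(pattern: str) -> bool:
    """Return True if `pattern` can match the empty string (zero width)."""
    return re.fullmatch(pattern, "") is not None


# `.*` accepts zero characters, so it can match with zero width.
print(can_match_empty(r".*"))  # True
# `.+` needs at least one character, so it cannot.
print(can_match_empty(r".+"))  # False
```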
Hm, I can't figure out how to proceed.
I can't force |
Use Or choose the easier and quicker route and use the |
From what I can see, MegaIng@4975608 doesn't expose a way to use transformers on the matched code. Could this be improved so that I can also use transformers? What steps would be needed? For now I went with the Earley parser. I'm stuck on the code below. How should it behave?
It yields this output:
Shouldn't there be an ambiguity between |
Yeah, that does indeed look like a bug, maybe @erezsh has an idea what is going on. With regard to |
This function should do what you want using the scan method:

    from typing import Callable, Optional

    import lark


    def scan_and_replace(parser: lark.Lark, text: str, replacement: Callable[[lark.ParseTree], str],
                         start: Optional[str] = None) -> str:
        """
        Scans the `text` and replaces all matches of `parser` by the value returned by `replacement`
        given the corresponding tree.
        `start` is for passing in the start rule if required, not the starting position of the scanning.
        """
        last = 0
        res = ""
        # `parser.scan` is the scan method discussed in this thread.
        for (start_pos, end_pos), tree in parser.scan(text, start=start):
            res += text[last:start_pos]  # unmatched text before this match
            res += replacement(tree)
            last = end_pos
        res += text[last:]  # keep the unmatched text after the last match
        return res
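The control flow of such a scan-and-replace pass can be tried out without Lark. The sketch below is only an illustration under stated assumptions: `scan` is a hypothetical stand-in built on `re` (it is not Lark's API), the `SIGNATURE` pattern recognises only a simplified `name(args)` shape, and the sample input is made up. It copies unmatched text through unchanged, including the tail after the last match, and appends `_` to the last identifier of each argument.

```python
import re
from typing import Callable, Iterator, Tuple

# Hypothetical, heavily simplified "name(args)" pattern; not a real C++ grammar.
SIGNATURE = re.compile(r"\b(\w+)\(([^()]*)\)")


def scan(text: str) -> Iterator[Tuple[Tuple[int, int], "re.Match"]]:
    """Stand-in scanner (NOT Lark's API): yields ((start, end), match) pairs."""
    for m in SIGNATURE.finditer(text):
        yield m.span(), m


def scan_and_replace(text: str, replacement: Callable[["re.Match"], str]) -> str:
    last = 0
    parts = []
    for (start_pos, end_pos), match in scan(text):
        parts.append(text[last:start_pos])  # unmatched text before this match
        parts.append(replacement(match))
        last = end_pos
    parts.append(text[last:])  # unmatched tail after the final match
    return "".join(parts)


def suffix_args(match: "re.Match") -> str:
    """Append '_' to the last identifier of every argument."""
    name, args = match.group(1), match.group(2)
    renamed = [re.sub(r"(\w+)\s*$", r"\1_", a) for a in args.split(",") if a.strip()]
    return f"{name}({','.join(renamed)})"


print(scan_and_replace("int add(int a, int b) { return a + b; }", suffix_args))
# int add(int a_, int b_) { return a + b; }
```

Because the stand-in pattern also matches ordinary calls and knows nothing about types such as `void`, a real solution would replace `scan` with a proper parser for signatures; only the surrounding loop carries over.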
That looks very promising. I will pick up from there. Thanks a lot! |
It's not a bug: you have a space in your text, and only ANY can handle it. When I change to this grammar:

    !start: any? expr+ any?
    any: /.+/
    !expr: "asdf" | "qwerty"
    %ignore " "

I get -
(this is without |
@erezsh No, what actually fixed it is moving
The latter doesn't produce any ambiguities with Similar for your grammar, with |
I tested @MegaIng 's example again on the latest master, and looks like this bug is fixed! |
What is your question?
I'm interested in writing a transformer for certain parts of input programs that wouldn't require a full language grammar. So let's say I have the below sample cpp code:
I want to parse just function signatures and transform them e.g. by adding the '_' suffix to all argument names. I.e. the output for the above input program should be:
How could I possibly achieve this using Lark?