Terminal collision when importing the same terminal from different grammars #1448

Open
erezsh opened this issue Aug 15, 2024 · 0 comments
Labels
discussion Discuss new features or other changes enhancement

Comments

@erezsh
Member

erezsh commented Aug 15, 2024

The problem occurs because the terminals are defined in different namespaces, and therefore have different names and are distinct instances, yet they may match the same text.

An example is described here: https://stackoverflow.com/questions/78692900/unclear-reason-for-terminal-collision
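To make the collision concrete, here is a minimal sketch that uses only Python's `re` module rather than Lark itself. The terminal table and the `candidates` helper are hypothetical and only mimic how a longest-match lexer picks a terminal; the point is that two differently named terminals with the same pattern are indistinguishable by text alone.

```python
import re

# Hypothetical terminal table: the same pattern was imported through two
# grammars, so it ends up under two namespaced names.
TERMINALS = {
    "a__NAME": r"[a-z]+",
    "b__NAME": r"[a-z]+",
    "NUMBER": r"[0-9]+",
}


def candidates(text, pos=0):
    """Return the names of all terminals that share the longest match at `pos`."""
    lengths = {}
    for name, pattern in TERMINALS.items():
        m = re.compile(pattern).match(text, pos)
        if m:
            lengths[name] = m.end() - pos
    if not lengths:
        return []
    longest = max(lengths.values())
    return sorted(name for name, length in lengths.items() if length == longest)


print(candidates("hello"))  # ['a__NAME', 'b__NAME'] -> two terminals collide
print(candidates("42"))     # ['NUMBER'] -> unambiguous
```

When more than one name comes back, the lexer has no textual basis for choosing between them, which is the situation described above.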

The reason terminals are put in namespaces is twofold (as far as I can remember):

  1. To avoid collisions between terminals that have the same name but different patterns. This can be solved without a namespace, by naming the terminals based on their pattern (or the hash of the pattern).
  2. To know how to dispatch the token to the correct transformer. For example, a__X might need different handling than b__X.
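The pattern-based naming mentioned in point 1 could look roughly like the sketch below. It is not Lark code: `pattern_name`, `join_by_pattern`, the `__T_` prefix, and the choice of SHA-1 truncated to 8 hex digits are all assumptions made for illustration.

```python
import hashlib


def pattern_name(pattern, flags=""):
    """Derive a terminal name from its pattern (hypothetical naming scheme)."""
    digest = hashlib.sha1(f"{flags}:{pattern}".encode("utf-8")).hexdigest()
    return "__T_" + digest[:8].upper()


def join_by_pattern(namespaced):
    """Map namespaced terminal names to pattern-derived names.

    Returns (renames, merged): `renames` maps each original name to its new
    name, and `merged` holds one entry per distinct pattern.
    """
    renames = {}
    merged = {}
    for name, pattern in namespaced.items():
        new_name = pattern_name(pattern)
        renames[name] = new_name
        merged[new_name] = pattern
    return renames, merged


renames, merged = join_by_pattern({
    "a__X": r"[a-z]+",
    "b__X": r"[a-z]+",   # same pattern as a__X, so it gets the same name
    "c__X": r"[0-9]+",   # same base name, different pattern, different name
})
print(renames["a__X"] == renames["b__X"])  # True
print(len(merged))                         # 2
```

Terminals that share a pattern collapse into a single entry, while terminals that merely share a base name stay distinct. A truncated hash can in principle collide, so a real implementation would need to handle that case.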

The second reason is a bit harder to work around. We could decide that tokens aren't namespaced anymore, and let the users deal with the fallout.

Or we can assign a namespace to each rule, and rename the tokens when constructing the tree. That would solve this issue at a small performance cost. However, that still leaves lexer_callbacks, for which we won't be able to provide a namespace anymore. Perhaps it's an acceptable loss? (Note that it would mean creating lexer_callbacks from the given transformer is no longer viable.)
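As a rough sketch of renaming tokens during tree construction: the `Token` dataclass, the `RULE_NAMESPACE` table, and `build_node` below are hypothetical stand-ins, not Lark's actual classes or tree builder. The lexer is assumed to emit un-namespaced token types, and the namespace is attached only once the enclosing rule is known.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Token:
    type: str
    value: str


# Hypothetical table recording which imported grammar each rule came from.
# An empty string means the rule belongs to the top-level grammar.
RULE_NAMESPACE = {"a__start": "a", "b__item": "b", "start": ""}


def with_namespace(token, rule):
    """Return a copy of `token` whose type carries the namespace of `rule`."""
    ns = RULE_NAMESPACE.get(rule, "")
    if not ns:
        return token
    return replace(token, type=f"{ns}__{token.type}")


def build_node(rule, children):
    """Build a (rule, children) node, renaming direct token children."""
    renamed = [
        with_namespace(child, rule) if isinstance(child, Token) else child
        for child in children
    ]
    return (rule, renamed)


tok = Token("X", "foo")               # what a joined lexer would emit
print(build_node("a__start", [tok]))  # token type becomes a__X
print(build_node("b__item", [tok]))   # token type becomes b__X
```

The extra rename per token is where the small performance cost would come from. Lexer callbacks run before any rule is known, so in this scheme they would only ever see the un-namespaced type.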

A very different view on the matter is that these collisions would ideally be resolved by the contextual lexer, which would be possible if the parser itself could resolve them. In other words, LALR(k) has a lower chance of producing such a collision, and the higher k is, the lower the chance. But it is, of course, very hard to implement, or we would have already done so. And in any case, it doesn't guarantee that there won't be any collisions; it only lowers the chances.

If we can't find a better approach, I would support adding a new flag such as "join_terminals" or "delayed_namespace", that will enable this behavior of joining terminals by value, and adding the namespace during tree construction. Perhaps the same can also be done for identical rules.

@erezsh erezsh mentioned this issue Aug 15, 2024
3 tasks
@erezsh erezsh added enhancement discussion Discuss new features or other changes labels Aug 15, 2024