Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree transforms #29

Open
oguzalb opened this issue May 3, 2013 · 13 comments
Open

Tree transforms #29

oguzalb opened this issue May 3, 2013 · 13 comments

Comments

@oguzalb
Copy link

oguzalb commented May 3, 2013

It would be better if parsimonious had something like Supress in Pyparsing, this way we would be able to check for a keyword for existence, but don't represent it as a node in the tree.

@erikrose
Copy link
Owner

erikrose commented May 3, 2013

It's coming! It'll probably look something like this: #23 (comment).

@erikrose
Copy link
Owner

Suppress and Select are two pieces of sugar that would be nice. Perhaps a recursive flattener like in PyPy would help as well. I'll do a survey of Parsimonious grammars in the wild and see what pops out.

Spelling strawman:

foo = "," <something_important> ","
bar = >something_boring< something_important >another_boring_thing< also_important third_important
multiple = <multiple selected? terms> "boring"

@erikrose erikrose changed the title Silent node feature Tree transforms Jul 14, 2014
@erikrose
Copy link
Owner

Transforms should also (do I even need to mention it?) be heritable, so we can suppress whitespace once and have it stick elsewhere:

_ = >" "+<
things = thing _ thing _ thing  # would return just thing, thing, thing

@erikrose
Copy link
Owner

<<thing>> or >>thing<< are possibilities for recursive flattener thing spelling. Haven't thought them out at all yet, but it took me months to think of them, apparently. They're nice because then all the transforms are just >s and <s.

However, I probably won't implement flattening until I see some compelling use cases. RPython's page gives a right-recursive grammar as an example, and their example, at least, is much simpler represented using a quantifier.

@erikrose
Copy link
Owner

In summary, I can think of 4 transforms sufficiently useful to include:

  • Igoring
  • Selecting
  • Flattening
  • Lifting

@jkmacc-LANL
Copy link

Writing and testing grammars in Parsimonious has been gloriously simple, but transforming the parse tree has so far been difficult. For example, I'm trying to write a general JSON-izer for Parsimonious parse trees that just spits out node info for the named rules (not anonymous literal matches, for example), and I'm having trouble doing it by implementing NodeVisitor.generic_visit.

It would be a huge help to have more explicit support for transforms like lifting and ignoring (#29, #36), or some documentation/examples about how to implement them myself (#99 , #48). Are there plans to visit (no pun intended) these issues in the near-future? Alternately, could you link to any snippets of Parsimonious code that do things like the ignoring or lifting?

@erikrose
Copy link
Owner

Thanks for asking! First, I'm going to be talking about Parsimonious on Podcast.__init__ in the next month or two, so I'm going to try to get some real docs in place before then.

Second, I haven't had much time to hack on Parsimonious lately, but tree transforms are the next thing on the roadmap after switching the precedence of / and (space)—that is, ors and sequences—to match what every other PEG parser does, including the original paper.

Finally, there's lot of ignoring going on in my grammar in DXR: see https://github.com/mozilla/dxr/blob/master/dxr/query.py#L309, which takes a term like "+foo:bar". It looks at plus and filter, but it completely ignores colon, and the colon propagates no further up the visitation chain.

Does that help?

@jkmacc-LANL
Copy link

jkmacc-LANL commented Jan 23, 2017

Sweet! I enjoyed listening to you on Talk Python to Me, and I'm looking forward to your next interview.

I think I understand the ignoring happening at your link; thank you. What do you do if you look at your node or its children, and decide you wanted to ignore everything? Simply returning will return None, but that's still something. For example, in the JSON-izer I'm playing with, generic_visit(node, children) looks at node.expr_name, and returns nothing if that is empty. None becomes null in json.dumps, though, and I'm trying to pretend it was never visited (or possibly lift its children, if found). I think this is a kind of tree reduction; I'm just not sure how to implement it.

@erikrose
Copy link
Owner

What do you do if you look at your node or its children, and decide you wanted to ignore everything? Simply returning will return None, but that's still something.

You want the node to not exist in the tree? Arrange to lop it off in the visitor for the parent of that node. For instance, when it sees None come back from visit_some_child_thing(), have it omit that item from its return value.

@jkmacc-LANL
Copy link

jkmacc-LANL commented Jan 27, 2017

(I didn't mean to hijack this issue. It seemed like the right place for this.) I think I could JSON-ize any parse tree, regardless of rule structure, so I'm only using generic_visit. There is no visit_some_child_thing. This feels do-able, with some better understanding of how to prune generically.

Here's a gist illustrating the visitor using your "(( bold stuff ))" example. I think the "expr_name": "text" node should be an only child, as "((" and "))" are anonymous siblings, but instead it has the two nulls. This seems like an ignoring operation.

For instance, when it sees None come back from visit_some_child_thing(), have it omit that item from its return value.

Ah, I should "de-None" the children. Got it, thank you!

If this visitor encounters an anonymous parent with named children, it returns the children (a list of dictionaries), when I really just intend to lift the children to the level of the parent, which sounds like a simple lifting operation. Still not sure how to implement this one...

Thanks for your responsiveness.

@erikrose
Copy link
Owner

erikrose commented Feb 2, 2017

I think you're almost there. It looks like you now have "uninteresting" nodes returning None. Now if you simply filter out the Nones from out before you return it, you might be in business.

@jkmacc-LANL
Copy link

jkmacc-LANL commented Feb 2, 2017

That was it. And, as far as lifting children, that was as simple as flattening a children list that may contain lists. Thank you for your help!

@erikrose
Copy link
Owner

erikrose commented Feb 3, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants