Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subsequent RewriteRules don't transform elements added in previous transform. #257

Open
ndelargy opened this issue Sep 5, 2018 · 5 comments

Comments

@ndelargy
Copy link

ndelargy commented Sep 5, 2018

import scala.xml._
import scala.xml.transform.{RewriteRule, RuleTransformer}

object TransformIssue {
  def addNewElement(): RewriteRule = new RewriteRule {
    override def transform(n: Node): Seq[Node] = n match {
      case <element></element> => <element><new></new></element>
    }
  }

  def addThingElement(): RewriteRule = new RewriteRule {
    override def transform(n: Node): Seq[Node] = n match {
      case <element>{ children@_*}</element> => <element>{ children  }</element>
      case <new></new> => <new><thing></thing></new>
    }
  }

  def change(node: Node): Node =
    new RuleTransformer(
      addNewElement(),
      addThingElement()
    ).transform(node).head

  def changeWorkaround(node: Node): Node = {
    val out1 = new RuleTransformer(
      addNewElement()
    ).transform(node).head

    new RuleTransformer(
      addThingElement()
    ).transform(out1).head
  }

}
import org.scalatest.{FlatSpec, FunSpec}
import org.scalatest._

class TransformIssueSpec extends FlatSpec with Matchers {

  it should "apply transform to created elements" in {
    val output = TransformIssue.change(<element></element>)
    output should be(<element><new><thing></thing></new></element>)
  } // fails

  it should "work the same as the workaround imo" in {
    TransformIssue.change(<element></element>) should equal(TransformIssue.changeWorkaround(<element></element>))
  } // fails

}

When we apply a transform with two rewrite rules: the first one adding a new element, the second one adding children to the new element; then the second rewrite rule does not match on the elements added in the first rule.

When we apply the same RewriteRules in two separate RuleTransformers it does add the children to the elements added in the first step. We would expect the change and changeWorkaround functions to produce the same output.

@ashawley
Copy link
Member

ashawley commented Sep 5, 2018

Indeed, this is a known issue. On the Getting started page, it is written:

Keep in mind that rewrite rules won't compose if they modify children, or modify values that other rewrite rules depend on.

There's a chance that RuleTransformer executes rewrite rules backward. What if you try changing the order?

    new RuleTransformer(
      addThingElement(),
      addNewElement()
    )

If that's not it, then indeed, it's probably that you need to call the transformer on the new elements yourself:

...
      case <element>{ children@_*}</element> => <element>{ change(children)  }</element>

@ashawley
Copy link
Member

ashawley commented Sep 7, 2018

@ndelargy I hope I was able to help and you were able to make progress. Thanks you for giving a simple example case.

@ashawley
Copy link
Member

ashawley commented Sep 7, 2018

I'll keep this open, since it would be helpful if someone investigated whether fixing this is possible.

@ndelargy
Copy link
Author

Hi @ashawley thanks for getting back so quick on this...
We did try reversing the order of the RewriteRules but this didn't fix the issue.

@dcsobral
Copy link
Contributor

The problem is that RuleTransformer recurses before applying the rules, but not after, so any rules dependending on the recursion to be applied will not work on anything added by previous rules.

However, following rules will apply on the way back from recursion. For example:

  def addNewElement(): RewriteRule = new RewriteRule {
    override def transform(n: Node): Seq[Node] = {
      n match {
        case <thing>{ children @ _* }</thing> => <new><thing>{ children }</thing></new>
        case other => other
      }
    }
  }

  def addThingElement(): RewriteRule = new RewriteRule {
    override def transform(n: Node): Seq[Node] = {
      n match {
        case <new>{ children @ _*}</new> => <element><new>{ children }</new></element>
        case other => other
      }
    }
  }

If you call TransformIssue.change(<thing/>), it will return <element><new><thing/></new></element>.

If RuleTransformer was written like this, it would pass the tests:

class RuleTransformer(rules: RewriteRule*) extends BasicTransformer {
    override def transform(n: Node): Seq[Node] = {
        rules.foldLeft(n: Seq[Node]) { (res, rule) => rule transform res.flatMap(super.transform) }
    }
}

However, it would through a stack overflow exception on the example I gave. The only solution I see is break the relationship between RuleTransformer and BasicTransformer, like this:

class NestingTransformer(rule: RewriteRule) extends BasicTransformer {
  override def transform(n: Node): Seq[Node] = {
    rule.transform(super.transform(n))
  }
}

class RuleTransformer(rules: RewriteRule*) {
  val nestingTransformers = rules.map(new NestingTransformer(_))
  def transform(n: Node): Seq[Node] = {
    nestingTransformers.foldLeft(n: Seq[Node]) { (res, transformer) =>
      transformer.transform(res)
    }
  }
}

Fun fact: RuleTransformer above is exactly what the workaround is, while NestingTransformer is what RuleTransfomer was but without the ability to process multiple rules.

I can't see a case for the exact semantics that RuleTransformer provides. It doesn't prevent rewrites from seeing previous rewrites -- the example would work if written <element>{ transform(children) }</element> -- just make it so it won't recurse to see those changes.

dcsobral added a commit to dcsobral/scala-xml that referenced this issue Mar 21, 2020
RuleTransformer for the past eleven years or more has first recursed,
then applied the rewrite rules as it backed out of the recursion.

That prevented rewrite rules from acting on changes of previous rules
unless they recursed themselves.

The workaround has always been chain rule transformers instead of
calling one rule transformer with all the rules. This change basically
re-implements RuleTransformer as that workaround, and introduces
a NestingTransformer which does the nesting.

Clearly, if you have N rules you'll now recurse N times instead of
once, though each rule is still only applied once for each element.

On the other hand, a RewriteRule that recursed would incur in
exponential times, and now they don't need to.

The original behavior has no reason to be. It didn't prevent
rules from seeing each other changes, nor was it particularly
concerned with performance.  With API changes coming on 2.0, I
believe this is the right time to introduce this change.
SethTisue added a commit that referenced this issue Dec 23, 2020
Make RuleTransformer fully recursive [#257]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants