Skip to content
This repository has been archived by the owner on Sep 1, 2020. It is now read-only.

Add flag to disable withFilter pattern desugaring #1

Merged

Conversation

puffnfresh
Copy link

This adds a flag called -Zirrefutable-generator-patterns which changes
desugaring of code like the following:

for {
  (a, b) <- List(1 -> 'a', 2 -> 'b')
  (c, d) <- List(3 -> 'c', 4 -> 'd')
} yield (a + c, b + d)

Which previously would turn into something like:

List(1 -> 'a', 2 -> 'b').withFilter {
  case (a, b) => true
  case _ => false
}.flatMap {
  case (a, b) =>
    List(3 -> 'c', 4 -> 'd').withFilter {
      case (c, d) => true
      case _ => false
    }.map {
      case (c, d) =>
        (a + c, b + d)
    }
}

With this new flag, it only becomes:

List(1 -> 'a', 2 -> 'b').flatMap {
  case (a, b) =>
    List(3 -> 'c', 4 -> 'd').map {
      case (c, d) =>
        (a + c, b + d)
    }
}

Which creates a few benefits:

  1. We don't have to do a useless call to withFilter
  2. We can use patterns on things without withFilter
  3. Things don't silently disappear when patterns are wrong

This adds a flag called `-Zirrefutable-generator-patterns` which changes
desugaring of code like the following:

    for {
      (a, b) <- List(1 -> 'a', 2 -> 'b')
      (c, d) <- List(3 -> 'c', 4 -> 'd')
    } yield (a + c, b + d)

Which previously would turn into something like:

    List(1 -> 'a', 2 -> 'b').withFilter {
      case (a, b) => true
      case _ => false
    }.flatMap {
      case (a, b) =>
        List(3 -> 'c', 4 -> 'd').withFilter {
          case (c, d) => true
          case _ => false
        }.map {
          case (c, d) =>
            (a + c, b + d)
        }
    }

With this new flag, it only becomes:

    List(1 -> 'a', 2 -> 'b').flatMap {
      case (a, b) =>
        List(3 -> 'c', 4 -> 'd').map {
          case (c, d) =>
            (a + c, b + d)
        }
    }

Which creates a few benefits:

1. We don't have to do a useless call to withFilter
2. We can use patterns on things without withFilter
3. Things don't silently disappear when patterns are wrong
/**
* -Z Typelevel settings
*/
val ZirrefutableGeneratorPatterns = BooleanSetting("-Zirrefutable-generator-patterns", "Treat patterns in for comprehensions as irrefutable. Do not add filter or withFilter calls.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stupidly minor nitpick that probably isn't worth mentioning, but for comprehensions -> for-comprehensions

@non
Copy link

non commented Sep 4, 2014

I definitely like the -Z convention for Typelevel settings.

@@ -768,7 +768,7 @@ abstract class TreeGen {
}

def mkCheckIfRefutable(pat: Tree, rhs: Tree)(implicit fresh: FreshNameCreator) =
if (treeInfo.isVarPatternDeep(pat)) rhs
if (treeInfo.isVarPatternDeep(pat) || settings.ZirrefutableGeneratorPatterns) rhs
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be a little bit too good to be true. Does the logic for detecting irrefutable patterns actually work and was just not applied properly?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larsrh: it appears this is not about detecting irrefutable patterns, it's about expecting them to be.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Blaisorblade exactly. Happy to clarify this if it's confusing somewhere.

@milessabin
Copy link
Member

👍

@milessabin
Copy link
Member

Do we want a SIP-18 style language import as well?

@paulp
Copy link

paulp commented Sep 4, 2014

Here's why there's a filter:

    for {
      (a, b) <- List(1 -> 'a', 2)
      (c, d) <- List(3 -> 'c', 5)
    } yield List(a, b, c, d)

I realize nobody here would write code like that, but hey, turns out that other people do. I know because I already implemented this (I expect that to be a frequent theme around here) and later we had to revert it.

https://issues.scala-lang.org/browse/SI-6968

Nobody can outrun Any. It's like the Terminator. It absolutely will not stop until it has ruined your plans.

@aloiscochard
Copy link

👍

We assume patterns are exhaustive when desugaring with
`-Zirrefutable-generator-patterns` but we should still emit a warning if
we think they might not be.
@puffnfresh
Copy link
Author

@larsrh there isn't much logic there any more, as @paulp shows, there used to be code to treat things like Tuple2 as irrefutable but that was broken when you did silly things and so was reverted. The motivation for the revert was to maintain the following behaviour:

scala> for { (a, b) <- List(1 -> 'a', 2) } yield a
res0: List[Any] = List(1)

But under this new flag, we get the following:

scala> for { (a, b) <- List(1 -> 'a', 2) } yield a
scala.MatchError: 2 (of class java.lang.Integer)
  at $anonfun$1.apply(<console>:8)
  at scala.collection.immutable.List.map(List.scala:276)
  ... 33 elided

I think most people prefer the latter behaviour.

@milessabin while I would love to turn SIP-18 against itself, the checkFeature part is in Typers and I don't think it can be applied a phase which is this early.

@puffnfresh
Copy link
Author

Two things I want to do before we think about accepting this:

  • Figure out what changing scala-reflect.jar does to interoperability with Typesafe Scala (particularly macros)
  • Write a test for this feature.

@ceedubs
Copy link

ceedubs commented Sep 4, 2014

Completely agree both with preferring the latter behavior when a tuple match isn't irrefutable and with figuring out what changing scala-reflect.jar does particularly with macros.

Thanks @puffnfresh! And thanks for the helpful input @paulp.

@Blaisorblade
Copy link

The pull request description describes a case where the change is uncontroversial — one could prove that those withFilter calls are the identity. The discussion and the help text suggests that the change is not limited to such cases, so I'm confused.
Anyway, might I suggest a bit more documentation here before merging?

I think most people prefer the latter behaviour.

You mean you prefer the MatchError? I disagree, and Haskell does as well (see below).

That looks strange... Why is Tuple2 considered irrefutable when matching against values of type Any?

If that's intentional, should I expect a MatchError in the code below?

//Assume the Scala equivalent of Haskell's:
//data T = A x | B | C

for {
  A(x) <- List(A(a), B, C)
} yield x

It seems to me the two cases should be treated in the same way. Do you suggest rejecting the second example as well? GHC accepts the Haskell equivalent:

Prelude> data T = A | B | C
Prelude> [() | A <- [A, B, C]]
[()]

HOWEVER, I now read the linked ticket. People mention parser changes... are you telling me that refutability is so syntactic to be decided by the parser? Oh my.

So @paulp, while Any is certainly bad, here it's not special. It's desugaring things in the parser which is at fault.

@tixxit
Copy link

tixxit commented Sep 4, 2014

@Blaisorblade Yes, I would expect a MatchError in that case too. Basically, I expect <- to behave like a val assignment.

// data T = A x | B | C
val A(x) = B: T // Fails, because B isn't A

@Blaisorblade
Copy link

@tixxit I see, and now I get the argument:

Things don't silently disappear when patterns are wrong

Not going to use this, but it's behind a flag and the user documentation is accurate, so OK.

Also, GHC provides a different syntax for irrefutable patterns. Would something like that in the future make you happier? It shouldn't stop this from merging.

What I would love would be dropping those withFilter when the pattern is in fact irrefutable. But this would require type-aware desugarings — and I realized that's impossible, because for-comprehension's desugaring are completely untyped (and this has some advantages).
Lacking that, it would be cool if an optimizer could remove those withFilter calls — but this is really hard.

@bvenners
Copy link

bvenners commented Sep 4, 2014

It is not clear to me what the opinion is here from the discussion, so I'll add this even though I may be repeating what others said. What I think would be desirable without the flag (i.e., by default) is that if the compiler can prove there will be no match error possible at runtime, then the withFilter is elided. If it can't prove that, then the withFilter is left in. This would make using Scalactic Or nicer, so I'd be quite happy about a tweak like this. For example, would give no withFilter:

for { (a, b) <- List(1 -> 'a', 2 -> 'b') } yield (foo(a) compose bar(b))

But this would give the withFilter, because the compiler can't prove that Any will always match Tuple2:

for { (a, b) <- List(1, 2 -> 'b') } yield (foo(a) compose bar(b))

The flag idea isn't bad either, though I wonder if there could be a more general flag that adds type errors. This behavior might belong more in a compiler plugin though I'd think, like WartRemover.

@Blaisorblade
Copy link

@bvenners That's what I would have hoped for. In particular, the call can be removed if:

  • withFilter's receiver type has shape Foo[T] (i.e., a type constructor applied to one parameter), and
  • the given patterns are exhaustive for T

I guess this is safe with some assumptions on withFilter, which have never been set in stone.
Also, this assumes that the exhaustiveness checker works correctly, and I've seen a few strange warnings there.

The flag idea isn't bad either, though I wonder if there could be a more general flag that adds type errors.

I'm sorry, but I'm lost here on which type errors you'd want.

@bvenners
Copy link

bvenners commented Sep 4, 2014

I'm sorry, but I'm lost here on which type errors you'd want.

@Blaisorblade I just meant that there will likely be many situations where typelevel folks would prefer a compiler error, and perhaps they should just all be enabled by one flag instead of 100 different flags. Then on the other hand, perhaps that's better done in a compiler plugin, with a config file to enable and disable the various options. Else the compiler has too many options and modes, which adds complexity.

@Blaisorblade
Copy link

@bvenners Ah, that's a fair point. If you trust Typelevel, you might want -Ztypelevel-is-my-guide. And for hard-core purists, -Zscalazzi.

But I'd never enable this particular one, so I'd like some granularity: I'd be happy with -Ztypelevel-is-my-guide -Zno-irrefutable-generator-patterns :-)

@puffnfresh
Copy link
Author

@bvenners that has been tried, but here's the problem:

  1. The parser desguars to withFilter and patterns
  2. The type-checker checks that withFilter exists
  3. An optimiser could possibly elide the withFilter calls if the pattern is irrefutable

The problem with step 2 prevents us from using lots of structures which don't support withFilter. The problem with step 3 is that it doesn't happen (try it out with scalac, I haven't observed it working under very simple cases).

This is what scalac (claims to) do but the spec says step 2 shouldn't even happen. I don't have a way to make scalac conform to the spec (and neither did paulp, it seems).

@puffnfresh
Copy link
Author

@Blaisorblade I don't want to compare how Haskell desugars do-notation since I think it can be done better (but possibly not by us).

With this PR you get a warning now on refuted patterns. For example, given this:

sealed trait X
case class A(a: Int) extends X
case object B extends X
case object C extends X

object Test {
  for {
    A(x) <- List(A(1), B, C)
  } yield ()
}

We get this result when compiling:

$ ./qbin/scala -nc -Zirrefutable-generator-patterns -Xfatal-warnings /tmp/Test.scala
/tmp/Test.scala:8: warning: match may not be exhaustive.
It would fail on the following inputs: B, C
    A(x) <- List(A(1), B, C)
         ^
error: No warnings can be incurred under -Xfatal-warnings.

@Blaisorblade
Copy link

Forgot to mention: happy to see you here, @paulp!

@paulp
Copy link

paulp commented Sep 4, 2014

@Blaisorblade I brought up Any as a representative of heterogeneous lists of the List[Whatever] variety. From first principles one could reasonably expect language support for the notion that one's generator will only generate elements of the expected shape, but there is no such facility. It's not dissimilar to the contains(Any) problem, in that generality is a two-way street.

@paulp
Copy link

paulp commented Sep 4, 2014

I hope this ticket serves to illustrate that even the smallest changes to the compiler often involve a dizzying array of consequences, and also that there will often be a long history of past efforts orbiting any seemingly simple change which you care to pursue. You will want to hone your facility with git so you can read some punchy commit messages, many of which I expect have never been seen except by the author.

Also the ticket database remains an important source of background, which I have to mention because it is a daily occurrence to see someone who should know better opening a new ticket duplicating one I opened up to five years ago yet which languishes in its meticulous obscurity amongst the ~2000 and climbing open tickets.

@bvenners
Copy link

bvenners commented Sep 4, 2014

@puffnfresh For my use case, I do have a withFilter on Or, so fixing step 3 and just optimizing the withFilter out would address my need. It wouldn't solve the issue with an optimize-out-able withFilter not being present at type check time, but would help Or users. The reason I care is explained here:

https://groups.google.com/forum/#!msg/scalatest-users/XPtBcsFvmDk/I5fhy1AiKSIJ

Have to scroll down to where I say "Actually that's on purpose." If step 3 worked, then that user would not have encountered a compiler error where he did in the first place. Admittedly I suspect this is a very rare case, but it would be a step-wise improvement to the compiler.

@Blaisorblade
Copy link

@Blaisorblade I brought up Any as a representative of heterogeneous lists of the List[Whatever] variety. From first principles one could reasonably expect language support for the notion that one's generator will only generate elements of the expected shape, but there is no such facility.

I'd want a static error for that, and that's possible via a type annotation.

Also, compared to non-OO functional languages with algebraic data types (ADTs), as a built-in, Scala subtyping relates both an ADT with its variants (above, T and C) and a supertype with its type. In Haskell, the difference is rather clear — but in Scala, they're both expressed with subtyping, so you need a solution which supports both.

It's not dissimilar to the contains(Any) problem, in that generality is a two-way street.

I agree at least that typesystems are useful because they rule out (some) bad code while allowing good one.

@puffnfresh
Copy link
Author

@paulp in pursuing this change, I find your commits and your issue comments. They were very useful to see why Typesafe didn't include them into scalac but I see the behaviour they desire as broken and so I am happy with my change.

@Blaisorblade
Copy link

@milessabin while I would love to turn SIP-18 against itself, the checkFeature part is in Typers and I don't think it can be applied a phase which is this early.

Ignoring the code, for SIP-18 you need to check whether implicits of some type are available, and there's no way to know that before typechecking.

IIUC, Haskell-style flags can be understood by the parser so they wouldn't have this problem. They're more limited, but I've not seen yet anybody using the extra power.
We're going to have also other flags which affect pre-Typer behavior (see #7 for the Unicode stuff), so we might consider a Haskell-like syntax for Typelevel language flags.

@paulp
Copy link

paulp commented Sep 5, 2014

I'd want a static error for that, and that's possible via a type annotation.

@Blaisorblade If you have to tell the compiler so the compiler can tell you, I suggest cutting out the middleman.

@Blaisorblade Blaisorblade mentioned this pull request Sep 5, 2014
@Blaisorblade
Copy link

@Blaisorblade If you have to tell the compiler so the compiler can tell you, I suggest cutting out the middleman.

To me this boils down to your old "don't infer Any" proposal, for which I've opened #16. Let's continue there.

@bvenners
Copy link

bvenners commented Sep 5, 2014

@puffnfresh it occurred to me last night on the couch that probably the Or example wouldn't make it pass the type, because even though there's a withFilter on there, the argument doesn't have the expected type. So probably it would still fail there and I'd need the whole enchilada. Would be nice if that could be accomplished someday.

@aloiscochard
Copy link

On 4 September 2014 21:47, Paolo G. Giarrusso [email protected]
wrote:

@milessabin https://github.com/milessabin while I would love to turn
SIP-18 against itself, the checkFeature part is in Typers and I don't think
it can be applied a phase which is this early.

Ignoring the code, for SIP-18 you need to check whether implicits of some
type are available, and there's no way to know that before typechecking.

IIUC, Haskell-style flags can be understood by the parser so they wouldn't
have this problem. They're more limited, but I've not seen yet anybody
using the extra power.
We're going to have also other flags which affect pre-Typer behavior, so
we might consider a Haskell-like syntax for Typelevel language flags.

I thought we might have that issue :-/

If we need flags which work pre-typer, we should probably then all make
them (even the non pre-typer) work with a Haskell-like syntax for sake of
consistency.

Λ\ois
http://twitter.com/aloiscochard
http://github.com/aloiscochard

@typelevel-bot
Copy link

Can one of the admins verify this patch?

@puffnfresh
Copy link
Author

add to whitelist

@puffnfresh puffnfresh force-pushed the feature/irrefutable-generator-patterns branch from 8385169 to 7ac3809 Compare September 24, 2014 16:51
…tterns

Conflicts:
	bincompat-forward.whitelist.conf
	src/compiler/scala/tools/nsc/settings/ScalaSettings.scala
@aloiscochard
Copy link

retest this please

non added a commit that referenced this pull request Sep 24, 2014
…atterns

Add flag to disable withFilter pattern desugaring
@non non merged commit d76ccf3 into typelevel:2.11.x Sep 24, 2014
puffnfresh pushed a commit to puffnfresh/scala that referenced this pull request Jul 30, 2015
Under `-Ydelambdafy:method`, a public, static accessor method is
created to expose the private method containing the body of the
lambda.

Currently this accessor method has its parameters in the same order
structure as those of the lambda body method.

What is this order? There are three categories of parameters:

  1. lambda parameters
  2. captured parameters (added by lambdalift)
  3. self parameters (added to lambda bodies that end up in trait
     impl classes by mixin, and added unconditionally to the static
     accessor method.)

These are currently emitted in order typelevel#3, typelevel#1, typelevel#2.

Here are examples of the current behaviour:

BEFORE (trait):

```
% cat sandbox/test.scala && scalac-hash v2.11.5 -Ydelambdafy:method sandbox/test.scala && javap -private -classpath . 'Test$class'
trait Member; class Capture; trait LambdaParam
trait Test {
  def member: Member
  def foo {
    val local = new Capture
    (arg: LambdaParam) => "" + arg + member + local
  }
}
Compiled from "test.scala"
public abstract class Test$class {
  public static void foo(Test);
  private static final java.lang.String $anonfun$1(Test, LambdaParam, Capture);
  public static void $init$(Test);
  public static final java.lang.String accessor$1(Test, LambdaParam, Capture);
}
```

BEFORE (class):

```
% cat sandbox/test.scala && scalac-hash v2.11.5 -Ydelambdafy:method sandbox/test.scala && javap -private -classpath . Test
trait Member; class Capture; trait LambdaParam
abstract class Test {
  def member: Member
  def foo {
    val local = new Capture
    (arg: LambdaParam) => "" + arg + member + local
  }
}
Compiled from "test.scala"
public abstract class Test {
  public abstract Member member();
  public void foo();
  private final java.lang.String $anonfun$1(LambdaParam, Capture);
  public Test();
  public static final java.lang.String accessor$1(Test, LambdaParam, Capture);
}
```

Contrasting the class case with Java:

```
% cat sandbox/Test.java && javac -d . sandbox/Test.java && javap -private -classpath . Test
public abstract class Test {
  public static class Member {};
  public static class Capture {};
  public static class LambaParam {};
  public static interface I {
    public abstract Object c(LambaParam arg);
  }
  public abstract Member member();
  public void test() {
    Capture local = new Capture();
    I i1 = (LambaParam arg) -> "" + member() + local;
  }
}

Compiled from "Test.java"
public abstract class Test {
  public Test();
  public abstract Test$Member member();
  public void test();
  private java.lang.Object lambda$test$0(Test$Capture, Test$LambaParam);
}
```

We can see that in Java 8 lambda parameters come after captures. If we
want to use Java's LambdaMetafactory to spin up our anoymous FunctionN
subclasses on the fly, our ordering must change.

I can see three options for change:

  1. Adjust `LambdaLift` to always prepend captured parameters,
     rather than appending them. I think we could leave `Mixin` as
     it is, it already prepends the self parameter. This would result
     a parameter ordering, in terms of the list above: typelevel#3, typelevel#2, typelevel#1.
  2. More conservatively, do this just for methods known to hold
     lambda bodies. This might avoid needlessly breaking code that
     has come to depend on our binary encoding.
  3. Adjust the parameters of the accessor method only. The body
     of this method can permute params before calling the lambda
     body method.

This commit implements option typelevel#2.

In also prototyped typelevel#1, and found it worked so long as I limited it to
non-constructors, to sidestep the need to make corresponding
changes elsewhere in the compiler to avoid the crasher shown
in the enclosed test case, which was minimized from a bootstrap
failure from an earlier a version of this patch.

We would need to defer option typelevel#1 to 2.12 in any case, as some of
these lifted methods are publicied by the optimizer, and we must
leave the signatures alone to comply with MiMa.

I've included a test that shows this in all in action. However, that
is currently disabled, as we don't have a partest category for tests
that require Java 8.
folone pushed a commit that referenced this pull request Aug 14, 2015
The log messages intented to chronicle implicit search were
always being filtered out by virtue of the fact that the the tree
passed to `printTyping` was already typed, (e.g. with an implicit
MethodType.)

This commit enabled printing in this case, although it still
filters out trees that are deemed unfit for typer tracing,
such as `()`. In the context of implicit search, this happens
to filter out the noise of:

```
|    |    |    [search #2] start `()`, searching for adaptation to pt=Unit => Foo[Int,Int] (silent: value <local Test> in Test) implicits disabled
|    |    |    [search #3] start `()`, searching for adaptation to pt=(=> Unit) => Foo[Int,Int] (silent: value <local Test> in Test) implicits disabled
|    |    |    \-> <error>
```

... which I think is desirable.

The motivation for this fix was to better display the interaction
between implicit search and type inference. For instance:

```
class Foo[A, B]
class Test {
  implicit val f: Foo[Int, String] = ???
  def t[A, B](a: A)(implicit f: Foo[A, B]) = ???
  t(1)
}
```

````
% scalac -Ytyper-debug sandbox/instantiate.scala
...
|    |-- t(1) BYVALmode-EXPRmode (site: value <local Test> in Test)
|    |    |-- t BYVALmode-EXPRmode-FUNmode-POLYmode (silent: value <local Test> in Test)
|    |    |    [adapt] [A, B](a: A)(implicit f: Foo[A,B])Nothing adapted to [A, B](a: A)(implicit f: Foo[A,B])Nothing
|    |    |    \-> (a: A)(implicit f: Foo[A,B])Nothing
|    |    |-- 1 BYVALmode-EXPRmode-POLYmode (site: value <local Test> in Test)
|    |    |    \-> Int(1)
|    |    solving for (A: ?A, B: ?B)
|    |    solving for (B: ?B)
|    |    [search #1] start `[A, B](a: A)(implicit f: Foo[A,B])Nothing` inferring type B, searching for adaptation to pt=Foo[Int,B] (silent: value <local Test> in Test) implicits disabled
|    |    [search #1] considering f
|    |    [adapt] f adapted to => Foo[Int,String] based on pt Foo[Int,B]
|    |    [search #1] solve tvars=?B, tvars.constr= >: String <: String
|    |    solving for (B: ?B)
|    |    [search #1] success inferred value of type Foo[Int,=?String] is SearchResult(Test.this.f, TreeTypeSubstituter(List(type B),List(String)))
|    |    |-- [A, B](a: A)(implicit f: Foo[A,B])Nothing BYVALmode-EXPRmode (site: value <local Test> in Test)
|    |    |    \-> Nothing
|    |    [adapt] [A, B](a: A)(implicit f: Foo[A,B])Nothing adapted to [A, B](a: A)(implicit f: Foo[A,B])Nothing
|    |    \-> Nothing
```
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.