Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect bytecode generated for StringBuilder.max #4214

Closed
scabug opened this issue Feb 4, 2011 · 40 comments
Closed

Incorrect bytecode generated for StringBuilder.max #4214

scabug opened this issue Feb 4, 2011 · 40 comments

Comments

@scabug
Copy link

scabug commented Feb 4, 2011

=== Background ===
While trying to weave a simple AspectJ aspect is a Scala compiler jar,
we got a puzzling error that fails the AspectJ weaver if we use
Scala version 2.8.1 or 2.9.0-SNAPSHOT, but not 2.8.0.
Upon closer inspection (see below) there appears to be a problem
due to inconsistent bytecode generated by the Scala compiler.
This not only fails with AspectJ, but also
crashes the Eclipse compiler in a plain (non-AspectJ) Java project.

Here is the analysis based on input from AspectJ project lead Andy Clement.

The issue is with StringBuilder.max (with 2.8.1)

=== What steps will reproduce the problem ? ===
Here is the bytecode generated by the Scala compiler:

public java.lang.Object max(scala.math.Ordering);
 Code:
  Stack=2, Locals=2, Args_size=2
  0:   aload_0
  1:   aload_1
  2:   invokestatic    SI-1024; //Method
scala/collection/TraversableOnce$$class.max:(Lscala/collection/TraversableOnce;Lscala/math/Ordering;)Ljava/lang/Object;
  5:   areturn
 Signature: length = 0x2
  03 FFFFFFFD

So it calls another method then finishes with an ARETURN (i.e.
returning an object, not a primitive). Now it is generic so we can
lookup the original signature using that attribute, the generic
signature is:

 <B:Ljava/lang/Object;>(Lscala/math/Ordering<TB;>;)C;

and that says that the method returns a char (the trailing C), so it
should have been declared 'char max(Ordering)' and finish with an
IRETURN.

Now, the compiler may be being clever and looking for the boxing to a
Character, but it seems odd that the generic signature disagrees
with the bytecode signature in that way.

In Scala 2.8.0 it was not a generic method so there is no mismatch in
signatures.

=== Seeing the problem without involving AspectJ ===
If you want to see the Eclipse compiler crashing, create a pure java project,
give it a dependency on the 2.8.1 scala library then try to compile
this:

import scala.collection.mutable.StringBuilder;

public class CC {
       public void m() {
               StringBuilder sb = new StringBuilder();
       }
 }

Eclipse throws up an error message (an Internal compiler error)

Internal compiler error: java.lang.ClassCastException: 
org.eclipse.jdt.internal.compiler.lookup.BaseTypeBinding cannot be cast to 
org.eclipse.jdt.internal.compiler.lookup.ReferenceBinding at 
org.eclipse.jdt.internal.compiler.lookup.BinaryTypeBinding.initializeTypeVariable(BinaryTypeBinding.java:944)	CC.java	/scala-library-bug/src/bug	
line 0

I can attach the project if you wish, but as you can see,
there isn't much needed to get the Eclipse compiler to crash.

=== What versions of the following are you using? ===

  • Scala: 2.8.1 and 2.9.0-SNAPSHOT (2.8.0 works fine)
  • Java: 1.6.0_22
  • Operating system: Mac Snow Leopard
@scabug
Copy link
Author

scabug commented Feb 4, 2011

Imported From: https://issues.scala-lang.org/browse/SI-4214?orig=1
Reporter: Ramnivas Laddad (ramnivas)

@scabug
Copy link
Author

scabug commented Feb 4, 2011

@paulp said:
This is extremely helpful, thank you. I have made alterations such that syntactically invalid signatures can no longer be geneerated, but this is what I was afraid was still out there. This is important, but right now I don't know how fast I'll be able to get to it.

(See also #4067 which I was hoping this would be a duplicate of, but is not.)

@scabug
Copy link
Author

scabug commented Feb 14, 2011

@axel22 said:
Paul, can I assign this to you?

@scabug
Copy link
Author

scabug commented Feb 14, 2011

@paulp said:
Assignment assigned.

@scabug
Copy link
Author

scabug commented Feb 16, 2011

@SethTisue said:
for whatever it's worth, I'm hitting this issue trying to use ENSIME with Scala 2.8.1

@scabug
Copy link
Author

scabug commented Feb 21, 2011

@odersky said:
The wrong signature is generated for a method that forwards to a trait implementation. Easiest fix would be to leave out the Scala signatures for trait forwarders. You'll get them in any case on the trait interface itself.

@scabug
Copy link
Author

scabug commented Feb 21, 2011

@odersky said:
(In r24319) Attempt to fix #4214 by avoiding signature generation for mixed in and bridge methods.

@scabug
Copy link
Author

scabug commented Feb 21, 2011

@odersky said:
r24319 avoids signature generation for bridge methods and mixed in methods. I ran out of time, so could not verify that it does address the problem. If there are unintended side effects, please revert.

@scabug
Copy link
Author

scabug commented Feb 24, 2011

@paulp said:
(In r24344) Reversion of r24319 for causing #4291. Test case to fence out #4291
in future attempts. Closes #4291, references #4214, no review.

@scabug
Copy link
Author

scabug commented Feb 25, 2011

@SethTisue said:
another Ensime user is affected too, see https://github.com/aemoncannon/ensime/issues/#issue/138

@scabug
Copy link
Author

scabug commented Mar 2, 2011

@paulp said:
(In r24363) Another lap around the track with generic signatures.
At the root of the issue reported in #4214 is our old friend
(fondly remembered from the days of primitive equality)
boxed/primitive unification.

  trait T[A] {
    def f(): A
  }
  // the generic signature spec doesn't allow for parameterizing
  // on primitive types, so this cannot remain Char.  However
  // translating it to Character, as was done, also has issues.
  class C extends T[Char] {
    def f(): Char = 'a'
  }

  // Note that neither of the signatures for f, the implementation
  // or the bridge method, matches the type parameter.
  Generic interfaces in class:
    T<java.lang.Character>
  Generic signatures:
    public char C.f()
    public java.lang.Object C.f()

After this commit, primitive type parameters are translated into
Object instead of the boxed type. It was martin's idea, so no review.
Closes #4214.

@scabug
Copy link
Author

scabug commented Mar 3, 2011

Ramnivas Laddad (ramnivas) said:
Thanks. Is there a snapshot version with this fix I can try? The latest one at http://scala-tools.org/repo-snapshots seems to be from Feb 27.

@scabug
Copy link
Author

scabug commented Mar 3, 2011

@paulp said:
Replying to [comment:15 ramnivas]:

Thanks. Is there a snapshot version with this fix I can try? The latest one at http://scala-tools.org/repo-snapshots seems to be from Feb 27.

Yeah, I don't know what's going on with the maven deploy failing but I think I woke up the appropriate parties today. You can wait until that's fixed or download a nightly from http://www.scala-lang.org/node/212/distributions .

@scabug
Copy link
Author

scabug commented Mar 9, 2011

@paulp said:
Replying to [comment:15 ramnivas]:

Thanks. Is there a snapshot version with this fix I can try? The latest one at http://scala-tools.org/repo-snapshots seems to be from Feb 27.

FYI now there are up to date snapshots, and any followup report positive or negative would be useful.

@scabug
Copy link
Author

scabug commented Mar 9, 2011

Ramnivas Laddad (ramnivas) said:
I tried this a couple of days back with my original code (weaving AspectJ into Scala). First, I met with an issue that is now resolved in AspectJ nightly. Now I still see a VerifyError in the same place (class: scala/collection/mutable/StringBuilder, method: minBy signature: (Lscala/Function1;Lscala/math/Ordering;)Ljava/lang/Object;) Expecting to find integer on stack). However, I no longer see Eclipse compiler crash as I reported in the original error. Currently, I am working with the AspectJ lead to figure out what's causing that error and it may very well turn out to be an AspectJ issue. I will report back as soon as we have some lead.

@scabug
Copy link
Author

scabug commented Mar 9, 2011

@paulp said:
Replying to [comment:18 ramnivas]:

I tried this a couple of days back with my original code (weaving AspectJ into Scala). First, I met with an issue that is now resolved in AspectJ nightly. Now I still see a VerifyError in the same place (class: scala/collection/mutable/StringBuilder, method: minBy signature: (Lscala/Function1;Lscala/math/Ordering;)Ljava/lang/Object;)

A couple days back is probably too far, as I was still wrestling this until fairly recently. Here is what the signature of minBy for instance looks like in the current nightly; I think you will only see the error above if you have an older build.

scala> classOf[StringBuilder].getMethods.filter(_.getName == "minBy").head.toGenericString
res0: java.lang.String = public <B> char scala.collection.mutable.StringBuilder.minBy(scala.Function1<java.lang.Object, B>,scala.math.Ordering<B>)

@scabug
Copy link
Author

scabug commented Mar 9, 2011

Ramnivas Laddad (ramnivas) said:
I have been keeping up to date (well, Maven did due to SNAPSHOT dependencies). Since AspectJ weaving went past the original problem and revealed one issue with AspectJ itself (fixed since), I thought I am probably using a fixed version.

I tried just now (scala-library-2.9.0-20110309.025033-152.jar and scala-compiler-2.9.0-20110309.025033-152.jar) and still see the same problem.

I can share the project if you want to try reproduce it (it is very small project). Just let me know.

@scabug
Copy link
Author

scabug commented Mar 9, 2011

@paulp said:
Replying to [comment:20 ramnivas]:

I tried just now (scala-library-2.9.0-20110309.025033-152.jar and scala-compiler-2.9.0-20110309.025033-152.jar) and still see the same problem.

Well that's a disappointment.

I can share the project if you want to try reproduce it (it is very small project). Just let me know.

I certainly would like to fix the problem, but I am doubtful that anything involving eclipse and aspectj (neither of which I use and both of which I have terrible luck with) is amenable to straightforward reproduction. But please send me the project anyway. I'm [email protected] if you don't want to attach something here.

@scabug
Copy link
Author

scabug commented Mar 9, 2011

@paulp said:
Oh, it helpfully obscured my email. The domain part is improving.org.

@scabug
Copy link
Author

scabug commented Mar 9, 2011

Ramnivas Laddad (ramnivas) said:
Paul,

I sent you the project.

I have additional input from the AspectJ lead. Problem still looks like StringBuilder. The signature of minBy (and similarly maxBy) in the bytecode is:

public Object minBy(scala.Function1 f, scala.math.Ordering cmp)

but the generic signature attribute for it is:

<B:Ljava/lang/Object;>(Lscala/Function1<Ljava/lang/Object;TB;>;Lscala/math/Ordering<TB;>;)C

The trailing 'C' indicating the return type is of type primitive char

  • this doesn't match the bytecode return type of Object.

Here is a crude reflection test program - it discovers the bytecode
return type and the generic return type.

import java.lang.reflect.Method;
import java.lang.reflect.Type;

import scala.collection.mutable.StringBuilder;

public class Main {
       public static void main(String[] args) {
               StringBuilder sb = new StringBuilder();
               Method[] ms = StringBuilder.class.getDeclaredMethods();
               for (Method m : ms) {
                       if (m.getName().equals("minBy")) {
                               Class<?> c = m.getReturnType();
                               System.out.println("Bytecode return type: " + c);
                               Type t = m.getGenericReturnType();
                               System.out.println("Generic return type : " + t);
                               if (t instanceof Class) {
                                       if (((Class) t).isPrimitive() && !c.isPrimitive()) {
                                               System.out.println("Incompatible");
                                       }
                               }
                       }
               }
       }
}

@scabug
Copy link
Author

scabug commented Mar 10, 2011

@paulp said:
We're making progress anyway. More to come.

@scabug
Copy link
Author

scabug commented Mar 23, 2011

@odersky said:
This should be fixed by now. Leaving to Paul to verify.

@scabug
Copy link
Author

scabug commented Mar 23, 2011

@odersky said:
I assume it's fixed anyway by r24551 (scala/scala@0444c81). If people still see problems, please reopen.

@scabug
Copy link
Author

scabug commented Mar 25, 2011

@paulp said:
The following program generates an invalid signature. It is again the primitive type parameter problem.

import scala.collection.mutable

object Test {
  def f[A, B] = new mutable.HashMap[A, B] { }
  def g[A] = new mutable.HashMap[A, Int] { }
  def h[A] = new mutable.HashMap[A, Int] { override def default(k: A) = 0 }
}

Methods f and g are marked bridge as expected.

public B scala.collection.mutable.HashMap.default(A) (bridge) in Test$$$$anon$$2
public B scala.collection.mutable.HashMap.default(A) (bridge) in Test$$$$anon$$3

The third class gets these:

bridge=false public int Test$$$$anon$$3.default(A)
bridge=true  public java.lang.Object Test$$$$anon$$3.default(java.lang.Object)

But Int is not a valid return type for the non-bridge method, because all code based on HashMap will be looking for an object on the stack.

As far as I can see, there isn't any valid way to implement a generically defined method and have primitives appearing in the signature in type parameter positions. My only idea (this is what I was talking about at villars) is to implement additional bridges, like:

public java.lang.Integer Test$$$$anon$$3.default(java.lang.Object)

But that may be infeasible.

So the signature bombs are still in there. I found this one in SeqLike.

private def occCounts[B](sq: Seq[B]): mutable.Map[B, Int] = {
  val occ = new mutable.HashMap[B, Int] { override def default(k: B) = 0 }
  for (y <- sq.seq) occ(y) += 1
  occ
}

@scabug
Copy link
Author

scabug commented Mar 25, 2011

@paulp said:
See also #4388.

@scabug
Copy link
Author

scabug commented Apr 3, 2011

@odersky said:
Can you elaborate? What goes wrong? I did not understand:

"But Int is not a valid return type for the non-bridge method, because all code based on HashMap? will be looking for an object on the stack."

I thought, that's what the bridge method is for?

-- Martin

@scabug
Copy link
Author

scabug commented Apr 4, 2011

Ramnivas Laddad (ramnivas) said:
Just to report that the original issue I was seeing is now gone. So from my point of view, this issue may be closed.

Thank you Paul and Martin. Paul, I emailed you separately around the AspectJ issue that had been taken care of.

@scabug
Copy link
Author

scabug commented Apr 8, 2011

@odersky said:
Any further input on that, or can we close it?

@scabug scabug closed this as completed Jun 5, 2012
@scabug scabug added the critical label Apr 7, 2017
@adriaanm
Copy link
Contributor

adriaanm commented Oct 2, 2017

I came across this bug today and was trying to reconstruct how we ended up in this point in the design space. I'll try to recap (below) and allow myself a @paulp question for the effort.

My only idea (this is what I was talking about at villars) is to implement additional bridges, like:

public java.lang.Integer Test$$$$anon$$3.default(java.lang.Object)

This seems like a good idea to me

But that may be infeasible.

Are you referring to the increase in bytecode, or something else? In terms of added bytecode, it doesn't seem too bad to add another bridge method (one constant pool entry and one 5-byte method).

EDIT: I could imagine that there's another issue with this, when you complicate your example a little bit, wrapping the type parameter in a type constructor:

trait T[A] {  def f: Option[A] }
class C extends T[Char] { def f: Option[Char] = Some('a') }

This would require bridge methods whose signatures erase to the same types (one for Option<Object> and one for Option<Character>, which both erase to Option of course.

Ignoring that snag, let me recap my understanding of the issue in the straightforward case:

Ideally, a Scala "auto-boxing primitive type" such as Char would erase to the actual primitive type whenever we are at the precise receiver type (e.g. a private method can just return a char), and to the corresponding boxed Java type (j.l.Character) when we can't rule out boxing (the super class signature and resulting bridge methods).

trait T[A] {  def f: A }
class C extends T[Char] { def f: Char = 'a' }

scalac currently (2.12.3) compiles this to:

public interface T<A> {  public A f(); }
public class C implements T<Object> {
    public char f() { return 'a'; }
    public /* bridge */ /* synthetic */ Object f() { return BoxesRunTime.boxToCharacter((char)this.f()); }
}

It would be nice if we could instead have:

public class C implements T<java.lang.Character> {
    public char f() { return 'a'; }
    // bridge needed to match more precise erasure of T<A> (where A = Char) to T<java.lang.Character>
    public /* bridge */ /* synthetic */ java.lang.Character f() { return BoxesRunTime.boxToCharacter((char)this.f()); 
    // needed to meet erased signature in T<A>, where A erases to Object without further constr
    public /* bridge */ /* synthetic */ Object f() { return (Object)this.f() }; 
}

I believe this is allowed at the bytecode level (overloading on result type), but, as I said above, it breaks down when you consider return types that include the type parameter in an erased position.

@odersky
Copy link

odersky commented Oct 2, 2017

I have no idea, I am afraid. I imagine "infeasible" means there's a lot of unknowns consequences of changing bridge generation.

@adriaanm
Copy link
Contributor

adriaanm commented Oct 2, 2017

Thanks, here's one show-stopper, I think (included as a sneaky EDIT above):

I could imagine that there's another issue with this, when you complicate your example a little bit, wrapping the type parameter in a type constructor:

trait T[A] {  def f: Option[A] }
class C extends T[Char] { def f: Option[Char] = Some('a') }

This would require bridge methods whose signatures erase to the same types (one for Option<Object> and one for Option<Character>, which both erase to Option of course.

@SethTisue
Copy link
Member

this matches my recollection of past discussions of what the show-stopper was.

@sjrd
Copy link
Member

sjrd commented Oct 2, 2017

That doesn't matter. In that case only emit the bridge for Option<Character>, and precisely because it erases to the same type as Option<Object>, it will also act as a valid bridge for Option<Object>.

@adriaanm
Copy link
Contributor

adriaanm commented Oct 2, 2017

Ah, good point, thanks!

@adriaanm
Copy link
Contributor

adriaanm commented Oct 2, 2017

Hmm, but wouldn't that only be valid for covariant type constructors?

@sjrd
Copy link
Member

sjrd commented Oct 2, 2017

Nope. The JVM doesn't know about variance. It won't care.

@adriaanm
Copy link
Contributor

adriaanm commented Oct 2, 2017

Yes, but isn't it unsound in general to put something in an invariant box at, say, Box[String] and "up"cast it to Box[Any]? I'll need to sit down and come up with an example, but something feels funny about this :-)

EDIT: I guess you're shielded from the unsoundness because you only get access to these bridges indirectly from well-typed code, they don't provide more surface area.

@paulp
Copy link

paulp commented Oct 2, 2017

The JVM doesn't know about variance, but java compilers do. This difficulty all originated with people trying to use ecj and javac to compile java source against scala bytecode. My guess is that everyone who cares about interop with java has learned they have to write the java-facing part in java if they want it to work.

@paulp
Copy link

paulp commented Oct 2, 2017

Are you referring to the increase in bytecode, or something else?

I probably meant the increase in bytecode, but I might have been picturing 2^N bridge methods for N > 1 when you have e.g. (A, A) => A.

@adriaanm
Copy link
Contributor

adriaanm commented Oct 2, 2017

Yes, but the combinatorial explosion doesn't happen because all combinations erase to the same type -- ignoring Arrays and specialization. (Specialization is an orthogonal concerns, at least for the purpose of reasoning about the added bytecode due to more precise generic signatures).

Assuming Sébastien is right (and he usually is ;-)), maybe we can improve the situation after all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants