Compatibility with Spark 3.2 + Scala 2.13 #7399

Open
emartinezs44 opened this issue Feb 1, 2023 · 13 comments

@emartinezs44

I would like to know if there are any plans to compile bigdl-dllib with Scala 2.13 and publish the artifact. I'm currently working with Scala 3 in some research, and I would like to present a paper at the next Scaladays using new language features plus some BigDL examples.

Thanks!

@qiuxin2012 qiuxin2012 self-assigned this Feb 2, 2023
@qiuxin2012
Contributor

@emartinezs44 Bad news: I tried to upgrade dllib to Scala 2.13, but one of dllib's dependencies, xgboost4j, doesn't support Scala 2.13. See dmlc/xgboost#6596

@jason-dai
Contributor

Maybe we can release an experimental dllib package for Spark 3.2 + Scala 2.13 without xgboost support.

@emartinezs44
Author

I see, this kind of dependency is a problem (bearing in mind that Spark 3.2 was released a year and a half ago). For now, the only solution is to create a profile that excludes the dependency and the Scala code referencing xgboost. I will try to create a branch in my fork with this config and see whether it compiles or whether there are other problems.

@qiuxin2012
Contributor

qiuxin2012 commented Feb 3, 2023

I have a branch, https://github.com/qiuxin2012/BigDL/tree/scala3, where I'm working on compatibility with Scala 2.13.
There are 100+ build errors; I'm trying to fix them.

@qiuxin2012
Contributor

@emartinezs44 I'm blocked by an "ambiguous implicit values" error, could you help?
You can reproduce the error message with ./make-dist.sh -P spark_3.x -P scala_2.13 -Dspark.version=3.2.3 on my branch https://github.com/qiuxin2012/BigDL/tree/scala3.

[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/DenseTensor.scala:239: ambiguous implicit values:
 both value evidence$1 in class DenseTensor of type scala.reflect.ClassTag[T]
 and value evidence$9 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]     DenseTensor.newWithStorage(this, storage, _storageOffset, _size, _stride, ev)
[ERROR]                               ^
[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/DenseTensor.scala:234: ambiguous implicit values:
 both value evidence$1 in class DenseTensor of type scala.reflect.ClassTag[T]
 and value evidence$9 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]   private[tensor] def this(storage: ArrayStorage[T])(implicit ev: TensorNumeric[T]) = {
[ERROR]                                                                                       ^
[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/DenseTensor.scala:249: ambiguous implicit values:
 both value evidence$1 in class DenseTensor of type scala.reflect.ClassTag[T]
 and value evidence$10 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]       DenseTensor.newWithStorage[T](this, storage, _storageOffset, _size, _stride, ev)
[ERROR]                                    ^
[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/DenseTensor.scala:245: ambiguous implicit values:
 both value evidence$1 in class DenseTensor of type scala.reflect.ClassTag[T]
 and value evidence$10 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]     if (storage != null) {
[ERROR]                          ^
[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/DenseTensor.scala:261: ambiguous implicit values:
 both value evidence$1 in class DenseTensor of type scala.reflect.ClassTag[T]
 and value evidence$11 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]     DenseTensor.newWithStorage[T](this, _storage, _storageOffset, _size, _stride, ev)
[ERROR]                                  ^
[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/DenseTensor.scala:253: ambiguous implicit values:
 both value evidence$1 in class DenseTensor of type scala.reflect.ClassTag[T]
 and value evidence$11 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]   private[tensor] def this(other: Tensor[T])(implicit ev: TensorNumeric[T]) = {
[ERROR]                                                                               ^
[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/QuantizedTensor.scala:118: ambiguous implicit values:
 both value evidence$1 in class QuantizedTensor of type scala.reflect.ClassTag[T]
 and value evidence$2 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]     this.desc = Desc.get(params, null, 0, null, null)
[ERROR]                         ^
[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/QuantizedTensor.scala:122: ambiguous implicit values:
 both value evidence$1 in class QuantizedTensor of type scala.reflect.ClassTag[T]
 and value evidence$3 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]     implicit ev: TensorNumeric[T]) = {
[ERROR]                                      ^
[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/QuantizedTensor.scala:126: ambiguous implicit values:
 both value evidence$1 in class QuantizedTensor of type scala.reflect.ClassTag[T]
 and value evidence$3 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]     this.desc = Desc.get(descParams, this.internalStorage, 0, this.maxOfRow, this.minOfRow)
[ERROR]                         ^
[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/QuantizedTensor.scala:130: ambiguous implicit values:
 both value evidence$1 in class QuantizedTensor of type scala.reflect.ClassTag[T]
 and value evidence$4 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]     descParams: DescParams)(implicit ev: TensorNumeric[T]) = {
[ERROR]                                                              ^
[ERROR] /home/xin/IdeaProjects/BigDL2/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/tensor/QuantizedTensor.scala:140: ambiguous implicit values:
 both value evidence$1 in class QuantizedTensor of type scala.reflect.ClassTag[T]
 and value evidence$4 of type scala.reflect.ClassTag[T]
 match expected type scala.reflect.ClassTag[T]
[ERROR]     this.desc = Desc.get(descParams, this.internalStorage, 0, this.maxOfRow, this.minOfRow)
[ERROR]                         ^

@emartinezs44
Author

let me check it out

@emartinezs44
Author

emartinezs44 commented Feb 6, 2023

Well, there is a problem related to the implicit evidences that the compiler creates.

This is with Scala 2.13

private[tensor] class DenseTensor[@specialized T: ClassTag]( ...

It includes an implicit evidence for the ClassTag[T]. Here is a simplified output of the scalac 'log-implicits' option for the same problem in a much simpler class:

private class CustomArray[@specialized T] extends scala.AnyRef {
   <paramaccessor> private[this] var storage: Array[T] = _;
   <accessor> <paramaccessor> private def storage: Array[T] = CustomArray.this.storage;
   <accessor> <paramaccessor> private def storage_=(x$1: Array[T]): Unit = CustomArray.this.storage = x$1;

   // Here the evidence created with the [@specialized T: ClassTag]
   implicit <synthetic> <paramaccessor> private[this] val evidence$1: scala.reflect.ClassTag[T] = _; 
   
   def <init>(storage: Array[T])(implicit evidence$1: scala.reflect.ClassTag[T]): CustomArray[T] = {
     CustomArray.super.<init>();
     ()
   };

   /** the constructor here includes another evidence because it is required. So we have two ClassTag evidences in scope **/ 
   private def <init>(storage: Array[T], length: Int)(implicit evidence$2: scala.reflect.ClassTag[T]): CustomArray[T] = {
     CustomArray.this.<init>(null)(evidence$2);
     /** The compiler does not know which of the two evidences to pass here, hence the error **/
     CustomArray.newWithStorage[T](this, storage)();
     ()
   }
 };
 object CustomArray extends scala.AnyRef {
   def <init>(): CustomArray.type = {
     CustomArray.super.<init>();
     ()
   };
   private def newWithStorage[@specialized(scala.Float, scala.Double) T](customArray: CustomArray[T], in: Array[T])(implicit evidence$3: scala.reflect.ClassTag[T]): CustomArray[T] = {
     customArray.storage_=(in);
     customArray
   }
 }
}

This kind of conflict was resolved in Scala 2.12 by taking the evidence from the constructor call and passing it directly to the function:

For Scala 2.12:

private class CustomArray[@specialized T] extends scala.AnyRef {
    <paramaccessor> private[this] var storage: Array[T] = _;
    <accessor> <paramaccessor> private def storage: Array[T] = CustomArray.this.storage;
    <accessor> <paramaccessor> private def storage_=(x$1: Array[T]): Unit = CustomArray.this.storage = x$1;
    implicit <paramaccessor> private[this] val ev: scala.reflect.ClassTag[T] = _;
    def <init>(storage: Array[T])(implicit ev: scala.reflect.ClassTag[T]): CustomArray[T] = {
      CustomArray.super.<init>();
      ()
    };
    private def <init>(storage: Array[T], length: Int)(implicit ev: scala.reflect.ClassTag[T]): CustomArray[T] = {
      CustomArray.this.<init>(null)(ev);
      /** Here the compiler resolves the dependency by explicitly passing the evidence when calling the 'newWithStorage' function */
      CustomArray.newWithStorage[T](this, storage)(ev);
      ()
    }
  };
  object CustomArray extends scala.AnyRef {
    def <init>(): CustomArray.type = {
      CustomArray.super.<init>();
      ()
    };
    private def newWithStorage[@specialized(scala.Float, scala.Double) T](customArray: CustomArray[T], in: Array[T])(implicit evidence$1: scala.reflect.ClassTag[T]): CustomArray[T] = {
      customArray.storage_=(in);
      customArray
    }
  }

SOLUTION:

This solution is valid for Scala 2.12, and consists of declaring the implicit in the class definition:

private[tensor] class DenseTensor[@specialized T](
  private[tensor] var _storage: ArrayStorage[T],
  private[tensor] var _storageOffset: Int,
  private[tensor] var _size: Array[Int],
  private[tensor] var _stride: Array[Int],
  var nDimension: Int)(implicit ev: TensorNumeric[T], /** HERE **/ ev100: ClassTag[T])
  extends Tensor[T] {

and in the constructors that create the conflict, such as:

private[tensor] def this(other: Tensor[T])(implicit ev: TensorNumeric[T])

changing the constructor definition to:

 /** including the implicit reference in the constructor definition **/
  private[tensor] def this(other: Tensor[T])(implicit ev: TensorNumeric[T], ev1: ClassTag[T]) = {
    this(null, 0, null, null, 0)
    Log4Error.unKnowExceptionError(other.isInstanceOf[DenseTensor[_]],
      "Only support dense tensor in this operation")
    val _storage = other.storage().asInstanceOf[ArrayStorage[T]]
    val _storageOffset = other.storageOffset() - 1
    val _size = other.size()
    val _stride = other.stride()
    /** pass this evidence explicitly **/
    DenseTensor.newWithStorage[T](this, _storage, _storageOffset, _size, _stride, ev)(ev1)
  }

This resolves the problems in the DenseTensor definition; there may be others related to this ambiguity problem. Take a look and tell me if you can handle it or if you need me to work in your branch.
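
For anyone who wants to try the idea outside BigDL, here is a minimal self-contained sketch of the same pattern. TaggedArray and fillStorage are hypothetical names made up for illustration (they are not BigDL classes), and the code is only a sketch of the approach described above: the ClassTag is declared as a named implicit parameter on both the class and the auxiliary constructor, and it is always passed explicitly, so the compiler never has to choose between two candidates.

```scala
import scala.reflect.ClassTag

// Hypothetical stand-in for DenseTensor: the ClassTag is a named implicit
// parameter instead of a context bound ([T: ClassTag]).
class TaggedArray[T](var storage: Array[T])(implicit tag: ClassTag[T]) {

  // The auxiliary constructor declares its own ClassTag parameter and hands
  // it over explicitly, so no implicit search is needed inside this body.
  def this(length: Int, value: T)(implicit ctorTag: ClassTag[T]) = {
    this(Array.empty[T](ctorTag))(ctorTag)
    TaggedArray.fillStorage(this, length, value)(ctorTag)
  }

  // Regular methods only see the class-level tag.
  def emptyLike: Array[T] = Array.empty[T](tag)
}

object TaggedArray {
  // Hypothetical helper playing the role of newWithStorage.
  def fillStorage[T](target: TaggedArray[T], length: Int, value: T)(
      implicit tag: ClassTag[T]): TaggedArray[T] = {
    target.storage = Array.fill(length)(value)(tag)
    target
  }
}
```

Calling `new TaggedArray[Int](3, 7)` goes through the auxiliary constructor and ends up with a storage array holding three copies of 7.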

@qiuxin2012
Contributor

Thanks for your help, I will try your solution later.
I have used a workaround to solve the ambiguity problem: I moved DenseTensor.newWithStorage[T] out of the constructor and defined an apply method to replace this constructor.

All compile errors are fixed now, and the unit tests are running.
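
A rough sketch of what such a workaround can look like, written against a toy class rather than the real DenseTensor. FactoryArray and its methods are hypothetical names, and the actual change in the branch may differ. The point is that a companion apply method has only one ClassTag in scope (its own), so the helper call resolves without ambiguity.

```scala
import scala.reflect.ClassTag

// Hypothetical toy class: only a primary constructor, no auxiliary one.
class FactoryArray[T: ClassTag] private (var storage: Array[T])

object FactoryArray {
  // Replaces the former auxiliary constructor. Inside apply the only
  // ClassTag[T] in scope is the one from this method's context bound.
  def apply[T: ClassTag](length: Int, value: T): FactoryArray[T] = {
    val result = new FactoryArray[T](Array.empty[T])
    newWithStorage(result, Array.fill(length)(value))
  }

  // Hypothetical counterpart of DenseTensor.newWithStorage.
  private def newWithStorage[T: ClassTag](
      target: FactoryArray[T], in: Array[T]): FactoryArray[T] = {
    target.storage = in
    target
  }
}
```

Callers then write `FactoryArray(2, 1.5)` instead of `new FactoryArray(...)`, which is the main visible difference of this design compared with keeping the auxiliary constructor.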

@qiuxin2012
Contributor

@emartinezs44 We have released a snapshot version built with Spark 3.2 and Scala 2.13 at
https://oss.sonatype.org/content/repositories/snapshots/com/intel/analytics/bigdl/bigdl-dllib-spark_3.2.3/2.3.0-SNAPSHOT/
Please let us know if you run into any errors.

@emartinezs44
Author

emartinezs44 commented Feb 7, 2023

Wonderful! I'll try it as soon as possible. Thank you very much!

@qiuxin2012
Contributor

The changes for Scala 2.13 support are merged to https://github.com/intel-analytics/BigDL/tree/scala-2.13

We still have some failures:

  1. The orca scala-2.13 build fails.
  2. The spark 2.4 scala2.10 build fails due to a scala-java Map conversion API mismatch.
  3. xgboost doesn't support Scala 2.13.

Please let us know if you have more requirements.
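
On item 2, the thread does not say which API is involved. One common source of this kind of mismatch, offered here only as an assumption, is that Scala 2.13 introduced scala.jdk.CollectionConverters, which does not exist in older Scala versions (those use scala.collection.JavaConverters instead). A minimal sketch of the 2.13 style, with MapConversionSketch as a hypothetical name:

```scala
// Scala 2.13+ only: this import does not resolve on older Scala versions,
// where scala.collection.JavaConverters._ plays the same role.
import scala.jdk.CollectionConverters._

object MapConversionSketch {
  // Java map to immutable Scala map, unboxing the values.
  def toScala(in: java.util.Map[String, Integer]): Map[String, Int] =
    in.asScala.map { case (k, v) => k -> v.intValue }.toMap

  // Scala map back to a Java map, boxing the values.
  def toJava(in: Map[String, Int]): java.util.Map[String, Integer] =
    in.map { case (k, v) => k -> Integer.valueOf(v) }.asJava
}
```

If this is the cause, code shared between the old and new builds would need version-specific source directories or a small compatibility shim, since a single import cannot satisfy both.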

@emartinezs44
Author

It seems to work well. There are some issues related to implicit conversions, which is to be expected given that the Scala 3 compiler is a new compiler. But it looks great; let me try it in more scenarios.

@s0t00524

Hi. xgboost4j now supports Scala 2.13.
I would like to hear if there are any plans to release bigdl-dllib compiled with Scala 2.13.

dmlc/xgboost#9099
https://mvnrepository.com/artifact/ml.dmlc/xgboost4j

Thank you!
