[BUG] Cast string to decimal won't return null for out of range floats #10908

Open
thirtiseven opened this issue May 27, 2024 · 0 comments
Labels
bug Something isn't working


Describe the bug
Casting a string to decimal won't return null for out-of-range values: the GPU returns the value itself instead of returning null (ANSI off) or throwing an exception (ANSI on) as the CPU does.

For example, 1.23E+21 cast to decimal(15,-5) returns 1.2300000000000000E+21 on the GPU, but on the CPU it yields null (ANSI off) or SparkArithmeticException: Decimal(expanded, 1.23E+21, 3, -19) cannot be represented as Decimal(15, -5) (ANSI on).
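The value really is out of range: at scale -5, 1.23E+21 needs an unscaled value of 1.23×10^16 (17 digits), which exceeds the 15-digit precision, so the CPU cast overflows. A minimal sketch of the expected CPU-side behavior, assuming a spark-shell session (the full repro below goes through Parquet so the scan also runs on the GPU; this only illustrates the CPU semantics):

```scala
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DataTypes

// decimal(15,-5) requires the legacy negative-scale flag.
spark.conf.set("spark.sql.legacy.allowNegativeScaleOfDecimal", "true")
val decType = DataTypes.createDecimalType(15, -5)

import spark.implicits._
val df = Seq("1.23E+21").toDF("col")

// ANSI off: the CPU cast overflows and yields null.
spark.conf.set("spark.sql.ansi.enabled", "false")
df.select(col("col").cast(decType)).show(false)

// ANSI on: the CPU cast throws SparkArithmeticException instead.
spark.conf.set("spark.sql.ansi.enabled", "true")
df.select(col("col").cast(decType)).show(false)
```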

Steps/Code to reproduce bug

scala> val df = Seq("1.23E+21").toDF("col")
df: org.apache.spark.sql.DataFrame = [col: string]

scala> df.write.mode("OVERWRITE").parquet("TEMP")
24/05/27 10:35:34 WARN GpuOverrides:
*Exec <DataWritingCommandExec> will run on GPU
  *Output <InsertIntoHadoopFsRelationCommand> will run on GPU
  ! <LocalTableScanExec> cannot run on GPU because GPU does not currently support the operator class org.apache.spark.sql.execution.LocalTableScanExec
    @Expression <AttributeReference> col#4 could run on GPU


scala> import org.apache.spark.sql.types._
import org.apache.spark.sql.types._

scala> spark.conf.set("spark.sql.legacy.allowNegativeScaleOfDecimal", "true")

scala>

scala> spark.conf.set("spark.sql.ansi.enabled", "true")

scala> val decType = DataTypes.createDecimalType(15, -5)
decType: org.apache.spark.sql.types.DecimalType = DecimalType(15,-5)

scala> spark.read.parquet("TEMP").select(col("col").cast(decType)).show(false)
24/05/27 10:36:42 WARN GpuOverrides:
!Exec <CollectLimitExec> cannot run on GPU because the Exec CollectLimitExec has been disabled, and is disabled by default because Collect Limit replacement can be slower on the GPU, if huge number of rows in a batch it could help by limiting the number of rows transferred from GPU to CPU. Set spark.rapids.sql.exec.CollectLimitExec to true if you wish to enable it
  @Partitioning <SinglePartition$> could run on GPU
  *Exec <ProjectExec> will run on GPU
    *Expression <Alias> cast(cast(col#7 as decimal(15,-5)) as string) AS col#13 will run on GPU
      *Expression <Cast> cast(cast(col#7 as decimal(15,-5)) as string) will run on GPU
        *Expression <Cast> cast(col#7 as decimal(15,-5)) will run on GPU
    *Exec <FileSourceScanExec> will run on GPU

+----------------------+
|col                   |
+----------------------+
|1.2300000000000000E+21|
+----------------------+


scala> spark.conf.set("spark.rapids.sql.enabled", "false")

scala> spark.read.parquet("TEMP").select(col("col").cast(decType)).show(false)
java.lang.AssertionError: assertion failed:
  Decimal$DecimalIsFractional
     while compiling: <console>
        during phase: globalPhase=terminal, enteringPhase=jvm
     library version: version 2.12.15
    compiler version: version 2.12.15
  reconstructed args: -classpath /home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-24.08.0-SNAPSHOT-cuda11.jar -Yrepl-class-based -Yrepl-outdir /tmp/spark-026c13e7-fb35-46ff-a225-22ac468cd6e1/repl-47525fc2-08a4-4b4b-b850-a00c64aa2c3b

  last tree to typer: TypeTree(class Byte)
       tree position: line 6 of <console>
            tree tpe: Byte
              symbol: (final abstract) class Byte in package scala
   symbol definition: final abstract class Byte extends  (a ClassSymbol)
      symbol package: scala
       symbol owners: class Byte
           call site: constructor $eval in object $eval in package $line22

== Source file context for tree position ==

     3
     4 object $eval {
     5   lazy val $result = $line22.$read.INSTANCE.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.res5
     6   lazy val $print: _root_.java.lang.String =  {
     7     $line22.$read.INSTANCE.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw
     8
     9 ""
	at scala.reflect.internal.SymbolTable.throwAssertionError(SymbolTable.scala:185)
	at scala.reflect.internal.Symbols$Symbol.completeInfo(Symbols.scala:1525)
	at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1514)
	at scala.reflect.internal.Symbols$Symbol.flatOwnerInfo(Symbols.scala:2353)
	at scala.reflect.internal.Symbols$ClassSymbol.companionModule0(Symbols.scala:3346)
	at scala.reflect.internal.Symbols$ClassSymbol.companionModule(Symbols.scala:3348)
	at scala.reflect.internal.Symbols$ModuleClassSymbol.sourceModule(Symbols.scala:3487)
	at scala.reflect.internal.Symbols.$anonfun$forEachRelevantSymbols$1$adapted(Symbols.scala:3802)
	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
	at scala.reflect.internal.Symbols.markFlagsCompleted(Symbols.scala:3799)
	at scala.reflect.internal.Symbols.markFlagsCompleted$(Symbols.scala:3805)
	at scala.reflect.internal.SymbolTable.markFlagsCompleted(SymbolTable.scala:28)
	at scala.reflect.internal.pickling.UnPickler$Scan.finishSym$1(UnPickler.scala:324)
	at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:342)
	at scala.reflect.internal.pickling.UnPickler$Scan.readSymbolRef(UnPickler.scala:645)
	at scala.reflect.internal.pickling.UnPickler$Scan.readType(UnPickler.scala:413)
	at scala.reflect.internal.pickling.UnPickler$Scan.$anonfun$readSymbol$10(UnPickler.scala:357)
	at scala.reflect.internal.pickling.UnPickler$Scan.at(UnPickler.scala:188)
	at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:357)
	at scala.reflect.internal.pickling.UnPickler$Scan.$anonfun$run$1(UnPickler.scala:96)
	at scala.reflect.internal.pickling.UnPickler$Scan.run(UnPickler.scala:88)
	at scala.reflect.internal.pickling.UnPickler.unpickle(UnPickler.scala:47)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.unpickleOrParseInnerClasses(ClassfileParser.scala:1186)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.parseClass(ClassfileParser.scala:468)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$2(ClassfileParser.scala:161)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$1(ClassfileParser.scala:147)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.parse(ClassfileParser.scala:130)
	at scala.tools.nsc.symtab.SymbolLoaders$ClassfileLoader.doComplete(SymbolLoaders.scala:343)
	at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:250)
	at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.load(SymbolLoaders.scala:269)
	at scala.reflect.internal.Symbols$Symbol.exists(Symbols.scala:1104)
	at scala.reflect.internal.Symbols$Symbol.toOption(Symbols.scala:2609)
	at scala.tools.nsc.interpreter.IMain.translateSimpleResource(IMain.scala:340)
	at scala.tools.nsc.interpreter.IMain$TranslatingClassLoader.findAbstractFile(IMain.scala:354)
	at scala.reflect.internal.util.AbstractFileClassLoader.findResource(AbstractFileClassLoader.scala:76)
	at java.lang.ClassLoader.getResource(ClassLoader.java:1089)
	at java.lang.ClassLoader.getResourceAsStream(ClassLoader.java:1300)
	at scala.reflect.internal.util.RichClassLoader$.classAsStream$extension(ScalaClassLoader.scala:89)
	at scala.reflect.internal.util.RichClassLoader$.classBytes$extension(ScalaClassLoader.scala:81)
	at scala.reflect.internal.util.ScalaClassLoader.classBytes(ScalaClassLoader.scala:131)
	at scala.reflect.internal.util.ScalaClassLoader.classBytes$(ScalaClassLoader.scala:131)
	at scala.reflect.internal.util.AbstractFileClassLoader.classBytes(AbstractFileClassLoader.scala:41)
	at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:70)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:405)
	at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.java:40)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:89)
	at org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:317)
	at org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:8618)
	at org.codehaus.janino.UnitCompiler.reclassifyName(UnitCompiler.java:8838)
	at org.codehaus.janino.UnitCompiler.reclassifyName(UnitCompiler.java:8529)
	at org.codehaus.janino.UnitCompiler.reclassify(UnitCompiler.java:8388)
	at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:6900)
	at org.codehaus.janino.UnitCompiler.access$14600(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$22$2$1.visitAmbiguousName(UnitCompiler.java:6518)
	at org.codehaus.janino.UnitCompiler$22$2$1.visitAmbiguousName(UnitCompiler.java:6515)
	at org.codehaus.janino.Java$AmbiguousName.accept(Java.java:4429)
	at org.codehaus.janino.UnitCompiler$22$2.visitLvalue(UnitCompiler.java:6515)
	at org.codehaus.janino.UnitCompiler$22$2.visitLvalue(UnitCompiler.java:6511)
	at org.codehaus.janino.Java$Lvalue.accept(Java.java:4353)
	at org.codehaus.janino.UnitCompiler$22.visitRvalue(UnitCompiler.java:6511)
	at org.codehaus.janino.UnitCompiler$22.visitRvalue(UnitCompiler.java:6490)
	at org.codehaus.janino.Java$Rvalue.accept(Java.java:4321)
	at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:6490)
	at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:9110)
	at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:5055)
	at org.codehaus.janino.UnitCompiler.access$9100(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$16.visitMethodInvocation(UnitCompiler.java:4482)
	at org.codehaus.janino.UnitCompiler$16.visitMethodInvocation(UnitCompiler.java:4455)
	at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:5286)
	at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:4455)
	at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:5683)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2581)
	at org.codehaus.janino.UnitCompiler.access$2700(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$6.visitLocalVariableDeclarationStatement(UnitCompiler.java:1506)
	at org.codehaus.janino.UnitCompiler$6.visitLocalVariableDeclarationStatement(UnitCompiler.java:1490)
	at org.codehaus.janino.Java$LocalVariableDeclarationStatement.accept(Java.java:3712)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
	at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1573)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1559)
	at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1496)
	at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1490)
	at org.codehaus.janino.Java$Block.accept(Java.java:2969)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2486)
	at org.codehaus.janino.UnitCompiler.access$1900(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1498)
	at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1490)
	at org.codehaus.janino.Java$IfStatement.accept(Java.java:3140)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
	at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1573)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1559)
	at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1496)
	at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1490)
	at org.codehaus.janino.Java$Block.accept(Java.java:2969)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1661)
	at org.codehaus.janino.UnitCompiler.access$2000(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$6.visitForStatement(UnitCompiler.java:1499)
	at org.codehaus.janino.UnitCompiler$6.visitForStatement(UnitCompiler.java:1490)
	at org.codehaus.janino.Java$ForStatement.accept(Java.java:3187)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
	at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1573)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1559)
	at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1496)
	at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1490)
	at org.codehaus.janino.Java$Block.accept(Java.java:2969)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1848)
	at org.codehaus.janino.UnitCompiler.access$2200(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$6.visitWhileStatement(UnitCompiler.java:1501)
	at org.codehaus.janino.UnitCompiler$6.visitWhileStatement(UnitCompiler.java:1490)
	at org.codehaus.janino.Java$WhileStatement.accept(Java.java:3245)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
	at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1573)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3420)
	at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1362)
	at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1335)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:807)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:975)
	at org.codehaus.janino.UnitCompiler.access$700(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:392)
	at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:384)
	at org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1445)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:384)
	at org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1312)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:833)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:410)
	at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:389)
	at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:384)
	at org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1594)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:384)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:362)
	at org.codehaus.janino.UnitCompiler.access$000(UnitCompiler.java:226)
	at org.codehaus.janino.UnitCompiler$1.visitCompilationUnit(UnitCompiler.java:336)
	at org.codehaus.janino.UnitCompiler$1.visitCompilationUnit(UnitCompiler.java:333)
	at org.codehaus.janino.Java$CompilationUnit.accept(Java.java:363)
	at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:333)
	at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:235)
	at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:464)
	at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:314)
	at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:237)
	at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:205)
	at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:80)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:1490)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1587)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1584)
	at org.sparkproject.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
	at org.sparkproject.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
	at org.sparkproject.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
	at org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
	at org.sparkproject.guava.cache.LocalCache.get(LocalCache.java:4000)
	at org.sparkproject.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
	at org.sparkproject.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:1437)
	at org.apache.spark.sql.execution.WholeStageCodegenExec.liftedTree1$1(WholeStageCodegenExec.scala:726)
	at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:725)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:194)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:232)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:229)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:190)
	at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:340)
	at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:473)
	at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:459)
	at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:48)
	at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3868)
	at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2863)
	at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:3858)
	at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3856)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:109)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3856)
	at org.apache.spark.sql.Dataset.head(Dataset.scala:2863)
	at org.apache.spark.sql.Dataset.take(Dataset.scala:3084)
	at org.apache.spark.sql.Dataset.getRows(Dataset.scala:288)
	at org.apache.spark.sql.Dataset.showString(Dataset.scala:327)
	at org.apache.spark.sql.Dataset.show(Dataset.scala:810)
	at org.apache.spark.sql.Dataset.show(Dataset.scala:787)
	at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:27)
	at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:31)
	at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:33)
	at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:35)
	at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:37)
	at $line22.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:39)
	at $line22.$read$$iw$$iw$$iw$$iw.<init>(<console>:41)
	at $line22.$read$$iw$$iw$$iw.<init>(<console>:43)
	at $line22.$read$$iw$$iw.<init>(<console>:45)
	at $line22.$read$$iw.<init>(<console>:47)
	at $line22.$read.<init>(<console>:49)
	at $line22.$read$.<init>(<console>:53)
	at $line22.$read$.<clinit>(<console>)
	at $line22.$eval$.$print$lzycompute(<console>:7)
	at $line22.$eval$.$print(<console>:6)
	at $line22.$eval.$print(<console>)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:747)
	at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
	at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:568)
	at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
	at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
	at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
	at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:567)
	at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:594)
	at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:564)
	at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:865)
	at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:733)
	at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:435)
	at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:456)
	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:239)
	at org.apache.spark.repl.Main$.doMain(Main.scala:78)
	at org.apache.spark.repl.Main$.main(Main.scala:58)
	at org.apache.spark.repl.Main.main(Main.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
error: error while loading Decimal, class file '/home/haoyangl/spark-3.3.0-bin-hadoop3.2/jars/spark-catalyst_2.12-3.3.0.jar(org/apache/spark/sql/types/Decimal.class)' is broken
(class java.lang.RuntimeException/error reading Scala signature of Decimal.class: assertion failed:
  Decimal$DecimalIsFractional
     while compiling: <console>
        during phase: globalPhase=terminal, enteringPhase=jvm
     library version: version 2.12.15
    compiler version: version 2.12.15
  reconstructed args: -classpath /home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-24.08.0-SNAPSHOT-cuda11.jar -Yrepl-class-based -Yrepl-outdir /tmp/spark-026c13e7-fb35-46ff-a225-22ac468cd6e1/repl-47525fc2-08a4-4b4b-b850-a00c64aa2c3b

  last tree to typer: TypeTree(class Byte)
       tree position: line 6 of <console>
            tree tpe: Byte
              symbol: (final abstract) class Byte in package scala
   symbol definition: final abstract class Byte extends  (a ClassSymbol)
      symbol package: scala
       symbol owners: class Byte
           call site: constructor $eval in object $eval in package $line22

== Source file context for tree position ==

     3
     4 object $eval {
     5   lazy val $result = res5
     6   lazy val $print: _root_.java.lang.String =  {
     7     $iw
     8
     9 "" )
24/05/27 10:36:56 ERROR Executor: Exception in task 0.0 in stage 5.0 (TID 5)
org.apache.spark.SparkArithmeticException: Decimal(expanded, 1.23E+21, 3, -19) cannot be represented as Decimal(15, -5). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
	at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala:108)
	at org.apache.spark.sql.errors.QueryExecutionErrors.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:364)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:136)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
24/05/27 10:36:56 WARN TaskSetManager: Lost task 0.0 in stage 5.0 (TID 5) (spark-haoyang executor driver): org.apache.spark.SparkArithmeticException: Decimal(expanded, 1.23E+21, 3, -19) cannot be represented as Decimal(15, -5). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
	at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala:108)
	at org.apache.spark.sql.errors.QueryExecutionErrors.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:364)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:136)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

24/05/27 10:36:56 ERROR TaskSetManager: Task 0 in stage 5.0 failed 1 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 5.0 failed 1 times, most recent failure: Lost task 0.0 in stage 5.0 (TID 5) (spark-haoyang executor driver): org.apache.spark.SparkArithmeticException: Decimal(expanded, 1.23E+21, 3, -19) cannot be represented as Decimal(15, -5). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
	at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala:108)
	at org.apache.spark.sql.errors.QueryExecutionErrors.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:364)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:136)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

Driver stacktrace:
  at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2672)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2608)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2607)
  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2607)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1182)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1182)
  at scala.Option.foreach(Option.scala:407)
  at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1182)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2860)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2802)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2791)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:952)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2228)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2249)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2268)
  at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:506)
  at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:459)
  at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:48)
  at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3868)
  at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2863)
  at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:3858)
  at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
  at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3856)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:109)
  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3856)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2863)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:3084)
  at org.apache.spark.sql.Dataset.getRows(Dataset.scala:288)
  at org.apache.spark.sql.Dataset.showString(Dataset.scala:327)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:810)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:787)
  ... 49 elided
Caused by: org.apache.spark.SparkArithmeticException: Decimal(expanded, 1.23E+21, 3, -19) cannot be represented as Decimal(15, -5). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
  at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala:108)
  at org.apache.spark.sql.errors.QueryExecutionErrors.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala)
  at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
  at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
  at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
  at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:364)
  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:136)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:750)

Expected behavior
Results on CPU and GPU should match: null for the out-of-range value when ANSI mode is off, and a SparkArithmeticException when ANSI mode is on.
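A minimal consistency check, sketched against the repro above (`castRows` is a hypothetical helper; it assumes the `decType` and `TEMP` Parquet path from the transcript, and with ANSI off both runs are expected to produce a single null row):

```scala
// Hypothetical helper: rerun the cast from the repro with the plugin toggled.
def castRows(gpuEnabled: Boolean): Seq[org.apache.spark.sql.Row] = {
  spark.conf.set("spark.rapids.sql.enabled", gpuEnabled.toString)
  spark.read.parquet("TEMP").select(col("col").cast(decType)).collect().toSeq
}

spark.conf.set("spark.sql.ansi.enabled", "false")
val onGpu = castRows(gpuEnabled = true)
val onCpu = castRows(gpuEnabled = false)
// Currently fails: the GPU returns 1.23E+21 while the CPU returns null.
assert(onGpu == onCpu, s"GPU result $onGpu should match CPU result $onCpu")
```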

@thirtiseven thirtiseven added bug Something isn't working ? - Needs Triage Need team to review and classify labels May 27, 2024
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Jun 4, 2024