[SPARK-35384][SQL] Improve performance for InvokeLike.invoke #32527

Closed
wants to merge 2 commits into from

Conversation

Member

@sunchao sunchao commented May 12, 2021

What changes were proposed in this pull request?

Change the map call in InvokeLike.invoke to a while loop to improve performance, following the Spark style guide.

Why are the changes needed?

InvokeLike.invoke, which is used in the non-codegen path for Invoke and StaticInvoke, currently uses map to evaluate arguments:

val args = arguments.map(e => e.eval(input).asInstanceOf[Object])
if (needNullCheck && args.exists(_ == null)) {
  // return null if one of arguments is null
  null
} else { 
  ...

which is pretty expensive if the method itself is trivial. We can change it to a plain while loop.
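
As a rough, standalone illustration of the two styles (not the actual Spark code), the sketch below evaluates a list of arguments first with map and then with a while loop. `Expr`, `Const`, `evalWithMap`, and `evalWithLoop` are hypothetical stand-ins for Catalyst expressions and the argument evaluation inside InvokeLike.invoke.

```scala
object ArgEvalSketch {
  // Hypothetical stand-in for a Catalyst expression; not Spark's Expression API.
  trait Expr { def eval(input: Any): Any }
  final case class Const(value: Any) extends Expr {
    def eval(input: Any): Any = value
  }

  // Before: map builds a new collection and calls a closure per element.
  def evalWithMap(arguments: Seq[Expr], input: Any): Seq[Object] =
    arguments.map(e => e.eval(input).asInstanceOf[Object])

  // After: a plain while loop writing into a preallocated array.
  def evalWithLoop(arguments: IndexedSeq[Expr], input: Any): Array[Object] = {
    val out = new Array[Object](arguments.length)
    var i = 0
    val len = arguments.length
    while (i < len) {
      out(i) = arguments(i).eval(input).asInstanceOf[Object]
      i += 1
    }
    out
  }
}
```

Both helpers produce the same values; the loop version simply avoids the intermediate collection and the per-element closure call, which matters most when the invoked method itself is cheap.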

[Image: Screen Shot 2021-05-12 at 12 19 59 AM]

Benchmark results from V2FunctionBenchmark show an improvement of up to roughly 3x (for example, java_long_add_magic drops from 357.4 ns to 109.1 ns per row):

Before

 OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
 Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 scalar function (long + long) -> long, result_nullable = false codegen = false:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 --------------------------------------------------------------------------------------------------------------------------------------------------------------
 native_long_add                                                                         36506          36656         251         13.7          73.0       1.0X
 java_long_add_default                                                                   47151          47540         370         10.6          94.3       0.8X
 java_long_add_magic                                                                    178691         182457        1327          2.8         357.4       0.2X
 java_long_add_static_magic                                                             177151         178258        1151          2.8         354.3       0.2X

After

 OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
 Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 scalar function (long + long) -> long, result_nullable = false codegen = false:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 --------------------------------------------------------------------------------------------------------------------------------------------------------------
 native_long_add                                                                         29897          30342         568         16.7          59.8       1.0X
 java_long_add_default                                                                   40628          41075         664         12.3          81.3       0.7X
 java_long_add_magic                                                                     54553          54755         182          9.2         109.1       0.5X
 java_long_add_static_magic                                                              55410          55532         127          9.0         110.8       0.5X

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests.

@github-actions github-actions bot added the SQL label May 12, 2021

SparkQA commented May 13, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42996/


SparkQA commented May 13, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42996/

Member Author

sunchao commented May 13, 2021

Member

@maropu maropu left a comment

Looks fine otherwise.

var i = 0
val len = arguments.length
while (i < len) {
  val e = arguments(i)
Member

It looks like we don't need this intermediate val?

evaluatedArgs(i) = arguments(i).eval(input).asInstanceOf[Object]

Member Author

yea let me remove it

  evaluatedArgs(i) = e.eval(input).asInstanceOf[Object]
  i += 1
}
if (needNullCheck && evaluatedArgs.exists(_ == null)) {
Member

nit: my IDE suggests .exists(_ == null) -> .contains(null)

Member Author

the exists part is not related to this PR but I'm happy to change it :)
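
For reference, the two spellings of the null check behave the same on an `Array[Object]`. The sketch below is a minimal, hypothetical illustration; `NullCheckSketch` and its helper names are not part of Spark.

```scala
object NullCheckSketch {
  // Original form: a predicate that tests each element against null.
  def hasNullViaExists(args: Array[Object]): Boolean = args.exists(_ == null)

  // Suggested form: a direct membership test for null.
  def hasNullViaContains(args: Array[Object]): Boolean = args.contains(null)
}
```

The change is stylistic: both forms stop scanning at the first null element.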

var i = 0
val len = arguments.length
while (i < len) {
  evaluatedArgs(i) = arguments(i).eval(input).asInstanceOf[Object]
Member

@viirya viirya May 13, 2021

Why do we need to keep evaluatedArgs as a (lazy) val?

Member Author

You mean just use val?

Member

Don't we evaluate the arguments each time invoke is called? Why not just have val evaluatedArgs: Array[Object] = new Array[Object](arguments.length) in invoke?

Member

@dongjoon-hyun dongjoon-hyun May 13, 2021

I guess it aims to reuse the Array[Object] itself and only change the values in the array.

Member Author

Yea even though we evaluate arguments for each invoke call we can reuse the same array to store the results of evaluation. I guess it's better than allocating a new Array[Object] for each input row.
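
The trade-off discussed here can be sketched as follows. This is a simplified illustration under stated assumptions, not the Spark implementation: `Evaluator` and `argFns` are hypothetical, with plain functions standing in for the argument expressions.

```scala
object ReusedBufferSketch {
  // Hypothetical evaluator; `argFns` stands in for the argument expressions.
  final class Evaluator(argFns: IndexedSeq[Any => Any]) {
    // Allocated once, on first use, and overwritten for every input row.
    private lazy val evaluatedArgs: Array[Object] = new Array[Object](argFns.length)

    def evalArgs(input: Any): Array[Object] = {
      var i = 0
      val len = argFns.length
      while (i < len) {
        evaluatedArgs(i) = argFns(i)(input).asInstanceOf[Object]
        i += 1
      }
      // The same array instance is returned on every call.
      evaluatedArgs
    }
  }
}
```

One consequence of this design is that a caller must not hold on to the returned array across rows, and a single evaluator instance should not be shared between threads, since every call overwrites the same buffer.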


SparkQA commented May 13, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43000/


SparkQA commented May 13, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43000/

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. The improvement looks nice!


SparkQA commented May 13, 2021

Test build #138475 has finished for PR 32527 at commit 9ce2542.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Thank you, @sunchao and all! Merged to master for Apache Spark 3.2.0.

  // return null if one of arguments is null
  null
} else {
  val ret = try {
-   method.invoke(obj, args: _*)
+   method.invoke(obj, evaluatedArgs: _*)
  } catch {
Contributor

Can we also improve the last piece?

      val boxedClass = ScalaReflection.typeBoxedJavaMapping.get(dataType)
      if (boxedClass.isDefined) {
        boxedClass.get.cast(ret)
      } else {
        ret
      }

We can create a function for it:

private lazy val boxing: Any => Any =
  ScalaReflection.typeBoxedJavaMapping.get(dataType)
    .map(cls => (v: Any) => cls.cast(v))
    .getOrElse(identity)
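
A self-contained sketch of this idea, resolving the map lookup once and reusing the resulting function, might look like the following. `boxedMapping` is a hypothetical stand-in for ScalaReflection.typeBoxedJavaMapping (keyed here by a plain string rather than a DataType), and `boxingFor` is an illustrative name.

```scala
object BoxingFnSketch {
  // Hypothetical stand-in for ScalaReflection.typeBoxedJavaMapping.
  val boxedMapping: Map[String, Class[_]] = Map(
    "long" -> classOf[java.lang.Long],
    "int" -> classOf[java.lang.Integer]
  )

  // Do the map lookup once; each later call only runs the returned function.
  def boxingFor(typeName: String): Any => Any =
    boxedMapping.get(typeName) match {
      case Some(cls) => (v: Any) => cls.cast(v.asInstanceOf[Object])
      case None => (v: Any) => v
    }
}
```

Storing the result of `boxingFor` in a lazy val would move the hash lookup out of the per-row path, which is the cost the suggestion aims to remove.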

Contributor

We can do a similar thing in Invoke.eval.

Member Author

Yea, let me try it. In the profiling after this PR, HashMap.get takes 7.82% of the entire invoke call, so it seems worthwhile to do this.

Member Author

I'm not sure we can do a similar thing in Invoke.eval, though, since obj in obj.getClass.getMethod(functionName, argClasses: _*) is different for each call.

Contributor

You are right. Another idea: objects read from an InternalRow are always of the same class, so we can avoid the per-call method lookup:

@transient lazy val method = {
  val cls = targetObject.dataType match {
    case ObjectType(cls) => cls
    case StringType => classOf[UTF8String]
    case _: DecimalType => classOf[Decimal]
    ...
  }
  findMethod(cls, encodedFunctionName, argClasses)
}

Member Author

Hmm, I'm not sure. Looking at the usages of Invoke, targetObject.dataType is usually ObjectType (for instance, in ScalarFunction we wrap the UDF into a Literal with ObjectType), so I'm curious how useful this would be and when we'd use StringType/DecimalType for the targetObject.

Looking at the profiling result for Invoke.eval, it is now dominated by InvokeLike.invoke:

[Image: Screen Shot 2021-05-13 at 9 44 19 AM]

Although this is somewhat unrelated to the above as V2FunctionBenchmark (and ScalarFunction) uses ObjectType for Invoke so it's already handled by the current code:

  @transient lazy val method = targetObject.dataType match {
    case ObjectType(cls) =>
      Some(findMethod(cls, encodedFunctionName, argClasses))
    case _ => None
  }

we may need new benchmarks if we decide to do this.
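
The pattern under discussion, resolving the reflective Method once when the target class is known and falling back to a per-call lookup otherwise, can be sketched with plain Java reflection. `Invoker` and its parameters are hypothetical and only loosely mirror the shape of Invoke; this is not the Spark code.

```scala
import java.lang.reflect.Method

object MethodLookupSketch {
  // Hypothetical invoker: `knownClass` plays the role of an ObjectType's class.
  final class Invoker(knownClass: Option[Class[_]], name: String, argClasses: Seq[Class[_]]) {
    // Resolved at most once, and only when the class is known up front.
    lazy val method: Option[Method] =
      knownClass.map(_.getMethod(name, argClasses: _*))

    def invoke(obj: AnyRef, args: Object*): Object = {
      // Fall back to a lookup on the runtime class when nothing was cached.
      val m = method.getOrElse(obj.getClass.getMethod(name, argClasses: _*))
      m.invoke(obj, args: _*)
    }
  }
}
```

With a known class, each call pays only for the `Option` check and the reflective invocation; the fallback branch repeats `getMethod` on every call.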

Contributor

Makes sense. For a UDF, it's just an extra method.isDefined check, which is probably not a big issue.


SparkQA commented May 13, 2021

Test build #138480 has finished for PR 32527 at commit 2831f9c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
