-
Notifications
You must be signed in to change notification settings - Fork 115
Add spark session extension for Hyperspace #504
Conversation
src/main/scala/com/microsoft/hyperspace/HyperspaceSparkSessionExtension.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/HyperspaceSparkSessionExtension.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/HyperspaceSparkSessionExtension.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/index/execution/BucketUnionStrategy.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/index/rules/ApplyHyperspace.scala
Outdated
Show resolved
Hide resolved
c62f5b6
to
ab14466
Compare
src/main/scala/com/microsoft/hyperspace/index/IndexConstants.scala
Outdated
Show resolved
Hide resolved
923872a
to
84b2969
Compare
0cd2a5b
to
c0d439f
Compare
c0d439f
to
35d4ffe
Compare
src/main/scala/com/microsoft/hyperspace/HyperspaceSparkSessionExtension.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/index/IndexConstants.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/HyperspaceSparkSessionExtension.scala
Outdated
Show resolved
Hide resolved
src/test/scala/com/microsoft/hyperspace/HyperspaceExtensionTest.scala
Outdated
Show resolved
Hide resolved
82820db
to
daf9d08
Compare
src/main/scala/com/microsoft/hyperspace/HyperspaceSparkSessionExtension.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/HyperspaceSparkSessionExtension.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/HyperspaceSparkSessionExtension.scala
Show resolved
Hide resolved
add configure to control enabling hyperspace add dummy rule to avoid different behavior of Extensions / apply hyperspace add test for hyperspace extension
89cdb02
to
fa80dba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/main/scala/com/microsoft/hyperspace/HyperspaceSparkSessionExtension.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/HyperspaceSparkSessionExtension.scala
Outdated
Show resolved
Hide resolved
override def apply(extensions: SparkSessionExtensions): Unit = { | ||
extensions.injectOptimizerRule { sparkSession => | ||
// Enable Hyperspace to leverage indexes. | ||
sparkSession.enableHyperspace() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the new model, enableHyperspace
only exists for backward compatibility. I think it's better to factor out the rule insertion code out of this method and invoke the method here and from enableHyperspace
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@clee704 Do you mean this part?
if (!sparkSession.sessionState.experimentalMethods.extraOptimizations.contains(
ApplyHyperspace)) {
sparkSession.sessionState.experimentalMethods.extraOptimizations ++=
ApplyHyperspace :: Nil
}
if (!sparkSession.sessionState.experimentalMethods.extraStrategies.contains(
BucketUnionStrategy)) {
sparkSession.sessionState.experimentalMethods.extraStrategies ++=
BucketUnionStrategy :: Nil
}
Where should I put this code because package object hyperspace
is quite customer side interfaces, so not sure if it is ok to create a function like
package object hyperspace {
/**
* Hyperspace-specific implicit class on SparkSession.
*/
implicit class Implicits(sparkSession: SparkSession) {
def enableHyperspace(): SparkSession = {
HyperspaceConf.setHyperspaceApplyEnabled(sparkSession, true)
addOptimizationsIfNeeded()
sparkSession
}
private def addOptimizationsIfNeeded(): Unit = {
if (!sparkSession.sessionState.experimentalMethods.extraOptimizations.contains(
ApplyHyperspace)) {
sparkSession.sessionState.experimentalMethods.extraOptimizations ++=
ApplyHyperspace :: Nil
}
if (!sparkSession.sessionState.experimentalMethods.extraStrategies.contains(
BucketUnionStrategy)) {
sparkSession.sessionState.experimentalMethods.extraStrategies ++=
BucketUnionStrategy :: Nil
}
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can put addOptimizationsIfNeeded()
in a companion object HyperspaceSparkSessionExtension
and call the method from here and enableHyperspace()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@clee704 can you check whether I implemented as you expected?
src/main/scala/com/microsoft/hyperspace/index/rules/ApplyHyperspace.scala
Outdated
Show resolved
Hide resolved
* | ||
* @param sparkSession Spark session that will use Hyperspace | ||
*/ | ||
def addOptimizationsIfNeeded(sparkSession: SparkSession): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to locate the function in package.scala
What is the context for this pull request?
What changes were proposed in this pull request?
Added a feature to enable Hyperspace with SparkSessionExtention
Does this PR introduce any user-facing change?
Yes. Users now can enable Hyperspace as follows
or
How was this patch tested?
Manually with spark-shell. If an automated test is required, I will add it.