Refactor Stage info code between Q/P tools #971
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>

Contributes to NVIDIA#477

This code change aims at bringing the Q/P tools' handling of stages and their accumulators to common ground. There are a couple of fixes in this change, including:

- Capture the accumulator IDs of a stage during the stage-completion event.
- Fix the construction of `MLFunctions`.
- Fix the implementation of `jobAndStageMetricsAggregation`, which was inefficient because it iterated multiple times over the tasks list.
- Remove a redundant data structure that maps between accumulators and stages.
@@ -88,7 +88,7 @@ object GenerateDot {
     val accumSummary = accums.map { a =>
       Seq(a.sqlID, a.accumulatorId, a.total)
     }
-    val accumIdToStageId = app.accumIdToStageId
+    val accumIdToStageId = app.stageManager.reduceAccumMapping()
This is a hack to get `GenerateDot` to work with the 1-to-M map. `GenerateDot` is rarely used, and fixing the implementation to be 1-to-M is going to bloat the PR.
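For context, a minimal sketch of what such a reduction might look like, assuming the manager keeps a 1-to-M map from accumulator ID to stage IDs; the names and the pick-one policy here are assumptions, not the PR's actual code:

```scala
import scala.collection.mutable

// Hypothetical sketch: collapse a 1-to-M accumulator->stages mapping into
// the 1-to-1 shape that the old GenerateDot code path still expects.
class StageModelManagerSketch {
  // 1-to-M: one accumulator ID can be referenced by multiple stages.
  private val accumToStages = new mutable.HashMap[Long, Set[Int]]()

  def addAccumToStage(accumId: Long, stageId: Int): Unit = {
    accumToStages(accumId) = accumToStages.getOrElse(accumId, Set.empty) + stageId
  }

  // "Reduce" each set of stages to a single representative stage ID.
  // Picking the smallest ID is an arbitrary policy for this sketch.
  def reduceAccumMapping(): mutable.HashMap[Long, Int] = {
    val reduced = new mutable.HashMap[Long, Int]()
    accumToStages.foreach { case (accumId, stageIds) =>
      if (stageIds.nonEmpty) reduced(accumId) = stageIds.min
    }
    reduced
  }
}
```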
@@ -32,7 +32,8 @@ class CompareApplications(apps: Seq[ApplicationInfo]) extends Logging {
   def findMatchingStages(): (Seq[CompareProfileResults], Seq[CompareProfileResults]) = {
     val normalizedByAppId = apps.map { app =>
       val normalized = app.sqlPlans.mapValues { plan =>
-        SparkPlanInfoWithStage(plan, app.accumIdToStageId).normalizeForStageComparison
+        SparkPlanInfoWithStage(plan,
+          app.stageManager.reduceAccumMapping()).normalizeForStageComparison
`reduceAccumMapping()` is a hack to get `GenerateDot` to work with the 1-to-M map. `GenerateDot` is rarely used, and fixing the implementation to be 1-to-M is going to bloat the PR.
        tasksInJob.map(_.sw_recordsWritten).sum,
        tasksInJob.map(_.sw_writeTime).sum
      ))
    // first get all stage aggregated levels
The old code iterated over all the jobs to get their stages, then iterated over all the tasks within each stage to aggregate; this produced all the job rows. Then it repeated the same sequence to produce all the stage rows. This is clearly very time consuming. Instead, the new code does the following (see the sketch after this list):
- Loop over all the stages and aggregate them.
- Cache the results in a hashMap.
- Loop over all the jobs and aggregate the values cached in the hashMap.
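A minimal sketch of that single-pass flow, using illustrative types and names rather than the PR's actual classes:

```scala
import scala.collection.mutable

// Illustrative stand-in for the tools' stage-level aggregation row.
case class StageAgg(stageId: Int, taskDuration: Long)

def aggregateJobsAndStages(
    stageToTaskDurations: Map[Int, Seq[Long]],
    jobToStages: Map[Int, Seq[Int]]): Unit = {
  // 1. Aggregate every stage exactly once and cache the result.
  val stageCache = new mutable.HashMap[Int, StageAgg]()
  stageToTaskDurations.foreach { case (stageId, durations) =>
    stageCache(stageId) = StageAgg(stageId, durations.sum)
  }
  // 2. Build the job rows from the cached stage aggregates instead of
  //    re-walking the task list once per job.
  jobToStages.foreach { case (jobId, stageIds) =>
    val jobDuration = stageIds.flatMap(stageCache.get).map(_.taskDuration).sum
    println(s"job=$jobId aggregatedTaskDuration=$jobDuration")
  }
}
```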
@@ -1,2 +1,2 @@
 App Name,App ID,Recommendation,Estimated GPU Speedup,Estimated GPU Duration,Estimated GPU Time Saved,SQL DF Duration,SQL Dataframe Task Duration,App Duration,GPU Opportunity,Executor CPU Time Percent,SQL Ids with Failures,Unsupported Read File Formats and Types,Unsupported Write Data Format,Complex Types,Nested Complex Types,Potential Problems,Longest SQL Duration,NONSQL Task Duration Plus Overhead,Unsupported Task Duration,Supported SQL DF Task Duration,Task Speedup Factor,App Duration Estimated,Unsupported Execs,Unsupported Expressions,Estimated Job Frequency (monthly)
-"Rapids Spark Profiling Tool Unit Tests","local-1622043423018","Not Recommended",1.08,4479.65,392.34,1306,14353,4872,558,62.67,"","","JSON","","","",1306,4477,8214,6139,3.36,true,"SerializeFromObject;Execute InsertIntoHadoopFsRelationCommand json;DeserializeToObject;Filter;MapElements;Scan","",30
+"Rapids Spark Profiling Tool Unit Tests","local-1622043423018","Not Recommended",1.06,4564.93,307.06,1306,14353,4872,472,62.67,"","","JSON","","","",1306,4477,9164,5189,2.86,true,"SerializeFromObject;Execute InsertIntoHadoopFsRelationCommand json;DeserializeToObject;Filter;MapElements;Scan","",30
I manually compared the qualification output before and after. This PR fixed a bug, resulting in more Execs being linked to their Stage. That caused the transitions and the supported duration to differ from the previous code.
@@ -208,7 +208,6 @@ class ApplicationInfo(
   var allSQLMetrics: ArrayBuffer[SQLMetricInfoCase] = ArrayBuffer[SQLMetricInfoCase]()
   var sqlPlanMetricsAdaptive: ArrayBuffer[SQLPlanMetricsCase] = ArrayBuffer[SQLPlanMetricsCase]()

-  val accumIdToStageId: mutable.HashMap[Long, Int] = new mutable.HashMap[Long, Int]()
Redundant hashMap, because `AppBase` already defines such a 1-to-M mapping.
      val existingStages = app.accumulatorToStages.getOrElse(accumId, Set.empty)
      app.accumulatorToStages.put(accumId, existingStages + event.stageInfo.stageId)
    }
    app.getOrCreateStage(event.stageInfo)
Encapsulate the initialization and the calculation of duration within the retrieval of the stage object.
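A hedged sketch of that encapsulation; `getOrCreateStage` appears in the diff above, but the class and field names below are stand-ins, not the PR's actual code:

```scala
import scala.collection.mutable
import org.apache.spark.scheduler.StageInfo

// Sketch: the retrieval path both creates the model on first sight of the
// stage and refreshes the duration from Spark's StageInfo on every event.
class StageStoreSketch {
  private val stages = new mutable.HashMap[(Int, Int), StageModelSketch]()

  def getOrCreateStage(info: StageInfo): StageModelSketch = {
    val stage = stages.getOrElseUpdate(
      (info.stageId, info.attemptNumber()), new StageModelSketch(info))
    stage.updateInfo(info)
    stage
  }
}

class StageModelSketch(var info: StageInfo) {
  var duration: Option[Long] = None

  def updateInfo(newInfo: StageInfo): Unit = {
    info = newInfo
    // Duration is derived, not read: completionTime - submissionTime.
    duration = for {
      start <- info.submissionTime
      end <- info.completionTime
    } yield end - start
  }
}
```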
@@ -226,8 +225,8 @@ abstract class AppBase(
   }

     if (mlOps.nonEmpty) {
-      Some(MLFunctions(Some(appId.toString), stageInfo.info.stageId, mlOps,
-        stageInfo.duration.getOrElse(0)))
+      Some(MLFunctions(appId, stageModel.sId, mlOps,
This was a bug: the stageId was used instead of the appId.
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
 * Spark-information.
 */
@param @field @getter @setter @beanGetter @beanSetter
class Calculated(desc: String = "") extends scala.annotation.Annotation
I have my reservations about how useful this is going to be, or at least how useful it should be. Ideally, most things read from Spark are named appropriately enough that they should be fairly obvious. Otherwise, everything is calculated, so you would just end up with annotations everywhere.
I agree. My thought is that this annotation could help, at least temporarily, since we won't have to rename the fields everywhere. I had a hard time understanding where each field comes from and whether we need to revisit how it is deduced.
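For instance, a hypothetical usage of the `Calculated` annotation from the diff above; the field and description here are made up for illustration:

```scala
// Hypothetical usage: mark a field that the tools derive themselves
// rather than read verbatim from Spark's event log.
class StageDurations {
  @Calculated("computed as completionTime - submissionTime")
  var duration: Option[Long] = None
}
```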
    stage.duration = ProfileUtils.optionLongMinusOptionLong(stage.completionTime,
      stage.info.submissionTime)
    val stageAccumulatorIds = event.stageInfo.accumulables.values.map { m => m.id }.toSeq
    stageAccumulatorIds.foreach { accumId =>
I'm a bit confused here, as I thought part of the change was to fix the tracking of accumulators on stage completed, but this looks like we already were? Were we just not using them, or was it the taskEnd one we weren't adding?
We were not adding them in the taskEnd. The code is removed from `EventProcessorBase` and moved to the `StageManager` during the creation of the `StageModel` instance. `stage.completionTime` and `stage.failureReason` are redundant because we were already capturing Spark's stageInfo. In the new version, we will read them from Spark's `StageInfo`.
  var duration: Option[Long] = None

  // Whenever an event is triggered, the object should update the Stage info.
  private def updatedInfo(newSInfo: StageInfo): Unit = {
The function should be `updateInfo` (without the "d") if it's actively doing something.
Fixed
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
  // We keep track of the attemptId to allow improvement down the road if we decide to handle
  // different Attempts.
  // - 1st level maps between [Int: StageId -> 2nd Level]
  // - 2nd level maps between [Int:, StageModel]
Missing attemptId in the description.
fixed
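For reference, the structure described in that comment is a two-level lookup. A rough sketch, with a stand-in `StageModel` class:

```scala
import scala.collection.mutable

object StageLookupSketch {
  // Stand-in for the real StageModel class.
  class StageModel

  // 1st level: stageId -> 2nd level; 2nd level: attemptId -> StageModel.
  val stageIdToInfo: mutable.HashMap[Int, mutable.HashMap[Int, StageModel]] =
    new mutable.HashMap[Int, mutable.HashMap[Int, StageModel]]()

  // Lookup/creation for a given (stageId, attemptId) pair.
  def getOrCreate(stageId: Int, attemptId: Int): StageModel =
    stageIdToInfo
      .getOrElseUpdate(stageId, new mutable.HashMap[Int, StageModel]())
      .getOrElseUpdate(attemptId, new StageModel)
}
```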
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
Thanks @amahussein!
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>

Contributes to #980

This code change aims at bringing the Q/P tools' handling of stages and their accumulators to common ground. There are a couple of fixes in this change, including the implementation of `jobAndStageMetricsAggregation`, which was not efficient in iterating multiple times over the tasks list.

Changes:

- Handle `StageCompleted`/`StageSubmitted` events to capture the accumulator IDs.
- Fix `Analysis.jobAndStageMetricsAggregation()`, because it was unnecessarily iterating several times over all the tasks. The new implementation iterates only once over the tasks, then uses the cached values for each stage to calculate the job-aggregated metrics.
- Add new annotations (a sketch of their assumed shape follows below):
  - `Calculated`: to distinguish between fields that are loaded from Spark vs. the ones that are calculated by the Tools.
  - `WallClock`: to be used to distinguish fields that represent wallClock vs. DF durations (aggregations on tasks).
  - `Since`: similar to Spark, to make it easy to know when a specific piece of logic was added to the tools.
- Fix a bug in the construction of `MLFunctions`: the constructor mistakenly used the stageId instead of the appID.
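The `Calculated` definition appears in a diff earlier in this conversation; the `WallClock` and `Since` annotations presumably follow the same shape. The following is a sketch of that assumption, not the merged code:

```scala
import scala.annotation.meta._

// Sketches modeled on the Calculated annotation shown in the diff above;
// the merged definitions of WallClock and Since may differ.
@param @field @getter @setter @beanGetter @beanSetter
class WallClock(desc: String = "") extends scala.annotation.Annotation

@param @field @getter @setter @beanGetter @beanSetter
class Since(version: String) extends scala.annotation.Annotation
```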