
Add support for self-contained profiling #10870

Merged: 5 commits into NVIDIA:branch-24.06 on May 29, 2024
Conversation

@jlowe (Member) commented May 22, 2024

Contributes to #10632. Depends on NVIDIA/spark-rapids-jni#2066.

Adds the ability for the RAPIDS Accelerator to collect CUDA and NVTX profiling data with no extra requirements on the execution environment (i.e., there is no need to install Nsight Systems or the CUDA toolkit). Profiling is enabled by setting spark.rapids.profile.pathPrefix to a URI prefix where profile data will be written (e.g., an HDFS or other distributed filesystem or object store path). Profile data is written ZSTD-compressed by default and, once downloaded and decompressed, can be converted into other formats via the spark_rapids_profile_converter tool provided by NVIDIA/spark-rapids-jni#2066.
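For illustration, here is a minimal sketch of enabling the profiler from a Spark application in Scala. Only spark.rapids.profile.pathPrefix comes from this description; the application name and the hdfs:// path are placeholders, and spark.plugins is just the usual RAPIDS Accelerator plugin setting.

import org.apache.spark.sql.SparkSession

// Minimal sketch: enable self-contained profiling by pointing the profiler at a
// distributed-filesystem prefix. spark.rapids.profile.pathPrefix is the config
// named above; the app name and hdfs:// path are placeholders.
val spark = SparkSession.builder()
  .appName("profiled-app")
  .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
  .config("spark.rapids.profile.pathPrefix", "hdfs:///tmp/rapids-profiles")
  .getOrCreate()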

By default only executor 0 is profiled from startup until shutdown, but there are a number of configs to control which executors are profiled and what portions of the executor lifespan are profiled. Time ranges (in seconds since startup) or job- and stage-specific ranges are supported, where the profiler will be started and stopped accordingly.
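As a rough sketch of what narrowing the profiling scope could look like: apart from spark.rapids.profile.pathPrefix, the key names and range syntax below are assumptions inferred from this description and the RapidsConf constants visible in the diff (e.g. PROFILE_STAGES), not confirmed documentation.

import org.apache.spark.SparkConf

// Hedged sketch only: key names other than pathPrefix, and the range syntax,
// are assumptions inferred from the description above.
val conf = new SparkConf()
  .set("spark.rapids.profile.pathPrefix", "hdfs:///tmp/rapids-profiles")
  // Profile executors 0 and 2 through 4 instead of only executor 0.
  .set("spark.rapids.profile.executors", "0,2-4")
  // Capture 60 through 120 seconds after executor startup...
  .set("spark.rapids.profile.timeRanges", "60-120")
  // ...but not together with job- or stage-specific ranges: as discussed below,
  // combining them fails fast at startup with UnsupportedOperationException.
  //.set("spark.rapids.profile.stages", "3-5")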

@jlowe self-assigned this May 22, 2024
@revans2 (Collaborator) left a comment:

My main concern is really about testing. I know that this feature is not done, and it is not ready for customers yet. But I also am a bit concerned about code rot. Are there any plans to have some kind of regular testing to be sure that this still works?

stageRanges = new RangeConfMatcher(conf, RapidsConf.PROFILE_STAGES)
driverPollMillis = conf.profileDriverPollMillis
if (timeRanges.isDefined && (stageRanges.nonEmpty || jobRanges.nonEmpty)) {
  throw new UnsupportedOperationException(
Collaborator:

Does the job crash if this happens? Would it be better to just have a priority on which one to use, then WARN the user if it happens, and possibly also add a warning to the file being written? Especially since other places will eat exceptions from the timer and just log them. But I understand why that might be the choice here, so I can see it either way.

Member Author:

Yes, it crashes the job on startup. I wanted it to fail fast rather than silently ignore an explicit request by the user that may take a long time and ultimately not capture the type of profile they want.

private var isProfileActive = false

def init(pluginCtx: PluginContext, conf: RapidsConf): Unit = {
  require(writer.isEmpty, "Already initialized")
Collaborator:

Do we want lots of warnings if profiling is enabled? I am a bit concerned about someone leaving this on in production by accident.

Member Author:

There is a warning logged on the driver when profiling starts for each profiled executor, so the user can easily locate the profiles. I'll also add a warning log on the executor, both to indicate that profiling is enabled when all we have is the executor log and to provide another pointer to the profile output path.
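For illustration, a hedged sketch of the kind of executor-side notice described here; the object, method, and message wording are placeholders rather than the code actually added in this PR.

import org.slf4j.LoggerFactory

// Hedged sketch of an executor-side startup notice; names and wording are
// illustrative only.
object ProfilerStartupNotice {
  private val log = LoggerFactory.getLogger(getClass)

  def announce(executorId: String, outPath: String): Unit = {
    log.warn(s"Profiling is enabled on executor $executorId; " +
      s"profile data will be written to $outPath")
  }
}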

private def openOutput(codec: Option[CompressionCodec]): WritableByteChannel = {
  // Ask the driver plugin for the Hadoop configuration; the ProfileInitMsg also
  // lets the driver log where this executor's profile is being written.
  val hadoopConf = pluginCtx.ask(ProfileInitMsg(executorId, outPath.toString))
    .asInstanceOf[SerializableConfiguration].value
  val fs = outPath.getFileSystem(hadoopConf)
Collaborator:

Can we have a follow-on issue to make this a more generic service/API? We have a number of other debug tools that want to write data out. It would be nice if we were consistent in how we got the configs to do so, and even nicer if we didn't have to worry about it for each debug operation.

Member Author:

Filed #10892.

@revans2 previously approved these changes May 23, 2024

@revans2 (Collaborator) left a comment:

Sorry, I forgot to hit the approve button on the review. My comments are all nits and would not block this from going in.

@jlowe (Member, Author) commented May 24, 2024

Filed #10893 to track adding automated testing.


import com.nvidia.spark.rapids.jni.Profiler
import org.apache.hadoop.fs.Path
import org.apache.hadoop.ipc.CallerContext
Collaborator:

Could we use an alternative to CallerContext? It was only introduced in Hadoop 2.8 and Hadoop 3.0, so it may fail for users on older Hadoop versions.

Member Author:

I don't know of an alternative to retrieve the job ID associated with a task because it's not in the TaskContext. Note that we're not using CallerContext unless job ranges are specified, so it shouldn't be an issue unless job-level profiling is requested. Spark bundles the Hadoop client jars, so this will only be an issue if Spark was built with the hadoop-2 profile which compiles against Hadoop 2.7.4 by default.

I'll change this to something that is more tolerant of CallerContext not being there and fail fast if job ranges are requested when it's not available. If you know of a better way to get the job ID of a task on the executor, I'm happy to explore it.
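A hedged sketch of what tolerating a missing CallerContext via reflection could look like; the object and method names below are illustrative, and only the Hadoop class name and its static getCurrent() accessor (available in Hadoop 2.8+/3.x) are assumed.

// Hedged sketch: detect org.apache.hadoop.ipc.CallerContext reflectively so
// job-range profiling can fail fast on Hadoop clients older than 2.8/3.0.
// Names here are illustrative, not the PR's actual implementation.
object CallerContextSupport {
  private lazy val callerContextClass: Option[Class[_]] =
    try {
      Some(Class.forName("org.apache.hadoop.ipc.CallerContext"))
    } catch {
      case _: ClassNotFoundException => None
    }

  /** True when the Hadoop client on the classpath provides CallerContext. */
  def isAvailable: Boolean = callerContextClass.isDefined

  /** The current caller context string, if the class exists and a context is set. */
  def currentContext: Option[String] = callerContextClass.flatMap { cls =>
    Option(cls.getMethod("getCurrent").invoke(null)).map(_.toString)
  }
}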

@sameerz added the tools label May 28, 2024
@jlowe (Member, Author) commented May 28, 2024

build

@winningsix merged commit def596c into NVIDIA:branch-24.06 on May 29, 2024
44 checks passed
@jlowe deleted the profiler branch May 29, 2024 14:52
sperlingxx pushed a commit to sperlingxx/spark-rapids that referenced this pull request May 30, 2024
* Add support for self-contained profiling

Signed-off-by: Jason Lowe <[email protected]>

* Use Scala regex, add executor-side logging on profile startup/shutdown

* Use reflection to handle potentially missing Hadoop CallerContext

* scala 2.13 fix

---------

Signed-off-by: Jason Lowe <[email protected]>