[SPARK-2208] fix zero shuffle wait time in fast machine #3380

Closed
wants to merge 1 commit into from

XuefengWu

No description provided.

@AmplabJenkins

Can one of the admins verify this patch?

@XuefengWu XuefengWu changed the title from "[SPARK-2208] fix zero shuffle wait time in fast machine by have a sleep when read shuffle data for hack blockManager" to "[SPARK-2208] fix zero shuffle wait time in fast machine" on Nov 20, 2014
@rxin
Contributor

rxin commented Nov 21, 2014

cc @aarondav

@aarondav
Contributor

I am not familiar with this issue or fix. Could you describe in a bit more detail what the intended solution is?

@XuefengWu
Author

This unit test tries to verify the local metrics, and shuffleReadMetrics measures the time spent fetching data.
The measured time depends on the number of partitions and the read time.
But on my desktop, when the number of partitions is the same as the number of cores, the shuffle read time is always zero. So I hacked the blockManager so that it always sleeps for 1 millisecond, to make sure the shuffleReadMetrics fetchWaitTime is always greater than zero.
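
To make the problem concrete, here is a minimal sketch of how a millisecond-resolution wait metric can stay at zero on a fast machine, and how a 1 ms delay per fetch forces it above zero. It does not use Spark: `FakeClock`, `ReadMetrics`, and `readBlocks` are hypothetical names made up for this illustration, and the clock is simulated so the result does not depend on the machine.

```scala
object FetchWaitSketch {
  // Hypothetical stand-in for a millisecond clock such as System.currentTimeMillis.
  final class FakeClock {
    private var nowMs: Long = 0L
    def currentTimeMillis(): Long = nowMs
    def advance(ms: Long): Unit = nowMs += ms
  }

  // Hypothetical metric holder: accumulates fetch wait time in whole milliseconds.
  final class ReadMetrics {
    var fetchWaitTime: Long = 0L
  }

  // Runs `fetch` once per block and adds the elapsed clock time to the metric.
  def readBlocks(numBlocks: Int, clock: FakeClock, metrics: ReadMetrics)(fetch: () => Unit): Unit = {
    for (_ <- 0 until numBlocks) {
      val start = clock.currentTimeMillis()
      fetch()
      metrics.fetchWaitTime += clock.currentTimeMillis() - start
    }
  }

  // Fast machine: every fetch finishes within the same millisecond tick.
  def fastMachineWait(): Long = {
    val clock = new FakeClock
    val metrics = new ReadMetrics
    readBlocks(4, clock, metrics)(() => ())
    metrics.fetchWaitTime
  }

  // Same reads, but each fetch is slowed by 1 ms (simulated by advancing the clock).
  def slowedWait(): Long = {
    val clock = new FakeClock
    val metrics = new ReadMetrics
    readBlocks(4, clock, metrics)(() => clock.advance(1))
    metrics.fetchWaitTime
  }
}
```

With four blocks, `fastMachineWait()` yields 0 and `slowedWait()` yields 4, so an assertion like `fetchWaitTime > 0` only holds reliably in the second case.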

@XuefengWu
Author

@aarondav any more suggestions?

@@ -19,12 +19,19 @@ package org.apache.spark.scheduler

import java.util.concurrent.Semaphore

import akka.actor.ActorSystem
Contributor

nit: import ordering should abide by the style guide

@aarondav
Contributor

aarondav commented Dec 1, 2014

Actually, I realized it might simplify this patch a lot if instead of extending BlockManager, we extend SortShuffleManager which returns a ShuffleBlockManager that injects a wait into getBlockData.

This way you could just do

val myConf = conf.duplicate
  .set("spark.shuffle.manager", classOf[SlowShuffleManager].getName)
sc = new SparkContext(myConf)

rather than doing all this work creating SparkEnvs and such, which will be annoying any time we add a new parameter to BlockManager or SparkEnv.
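
As a rough illustration of that idea, the sketch below wraps a block lookup in a decorator that waits before delegating, and picks the manager by the class name stored under a config key. It is not Spark code: `BlockResolver`, `ShuffleManager`, `InMemoryResolver`, `SlowResolver`, `DefaultShuffleManager`, and `createManager` are simplified, hypothetical stand-ins, and the name-to-factory registry is an assumption used here in place of whatever mechanism Spark uses to load the configured class.

```scala
object SlowShuffleSketch {
  // Hypothetical, simplified interfaces; not Spark's real ones.
  trait BlockResolver {
    def getBlockData(blockId: String): Array[Byte]
  }
  trait ShuffleManager {
    def blockResolver: BlockResolver
  }

  // Trivial resolver that returns the block id's bytes as the "data".
  final class InMemoryResolver extends BlockResolver {
    def getBlockData(blockId: String): Array[Byte] = blockId.getBytes("UTF-8")
  }

  // Decorator: sleeps for delayMs, then delegates to the wrapped resolver.
  final class SlowResolver(underlying: BlockResolver, delayMs: Long) extends BlockResolver {
    def getBlockData(blockId: String): Array[Byte] = {
      Thread.sleep(delayMs)
      underlying.getBlockData(blockId)
    }
  }

  class DefaultShuffleManager extends ShuffleManager {
    def blockResolver: BlockResolver = new InMemoryResolver
  }

  // Test-only manager: same behavior, but every block read waits 1 ms first.
  class SlowShuffleManager extends DefaultShuffleManager {
    override def blockResolver: BlockResolver =
      new SlowResolver(super.blockResolver, delayMs = 1L)
  }

  // Assumption: a plain registry keyed by class name, instead of reflective loading.
  val managers: Map[String, () => ShuffleManager] = Map(
    classOf[DefaultShuffleManager].getName -> (() => new DefaultShuffleManager),
    classOf[SlowShuffleManager].getName -> (() => new SlowShuffleManager)
  )

  // Picks the manager named under "spark.shuffle.manager", falling back to the default.
  def createManager(conf: Map[String, String]): ShuffleManager = {
    val name = conf.getOrElse("spark.shuffle.manager", classOf[DefaultShuffleManager].getName)
    managers(name)()
  }
}
```

A test would then only change one config entry, for example `createManager(Map("spark.shuffle.manager" -> classOf[SlowShuffleSketch.SlowShuffleManager].getName))`, and leave the rest of the setup untouched.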

Note that I would also refactor this whole suite to construct a single conf once:
val conf = new SparkConf().setMaster("local").setAppName("SparkListenerSuite")

and in each test to just use
sc = new SparkContext(conf)

(rather than in before)

This would allow constructing special SparkContexts without having to have 2 simultaneously like you do with sc2.

@XuefengWu
Author

@aarondav, thanks for pointing out the style issue. I think setting SlowShuffleManager is a better idea too. Thanks.

@XuefengWu XuefengWu closed this Dec 1, 2014
asfgit pushed a commit that referenced this pull request Mar 24, 2016
## What changes were proposed in this pull request?

A fix for local metrics tests that can fail on fast machines.
This is probably what is suggested here #3380 by aarondav?

## How was this patch tested?

CI Tests

Cheers

Author: Joan <[email protected]>

Closes #11747 from joan38/SPARK-2208-Local-metrics-tests.

4 participants