
have registering a metrics source handle newly added metrics #53

Closed
bavardage opened this issue Nov 14, 2016 · 2 comments
bavardage commented Nov 14, 2016

Current behaviour:
Spark only sees the metrics that are in your source's metrics registry at the time of registration.

Possible fix:
use a metrics listener to perform the registration.
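A minimal sketch of the listener idea, assuming the Dropwizard `MetricRegistryListener` API that backs Spark's metrics (the `sparkRegistry` parameter is illustrative, not the actual Spark internals):

```scala
import com.codahale.metrics.{Counter, Gauge, MetricRegistry, MetricRegistryListener}

// Illustrative sketch: forward every metric added to the source's registry
// on to Spark's registry, instead of copying only the metrics that exist
// at registration time. Dropwizard replays already-present metrics to a
// newly added listener, so this covers both existing and future metrics.
def mirrorRegistry(sourceRegistry: MetricRegistry, sparkRegistry: MetricRegistry): Unit = {
  sourceRegistry.addListener(new MetricRegistryListener.Base {
    override def onGaugeAdded(name: String, gauge: Gauge[_]): Unit =
      sparkRegistry.register(name, gauge)
    override def onCounterAdded(name: String, counter: Counter): Unit =
      sparkRegistry.register(name, counter)
    // ...and likewise for meters, histograms, and timers
  })
}
```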

ash211 added a commit that referenced this issue Feb 16, 2017
* Introduce blocking submit to kubernetes by default

Two new configuration settings:
- spark.kubernetes.submit.waitAppCompletion
- spark.kubernetes.report.interval

* Minor touchups

* More succinct logging for pod state

* Fix import order

* Switch to watch-based logging

* Spaces in comma-joined volumes, labels, and containers

* Use CountDownLatch instead of SettableFuture

* Match parallel ConfigBuilder style

* Disable logging in fire-and-forget mode

Which is enabled with spark.kubernetes.submit.waitAppCompletion=false
(default: true)

* Additional log line for when application is launched

* Minor wording changes

* More logging

* Drop log to DEBUG
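The two settings above are passed like any other Spark conf; a hypothetical invocation (application jar and interval value are placeholders):

```sh
# Block until the driver pod completes, reporting state at the given interval
spark-submit \
  --conf spark.kubernetes.submit.waitAppCompletion=true \
  --conf spark.kubernetes.report.interval=10s \
  my-app.jar

# Fire-and-forget: return as soon as the application is launched
spark-submit \
  --conf spark.kubernetes.submit.waitAppCompletion=false \
  my-app.jar
```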
mccheah pushed a commit that referenced this issue Apr 27, 2017
* Introduce blocking submit to kubernetes by default
@robert3005

@bavardage This is already handled via the Source trait. In metrics.properties you can either provide a class to enable, or you can call SparkEnv.metricsSystem.registerSource. I also made #214, which will let you use SharedMetricRegistries.
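For reference, a minimal sketch of that approach (the source name and metric are illustrative; `Source` here means Spark's `org.apache.spark.metrics.source.Source` trait, and the visibility of these internals varies by Spark version and build):

```scala
import com.codahale.metrics.MetricRegistry
import org.apache.spark.SparkEnv
import org.apache.spark.metrics.source.Source

class MySource extends Source {
  override val sourceName: String = "mySource"
  override val metricRegistry: MetricRegistry = new MetricRegistry
  // Metrics can be registered on metricRegistry up front...
  metricRegistry.counter("requests")
}

// ...and the source itself registered programmatically:
SparkEnv.get.metricsSystem.registerSource(new MySource)
```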

@robert3005

Fixed in #214

mattsills pushed a commit to mattsills/spark that referenced this issue Jul 17, 2020
…in optimizations


### What changes were proposed in this pull request?
This is a followup of apache#26434

This PR uses one special shuffle reader for skew join, so that we only have one join after optimization. To do that, this PR
1. adds a very general `CustomShuffledRowRDD` which supports all kinds of partition arrangements.
2. moves the logic of coalescing shuffle partitions into a util function and calls it during skew join optimization, fully decoupling it from the `ReduceNumShufflePartitions` rule. Intertwining skew join handling with `ReduceNumShufflePartitions` is too complicated, as you would need to account for split partitions whose sizes already exceed the target size.
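For context, the skew join optimization this builds on is gated behind adaptive-execution settings; in upstream Spark 3.x these look roughly like the following (names and defaults may differ in this fork):

```scala
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")
// A partition is treated as skewed if it is larger than this factor times
// the median partition size and also exceeds the byte threshold below.
spark.conf.set("spark.sql.adaptive.skewJoin.skewedPartitionFactor", "5")
spark.conf.set("spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes", "256MB")
```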

### Why are the changes needed?
The current skew join optimization has a serious performance issue: the size of the query plan depends on the number and size of skewed partitions.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
existing tests

test UI manually:
![image](https://user-images.githubusercontent.com/3182036/74357390-cfb30480-4dfa-11ea-83f6-825d1b9379ca.png)

explain output
```
AdaptiveSparkPlan(isFinalPlan=true)
+- OverwriteByExpression org.apache.spark.sql.execution.datasources.noop.NoopTable$403a2ed5, [AlwaysTrue()], org.apache.spark.sql.util.CaseInsensitiveStringMap1f
   +- *(5) SortMergeJoin(skew=true) [key1#2L], [key2#6L], Inner
      :- *(3) Sort [key1#2L ASC NULLS FIRST], false, 0
      :  +- SkewJoinShuffleReader 2 skewed partitions with size(max=5 KB, min=5 KB, avg=5 KB)
      :     +- ShuffleQueryStage 0
      :        +- Exchange hashpartitioning(key1#2L, 200), true, [id=palantir#53]
      :           +- *(1) Project [(id#0L % 2) AS key1#2L]
      :              +- *(1) Filter isnotnull((id#0L % 2))
      :                 +- *(1) Range (0, 100000, step=1, splits=6)
      +- *(4) Sort [key2#6L ASC NULLS FIRST], false, 0
         +- SkewJoinShuffleReader 2 skewed partitions with size(max=5 KB, min=5 KB, avg=5 KB)
            +- ShuffleQueryStage 1
               +- Exchange hashpartitioning(key2#6L, 200), true, [id=palantir#64]
                  +- *(2) Project [((id#4L % 2) + 1) AS key2#6L]
                     +- *(2) Filter isnotnull(((id#4L % 2) + 1))
                        +- *(2) Range (0, 100000, step=1, splits=6)
```
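The plan above corresponds to a query of roughly this shape, reconstructed from the `Project`, `Range`, and noop-write nodes (an assumption, not the exact test code):

```scala
import spark.implicits._

// Range (0, 100000, step=1, splits=6) on both sides, keyed by id % 2
val df1 = spark.range(0, 100000, 1, 6).selectExpr("id % 2 as key1")
val df2 = spark.range(0, 100000, 1, 6).selectExpr("id % 2 + 1 as key2")

// Inner sort-merge join on the skewed keys, written to the noop data source
df1.join(df2, $"key1" === $"key2")
  .write.format("noop").mode("overwrite").save()
```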

Closes apache#27493 from cloud-fan/aqe.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: herman <[email protected]>
(cherry picked from commit a4ceea6)
Signed-off-by: herman <[email protected]>