
SKIPME merging Apache branch-1.4 bug fixes #65

Merged — 6 commits, Jul 17, 2015

Commits on Jul 13, 2015

  1. [SPARK-8743] [STREAMING] Deregister Codahale metrics for streaming when StreamingContext is closed

    Issue link: https://issues.apache.org/jira/browse/SPARK-8743
    
    Design:
    Add the calls in the appropriate start() and stop() methods of the StreamingContext.

    Actions in the pull request:
    1) Added the registerSource call to StreamingContext.start().
    2) Added the removeSource call to StreamingContext.stop().
    3) Added comments for both 1 and 2, and a comment showing initialization of the StreamingSource.
    4) Added a test case that checks both registration and de-registration of the metrics.
    
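    The register-on-start / deregister-on-stop pairing the PR describes can be sketched outside of Spark. This is a minimal illustration of the pattern only; the class and method names below are hypothetical stand-ins, not Spark's actual StreamingContext or MetricsSystem internals:

    ```python
    # Sketch of registering a metrics source on start() and removing it on
    # stop(), so closed contexts don't leak registered metrics.
    class MetricsSystem:
        """Stand-in for a metrics registry (illustrative, not Spark's API)."""

        def __init__(self):
            self.sources = set()

        def register_source(self, name):
            self.sources.add(name)

        def remove_source(self, name):
            self.sources.discard(name)


    class StreamingContextSketch:
        SOURCE_NAME = "streaming"

        def __init__(self, metrics):
            self.metrics = metrics

        def start(self):
            # Register the streaming metrics source when the context starts.
            self.metrics.register_source(self.SOURCE_NAME)

        def stop(self):
            # Deregister it when the context is closed.
            self.metrics.remove_source(self.SOURCE_NAME)
    ```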
    Previous closed PR for reference: apache#7250
    
    Author: Neelesh Srinivas Salian <[email protected]>
    
    Closes apache#7362 from nssalian/branch-SPARK-8743 and squashes the following commits:
    
    7d998a3 [Neelesh Srinivas Salian] Removed the Thread.sleep() call
    8b26397 [Neelesh Srinivas Salian] Moved the scalatest.{} import
    0e8007a [Neelesh Srinivas Salian] moved import org.apache.spark{} to correct place
    daedaa5 [Neelesh Srinivas Salian] Corrected Ordering of imports
    8873180 [Neelesh Srinivas Salian] Removed redundancy in imports
    59227a4 [Neelesh Srinivas Salian] Changed the ordering of the imports to classify  scala and spark imports
    d8cb577 [Neelesh Srinivas Salian] Added registerSource to start() and removeSource to stop(). Wrote a test to check the registration and de-registration
    
    (cherry picked from commit b7bcbe2)
    Signed-off-by: Tathagata Das <[email protected]>
    Neelesh Srinivas Salian authored and tdas committed Jul 13, 2015
    Commit 50607ec

Commits on Jul 14, 2015

  1. [SPARK-9010] [DOCUMENTATION] Improve the Spark Configuration document about `spark.kryoserializer.buffer`

    The description of spark.kryoserializer.buffer should read: "Initial size of Kryo's serialization buffer. Note that there will be one buffer per core on each worker. This buffer will grow up to spark.kryoserializer.buffer.max if needed."

    The `spark.kryoserializer.buffer.max.mb` name is out of date in Spark 1.4.
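    The corrected property pair can be set in `spark-defaults.conf`; the values below are illustrative, not a recommendation:

    ```
    # spark-defaults.conf (illustrative values)
    spark.kryoserializer.buffer      64k   # initial size of Kryo's buffer; one buffer per core
    spark.kryoserializer.buffer.max  64m   # the buffer may grow up to this limit
    ```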
    
    Author: zhaishidan <[email protected]>
    
    Closes apache#7393 from stanzhai/master and squashes the following commits:
    
    69729ef [zhaishidan] fix document error about spark.kryoserializer.buffer.max.mb
    
    (cherry picked from commit c1feebd)
    Signed-off-by: Sean Owen <[email protected]>
    stanzhai authored and srowen committed Jul 14, 2015
    Commit dce68ad

Commits on Jul 15, 2015

  1. [SPARK-9012] [WEBUI] Escape Accumulators in the task table

    If you run the following code, the task table breaks because accumulator names aren't escaped.
    ```
    val a = sc.accumulator(1, "<table>")
    sc.parallelize(1 to 10).foreach(i => a += i)
    ```
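    The fix is to escape the user-supplied accumulator name before embedding it in the task table's HTML. A minimal sketch of the escaping step (illustrative only; Spark's actual change is in the web UI rendering code, not a standalone helper like this):

    ```python
    # Escape characters that would otherwise be interpreted as HTML markup,
    # so a name like "<table>" renders as text instead of breaking the page.
    def escape_html(text):
        replacements = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;"}
        return "".join(replacements.get(ch, ch) for ch in text)
    ```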
    
    Before this fix,
    
    <img width="1348" alt="screen shot 2015-07-13 at 8 02 44 pm" src="https://cloud.githubusercontent.com/assets/1000778/8649295/b17c491e-299b-11e5-97ee-4e6a64074c4f.png">
    
    After this fix,
    
    <img width="1355" alt="screen shot 2015-07-13 at 8 14 32 pm" src="https://cloud.githubusercontent.com/assets/1000778/8649337/f9e9c9ec-299b-11e5-927e-35c0a2f897f5.png">
    
    Author: zsxwing <[email protected]>
    
    Closes apache#7369 from zsxwing/SPARK-9012 and squashes the following commits:
    
    a83c9b6 [zsxwing] Escape Accumulators in the task table
    
    (cherry picked from commit adb33d3)
    Signed-off-by: Kousuke Saruta <[email protected]>
    zsxwing authored and sarutak committed Jul 15, 2015
    Commit 1093992
  2. [SPARK-7555] [DOCS] Add doc for elastic net in ml-guide and mllib-guide

    jkbradley I put the elastic net under the **Algorithm guide** section. I also added the elastic net formula in `mllib-linear-methods#regularizers`.
    
    dbtsai I left the code tab for you to add example code. Do you think it is the right place?
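    For reference, the elastic net regularizer documented in the guide is a convex combination of the L1 and L2 penalties; in Spark's parameterization, α interpolates between ridge regression (α = 0) and the lasso (α = 1):

    ```latex
    % Elastic net regularizer: convex combination of L1 and L2 penalties.
    R(\mathbf{w}) = \alpha \, \lambda \|\mathbf{w}\|_1
                  + (1 - \alpha) \, \frac{\lambda}{2} \|\mathbf{w}\|_2^2,
    \qquad \alpha \in [0, 1]
    ```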
    
    Author: Shuo Xiang <[email protected]>
    
    Closes apache#6504 from coderxiang/elasticnet and squashes the following commits:
    
    f6061ee [Shuo Xiang] typo
    90a7c88 [Shuo Xiang] Merge remote-tracking branch 'upstream/master' into elasticnet
    0610a36 [Shuo Xiang] move out the elastic net to ml-linear-methods
    8747190 [Shuo Xiang] merge master
    706d3f7 [Shuo Xiang] add python code
    9bc2b4c [Shuo Xiang] typo
    db32a60 [Shuo Xiang] java code sample
    aab3b3a [Shuo Xiang] Merge remote-tracking branch 'upstream/master' into elasticnet
    a0dae07 [Shuo Xiang] simplify code
    d8616fd [Shuo Xiang] Update the definition of elastic net. Add scala code; Mention Lasso and Ridge
    df5bd14 [Shuo Xiang] use wikipeida page in ml-linear-methods.md
    78d9366 [Shuo Xiang] address comments
    8ce37c2 [Shuo Xiang] Merge branch 'elasticnet' of github.com:coderxiang/spark into elasticnet
    8f24848 [Shuo Xiang] Merge branch 'elastic-net-doc' of github.com:coderxiang/spark into elastic-net-doc
    998d766 [Shuo Xiang] Merge branch 'elastic-net-doc' of github.com:coderxiang/spark into elastic-net-doc
    89f10e4 [Shuo Xiang] Merge remote-tracking branch 'upstream/master' into elastic-net-doc
    9262a72 [Shuo Xiang] update
    7e07d12 [Shuo Xiang] update
    b32f21a [Shuo Xiang] add doc for elastic net in sparkml
    937eef1 [Shuo Xiang] Merge remote-tracking branch 'upstream/master' into elastic-net-doc
    180b496 [Shuo Xiang] Merge remote-tracking branch 'upstream/master'
    aa0717d [Shuo Xiang] Merge remote-tracking branch 'upstream/master'
    5f109b4 [Shuo Xiang] Merge remote-tracking branch 'upstream/master'
    c5c5bfe [Shuo Xiang] Merge remote-tracking branch 'upstream/master'
    98804c9 [Shuo Xiang] fix bug in topBykey and update test
    
    (cherry picked from commit 303c120)
    Signed-off-by: Joseph K. Bradley <[email protected]>
    coderxiang authored and jkbradley committed Jul 15, 2015
    Commit 5b5693d
  3. [SPARK-8974] Catch exceptions in allocation schedule task.

    I ran into a problem: when I submit some tasks, the spark-dynamic-executor-allocation thread should send the "requestTotalExecutors" message and new executors should start. Instead, the thread fails like this:
    
    2015-07-14 19:02:17,461 | WARN  | [spark-dynamic-executor-allocation] | Error sending message [message = RequestExecutors(1)] in 1 attempts
    java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
            at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
            at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
            at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
            at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
            at scala.concurrent.Await$.result(package.scala:107)
            at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
            at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
            at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.doRequestTotalExecutors(YarnSchedulerBackend.scala:57)
            at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:351)
            at org.apache.spark.SparkContext.requestTotalExecutors(SparkContext.scala:1382)
            at org.apache.spark.ExecutorAllocationManager.addExecutors(ExecutorAllocationManager.scala:343)
            at org.apache.spark.ExecutorAllocationManager.updateAndSyncNumExecutorsTarget(ExecutorAllocationManager.scala:295)
            at org.apache.spark.ExecutorAllocationManager.org$apache$spark$ExecutorAllocationManager$$schedule(ExecutorAllocationManager.scala:248)
    
    After a few minutes a new ApplicationMaster starts, the submitted tasks run, and they complete. But even after a long time (e.g. ten minutes), the number of executors does not shrink back to zero, although I use the default value of "spark.dynamicAllocation.minExecutors".
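    The fix catches exceptions inside the periodically scheduled allocation task so that one failure (such as the RPC timeout above) does not kill the scheduling thread. A sketch of that defensive wrapper; `run_allocation_task` and `log_warning` are illustrative names, not Spark's API:

    ```python
    # Wrap a periodic task so an exception is logged as a warning and
    # swallowed, keeping the scheduler thread alive for the next run.
    def run_allocation_task(task, log_warning):
        try:
            task()
            return True
        except Exception as exc:
            log_warning(f"Uncaught exception in allocation task: {exc}")
            return False
    ```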
    
    Author: KaiXinXiaoLei <[email protected]>
    
    Closes apache#7352 from KaiXinXiaoLei/dym and squashes the following commits:
    
    3603631 [KaiXinXiaoLei] change logError to logWarning
    efc4f24 [KaiXinXiaoLei] change file
    
    (cherry picked from commit 674eb2a)
    Signed-off-by: Sean Owen <[email protected]>
    KaiXinXiaoLei authored and srowen committed Jul 15, 2015
    Commit bb14015

Commits on Jul 17, 2015

  1. Commit 0715408