[SPARK-19862] 'tungsten-sort' should be deleted in SparkEnv.scala #17220
…uffle.unsafe.UnsafeShuffleManager'.
Can one of the admins verify this patch? |
If this was an exposed parameter, we cannot remove it - irrespective of the duplication. |
The 'hash' value that existed in Spark 1.4.1 was deleted, so I think this one should be deleted too. Going by the documentation on the Spark website, this logic should not be kept in the code. |
Can you please fix the title to match what other PRs do? |
Ok, I have modified the title. |
Is this change even correct? This is here for backward compatibility. |
Regarding compatibility, I think the resulting shuffle manager is not the one I want. Only the parameter value 'sort' really refers to SortShuffleManager. |
I don't think you understand this. This value is here so that if at some point some user picked tungsten-sort, we won't break it. In recent versions of Spark the default sort manager accomplishes the same thing as the old tungsten sort. |
If anything, we should just update the file to add a line of comment to make sure people don't delete this in the future. |
I think I should delete it and update the documentation at the same time, so as to ensure the uniqueness of the function. |
@guoxiaolongzte, I think that although "tungsten-sort" is now the same as "sort", we still need to keep it for configuration backward compatibility. If a user is still configured with "tungsten-sort", the application will fail with your change. I think that's what @rxin was referring to. Looking at other configurations, we typically keep backward compatibility except across major releases (Spark 1.6 to Spark 2.0). |
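To make the backward-compatibility argument concrete, here is a minimal, self-contained sketch of how a short-name alias table can resolve the configured shuffle manager. It is a simplified illustration, not the actual SparkEnv.scala source: the object name `ShuffleManagerNames` and the method `resolve` are hypothetical, and class names are held as plain strings so the snippet runs without Spark on the classpath.

```scala
import java.util.Locale

// Simplified illustration only; names below are hypothetical, not Spark's API.
object ShuffleManagerNames {
  private val SortShuffleManagerClass =
    "org.apache.spark.shuffle.sort.SortShuffleManager"

  // "tungsten-sort" is kept as an alias of "sort" so that configurations
  // written for older releases keep resolving to a valid class.
  val shortNames: Map[String, String] = Map(
    "sort" -> SortShuffleManagerClass,
    "tungsten-sort" -> SortShuffleManagerClass
  )

  // Short names map to a class name; any other value is passed through
  // and treated as a fully qualified class name by the caller.
  def resolve(configured: String): String =
    shortNames.getOrElse(configured.toLowerCase(Locale.ROOT), configured)
}
```

With the alias present, `ShuffleManagerNames.resolve("tungsten-sort")` yields the same class name as `resolve("sort")`. If the alias were removed, the string `tungsten-sort` would pass through unchanged and be treated as a class name, which is the failure described above.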
Thanks. I understand this. |
Why has HashShuffleManager been deleted? |
Hash-based shuffle has some problems with a large number of partitions, and some of hash-based shuffle's features have already been incorporated into sort-based shuffle. Spark's sort-based shuffle is not purely sort-based like MR; it is actually a mixed pattern that depends on the number of partitions. |
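As a rough illustration of that "mixed pattern": Spark's sort-based shuffle has a bypass path, governed by the `spark.shuffle.sort.bypassMergeThreshold` setting (default 200), that writes one file per partition in a hash-like fashion when the partition count is small and no map-side combine is needed. The sketch below is a simplified stand-alone version of that decision; the object name `ShufflePathSketch` and the flattened parameters are assumptions made for illustration, not Spark's real signature.

```scala
// Simplified illustration only; not Spark's actual implementation.
object ShufflePathSketch {
  // Default value of spark.shuffle.sort.bypassMergeThreshold.
  val DefaultBypassMergeThreshold: Int = 200

  // Returns true when the hash-like bypass path would be chosen:
  // no map-side combine, and few enough partitions.
  def shouldBypassMergeSort(
      numPartitions: Int,
      mapSideCombine: Boolean,
      threshold: Int = DefaultBypassMergeThreshold): Boolean =
    !mapSideCombine && numPartitions <= threshold
}
```

Under these assumptions, a shuffle with 100 partitions and no map-side combine takes the bypass path, while one with 1000 partitions falls back to the sorting path.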
Closes apache#16819 Closes apache#13467 Closes apache#16083 Closes apache#17135 Closes apache#8785 Closes apache#16278 Closes apache#16997 Closes apache#17073 Closes apache#17220
@jerryshao @rxin @srowen |
What's the meaning of "has been deleted in Spark 2.1.0"? I think the reason mentioned above is quite clear. |
Spark 2.0.2, Spark 2.1.0. The above is based on an analysis of the original releases. |
Well, I still see "tungsten-sort" in branch 2.1 and master (https://github.com/apache/spark/blob/branch-2.1/core/src/main/scala/org/apache/spark/SparkEnv.scala#L320). Can you tell me which code you checked? |
Sorry, I looked at the wrong thing; I downloaded the master of branch 2.1. My issue also mentions removing the code, and it should not be Resolution: Won't Fix. |
JIRA Issue: https://github.com/guoxiaolongzte/spark/tree/SPARK-19862
In SparkEnv.scala, remove tungsten-sort, because it does not represent 'org.apache.spark.shuffle.unsafe.UnsafeShuffleManager', so it should be deleted. Only in this way will it not cause ambiguity for users.