Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-12031][Core][BUG]: Integer overflow when do sampling #10023

Closed
wants to merge 4 commits into from

Conversation

uncleGen
Copy link
Contributor

No description provided.

@uncleGen
Copy link
Contributor Author

Jenkins, test this please

val rand = new XORShiftRandom(seed)
while (input.hasNext) {
val item = input.next()
val replacementIndex = rand.nextInt(i)
val replacementIndex = l < Int.MaxValue match {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the change is fine except that the case true business strikes me as pointlessly verbose Scala. if else is much clearer.

@SparkQA
Copy link

SparkQA commented Nov 28, 2015

Test build #46836 has finished for PR 10023 at commit 986fa5c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@uncleGen
Copy link
Contributor Author

[info] - failing to fetch classes from HTTP server should not leak resources (SPARK-6209) *** FAILED *** (1 second, 392 milliseconds)
[info] java.lang.ClassNotFoundException: ReplFakeClass2
[info] at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:84)
[info] at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
[info] at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
[info] at org.apache.spark.repl.ExecutorClassLoaderSuite$$anonfun$7.apply$mcV$sp(ExecutorClassLoaderSuite.scala:148)
[info] at org.apache.spark.repl.ExecutorClassLoaderSuite$$anonfun$7.apply(ExecutorClassLoaderSuite.scala:131)
[info] at org.apache.spark.repl.ExecutorClassLoaderSuite$$anonfun$7.apply(ExecutorClassLoaderSuite.scala:131)
[info] at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
[info] at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
[info] at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info] at org.scalatest.Transformer.apply(Transformer.scala:22)
[info] at org.scalatest.Transformer.apply(Transformer.scala:20)
[info] at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
[info] at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:42)
....

@SparkQA
Copy link

SparkQA commented Nov 28, 2015

Test build #46837 has finished for PR 10023 at commit 986fa5c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val replacementIndex = if (l < Int.MaxValue) {
rand.nextInt(l.toInt)
} else {
rand.nextLong()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't valid, because it chooses a value from (negative) Long.MIN_VALUE to Long.MAX_VALUE. You need to choose a number in [0, l).

Rather than even special case it, just use

val replacementIndex = (random.nextDouble() * l).toLong

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@SparkQA
Copy link

SparkQA commented Nov 28, 2015

Test build #46838 has finished for PR 10023 at commit bb2ed41.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 30, 2015

Test build #46866 has finished for PR 10023 at commit 3cafda5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Nov 30, 2015

LGTM. CC @rxin for a look if possible

val rand = new XORShiftRandom(seed)
while (input.hasNext) {
val item = input.next()
val replacementIndex = rand.nextInt(i)
val replacementIndex = (rand.nextDouble() * l).toLong
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @mengxr

@srowen
Copy link
Member

srowen commented Dec 1, 2015

@mengxr I'm going to merge tomorrow to get this fix in for 1.6

@mengxr
Copy link
Contributor

mengxr commented Dec 1, 2015

@uncleGen How many items did you have to trigger this error? We assume that there are less than Int.MaxValue items in a single partition.

@uncleGen
Copy link
Contributor Author

uncleGen commented Dec 2, 2015

@mengxr In my case, I do 10TB data sort in 96 partitions. When do range partition, "java.lang.IllegalArgumentException: n must be positive" exception was thrown. At that time, I saw the partition contained 10.3GB items, and the number of items exceeded the "Int.MaxValue" limit

@uncleGen
Copy link
Contributor Author

uncleGen commented Dec 2, 2015

@mengxr sorry, was 48 executors with 2 cores, and 1024 partitions.

@uncleGen
Copy link
Contributor Author

uncleGen commented Dec 4, 2015

@srowen @mengxr any problems?

@srowen
Copy link
Member

srowen commented Dec 4, 2015

@mengxr is that really an assumption that the code makes? the underlying partitions in, say, HDFS can certainly have more and @uncleGen 's reproduction seems legitimate. We could make a better error and forbid it though it seems maybe OK to support it anyway in this method?

@srowen
Copy link
Member

srowen commented Dec 7, 2015

@mengxr I'd like to merge this on the grounds that it can't hurt at the worst, and solves a problem that either is real now or could come up later.

@SparkQA
Copy link

SparkQA commented Dec 8, 2015

Test build #2181 has finished for PR 10023 at commit 3cafda5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Dec 9, 2015
Author: uncleGen <[email protected]>

Closes #10023 from uncleGen/1.6-bugfix.

(cherry picked from commit a113216)
Signed-off-by: Sean Owen <[email protected]>
@srowen
Copy link
Member

srowen commented Dec 9, 2015

Merged to master/1.6

@asfgit asfgit closed this in a113216 Dec 9, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants