[SPARK-26950][SQL][TEST] Make RandomDataGenerator use Float.NaN or Double.NaN for all NaN values

## What changes were proposed in this pull request?

Apache Spark uses the predefined `Float.NaN` and `Double.NaN` for NaN values, but there exist other NaN values with different binary representations.

```scala
scala> java.nio.ByteBuffer.allocate(4).putFloat(Float.NaN).array
res1: Array[Byte] = Array(127, -64, 0, 0)

scala> val x = java.lang.Float.intBitsToFloat(-6966608)
x: Float = NaN

scala> java.nio.ByteBuffer.allocate(4).putFloat(x).array
res2: Array[Byte] = Array(-1, -107, -78, -80)
```

Since users can have these values, `RandomDataGenerator` generates them as well. However, this causes `checkEvaluationWithUnsafeProjection` failures because the `UnsafeRow` binary representations of two NaN values can differ. The following is an instance of the unit test (UT) failure. This PR aims to fix this UT flakiness.

- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102528/testReport/
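
To make the failure mode concrete, the following sketch (an illustration, not part of the patch) compares two NaN values at the value level and at the byte level. `Float.floatToRawIntBits` preserves the NaN payload bits, which is why a byte-wise comparison of two rows holding "the same" NaN can fail, while `Float.floatToIntBits` collapses every NaN to one canonical pattern. The object name `NaNBits` and the helper `bytesOf` are hypothetical names introduced for this example.

```scala
import java.nio.ByteBuffer
import java.util.Arrays

object NaNBits {
  // Hypothetical helper: the 4 big-endian bytes a float occupies in a buffer.
  def bytesOf(f: Float): Array[Byte] = ByteBuffer.allocate(4).putFloat(f).array

  // The non-canonical NaN used in the example above.
  val oddNaN: Float = java.lang.Float.intBitsToFloat(-6966608)

  // Both values are NaN at the value level ...
  val bothNaN: Boolean = oddNaN.isNaN && Float.NaN.isNaN

  // ... but their raw bit patterns differ, so a byte-wise comparison fails.
  val sameRawBits: Boolean =
    java.lang.Float.floatToRawIntBits(oddNaN) == java.lang.Float.floatToRawIntBits(Float.NaN)
  val sameBytes: Boolean = Arrays.equals(bytesOf(oddNaN), bytesOf(Float.NaN))

  // floatToIntBits maps every NaN to the canonical pattern 0x7fc00000.
  val sameCanonicalBits: Boolean =
    java.lang.Float.floatToIntBits(oddNaN) == java.lang.Float.floatToIntBits(Float.NaN)
}
```

In this sketch `bothNaN` and `sameCanonicalBits` hold, while `sameRawBits` and `sameBytes` do not. A check that compares serialized bytes therefore sees two different values.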

## How was this patch tested?

Passes Jenkins with the newly added test cases.

Closes apache#23851 from dongjoon-hyun/SPARK-26950.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit ffef3d4)
Signed-off-by: Wenchen Fan <[email protected]>
dongjoon-hyun authored and cloud-fan committed Feb 22, 2019
1 parent b403612 commit ef67be3
Showing 2 changed files with 53 additions and 2 deletions.
@@ -17,8 +17,6 @@

 package org.apache.spark.sql

-import java.lang.Double.longBitsToDouble
-import java.lang.Float.intBitsToFloat
 import java.math.MathContext

 import scala.collection.mutable
@@ -69,6 +67,28 @@ object RandomDataGenerator {
     Some(f)
   }

+  /**
+   * A wrapper of Float.intBitsToFloat to use a unique NaN value for all NaN values.
+   * This prevents `checkEvaluationWithUnsafeProjection` from failing due to
+   * the difference between `UnsafeRow` binary presentation for NaN.
+   * This is visible for testing.
+   */
+  def intBitsToFloat(bits: Int): Float = {
+    val value = java.lang.Float.intBitsToFloat(bits)
+    if (value.isNaN) Float.NaN else value
+  }
+
+  /**
+   * A wrapper of Double.longBitsToDouble to use a unique NaN value for all NaN values.
+   * This prevents `checkEvaluationWithUnsafeProjection` from failing due to
+   * the difference between `UnsafeRow` binary presentation for NaN.
+   * This is visible for testing.
+   */
+  def longBitsToDouble(bits: Long): Double = {
+    val value = java.lang.Double.longBitsToDouble(bits)
+    if (value.isNaN) Double.NaN else value
+  }
+
   /**
    * Returns a randomly generated schema, based on the given accepted types.
    *
@@ -17,6 +17,9 @@

 package org.apache.spark.sql

+import java.nio.ByteBuffer
+import java.util.Arrays
+
 import scala.util.Random

 import org.apache.spark.SparkFunSuite
@@ -106,4 +109,32 @@ class RandomDataGeneratorSuite extends SparkFunSuite {
       assert(deviation.toDouble / expectedTotalElements < 2e-1)
     }
   }
+
+  test("Use Float.NaN for all NaN values") {
+    val bits = -6966608
+    val nan1 = java.lang.Float.intBitsToFloat(bits)
+    val nan2 = RandomDataGenerator.intBitsToFloat(bits)
+    assert(nan1.isNaN)
+    assert(nan2.isNaN)
+
+    val arrayExpected = ByteBuffer.allocate(4).putFloat(Float.NaN).array
+    val array1 = ByteBuffer.allocate(4).putFloat(nan1).array
+    val array2 = ByteBuffer.allocate(4).putFloat(nan2).array
+    assert(!Arrays.equals(array1, arrayExpected))
+    assert(Arrays.equals(array2, arrayExpected))
+  }
+
+  test("Use Double.NaN for all NaN values") {
+    val bits = -6966608
+    val nan1 = java.lang.Double.longBitsToDouble(bits)
+    val nan2 = RandomDataGenerator.longBitsToDouble(bits)
+    assert(nan1.isNaN)
+    assert(nan2.isNaN)
+
+    val arrayExpected = ByteBuffer.allocate(8).putDouble(Double.NaN).array
+    val array1 = ByteBuffer.allocate(8).putDouble(nan1).array
+    val array2 = ByteBuffer.allocate(8).putDouble(nan2).array
+    assert(!Arrays.equals(array1, arrayExpected))
+    assert(Arrays.equals(array2, arrayExpected))
+  }
 }
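
As a usage sketch (an illustration, not code from this commit): a generator that draws random bit patterns can route every draw through a normalizing wrapper like the one added in the diff, so that any NaN it produces carries the canonical bits. `NormalizedFloats`, `normalize`, `randomFloat`, and `hasCanonicalBits` are hypothetical names; the body of `normalize` mirrors `RandomDataGenerator.intBitsToFloat` as shown in the diff.

```scala
import scala.util.Random

object NormalizedFloats {
  // Same logic as the wrapper in the diff: map every NaN to Float.NaN.
  def normalize(bits: Int): Float = {
    val value = java.lang.Float.intBitsToFloat(bits)
    if (value.isNaN) Float.NaN else value
  }

  // Hypothetical generator: every Int is a valid float bit pattern,
  // so a random Int yields a random float (including NaN and infinities).
  def randomFloat(rand: Random): Float = normalize(rand.nextInt())

  // True when a float is either not NaN or carries the canonical NaN bits.
  def hasCanonicalBits(f: Float): Boolean =
    !f.isNaN ||
      java.lang.Float.floatToRawIntBits(f) == java.lang.Float.floatToRawIntBits(Float.NaN)
}
```

Non-NaN values pass through unchanged, so the distribution of ordinary floats is unaffected; only the NaN payload is discarded.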
