-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-3578] Fix upper bound in GraphGenerators.sampleLogNormal #2439
Conversation
QA tests have started for PR 2439 at commit
|
QA tests have finished for PR 2439 at commit
|
@jegonzal you should take a look :) |
@ankurdave I'd be a bit concerned about how that affects the correctness of the algorithm. Especially since this will round every value down when maybe you only want to round the edge case down. What do you think? Also, it might be good to add this to the unit tests -- call the function a 1000 times or something and verify that it never produces a value outside the expected range. |
This adds about 150 ms to the test duration.
QA tests have started for PR 2439 at commit
|
QA tests have finished for PR 2439 at commit
|
I looke at the Pregel paper but it doesn't specify and doesn't cite other After some thought, I think your solution is correct. With round(), integer I vote for the change. :) On Saturday, September 20, 2014, Ankur Dave [email protected]
em [email protected] |
This reverts commit 5900c22.
QA tests have started for PR 2439 at commit
|
QA tests have finished for PR 2439 at commit
|
This looks good to me. |
GraphGenerators.sampleLogNormal is supposed to return an integer strictly less than maxVal. However, it violates this guarantee. It generates its return value as follows:
When X is sampled to be close to (but less than) maxVal, then it will pass the while loop condition, but the rounded result will be equal to maxVal, which will violate the guarantee. For example, if maxVal is 5 and X is 4.9, then X < maxVal, but
math.round(X.toFloat)
is 5.This PR instead rounds X before checking the loop condition, guaranteeing that the condition will hold for the return value.