Array out-of-bound access bug #5492

Closed
zhangzhang10 opened this issue Apr 6, 2020 · 1 comment

@zhangzhang10
Contributor

In file src/common/quantile.h, at lines 210-211:

CHECK(i != src.size - 1);
if (dx2 < src.data[i].RMinNext() + src.data[i + 1].RMaxPrev()) { ... ... }

The CHECK logs an error message if i == src.size - 1, then execution continues to the next line, where src.data[i + 1] is accessed. This appears to be an out-of-bounds array access. With a large dataset, e.g. a 64GB mortgage dataset, in distributed training on Spark, we see task failures that can be attributed to this bug.

The two lines of code above are in the function WQSummary::SetPrune(). They have been around for years, but the problem surfaced only recently, after this PR was merged. One change in that PR was the switch from WXQSketch to WQSketch, so WQSummary::SetPrune() replaced WXQSummary::SetPrune() in the execution path. WXQSummary::SetPrune() had a similar check, but when the check fails it breaks out of the enclosing for-loop instead of continuing; see lines 425-426 in file quantile.h:

if (i == end) break;
if (dx2 < src.data[i].RMinNext() + src.data[i + 1].RMaxPrev()) { ... ... }

I believe we should do the same (break out of the for-loop) in WQSummary::SetPrune(). Thanks.
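
To make the proposed guard concrete, here is a minimal, self-contained sketch of the pattern. It is not the actual SetPrune() code: the Entry struct is a simplified stand-in, PickIndices and its parameters are hypothetical names made up for illustration, and the RMinNext()/RMaxPrev() definitions are assumed. The only point is that the loop exits before src[i + 1] is read once i reaches the last valid index.

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for a summary entry (assumption, not the real type).
struct Entry {
  double rmin, rmax, wmin;
  double RMinNext() const { return rmin + wmin; }  // assumed definition
  double RMaxPrev() const { return rmax - wmin; }  // assumed definition
};

// Hypothetical helper: for each query value dx2, pick index i or i + 1.
// Stops as soon as i is the last index, so src[i + 1] is never read
// out of bounds.
std::vector<std::size_t> PickIndices(const std::vector<Entry>& src,
                                     const std::vector<double>& dx2s) {
  std::vector<std::size_t> picked;
  if (src.size() < 2) return picked;  // no (i, i + 1) pair to compare
  const std::size_t last = src.size() - 1;
  std::size_t i = 0;
  for (double dx2 : dx2s) {
    // Advance i while the next entry is still at or below the query.
    while (i < last && dx2 >= src[i + 1].rmax + src[i + 1].rmin) ++i;
    // The guard: break out of the loop instead of logging and continuing.
    if (i == last) break;
    if (dx2 < src[i].RMinNext() + src[i + 1].RMaxPrev()) {
      picked.push_back(i);
    } else {
      picked.push_back(i + 1);
    }
  }
  return picked;
}
```

With three entries and the queries 1.0, 100.0, 1.0, the second query moves i to the last index, so the loop ends there and only the first query produces a pick.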

@RAMitchell
Member

Would you like to submit a PR?
