[BUG] Memory access error in IVFPQ unit test #3318

Closed
wphicks opened this issue Dec 17, 2020 · 7 comments
Labels: bug (Something isn't working), tests (Unit testing for project)

Comments

@wphicks
Contributor

wphicks commented Dec 17, 2020

Describe the bug
As shown in the CI log for this apparently unrelated PR, one of the IVFPQ unit tests fails with a memory access error in FAISS code. I have not yet been able to reproduce this independently, but I'm opening this issue to document recurrences or anything else we find related to this problem.

@wphicks wphicks added bug Something isn't working ? - Needs Triage Need team to review and classify labels Dec 17, 2020
@hcho3 hcho3 added tests Unit testing for project and removed ? - Needs Triage Need team to review and classify labels Dec 17, 2020
@viclafargue viclafargue self-assigned this Jan 19, 2021
rapids-bot bot pushed a commit that referenced this issue Jan 21, 2021
Answers #3318
This may fix the error observed in CI.

Before the change, the memory manager handler was released first, then the FAISS index.
After the change, the FAISS index is released first, then the memory manager handler is released.

Authors:
  - Victor Lafargue (@viclafargue)

Approvers:
  - John Zedlewski (@JohnZed)

URL: #3391
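
The release-order change described in the commit message above can be illustrated with a minimal sketch. `EventLog`, `MemoryHandler`, and `Index` are hypothetical stand-ins invented for this example; they are not the cuML or FAISS types. The sketch only assumes that the index depends on the handler for its memory, so the handler should outlive it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical event recorder shared by both objects (illustration only).
struct EventLog {
  std::vector<std::string> events;
  bool handler_alive = false;
};

// Hypothetical stand-in for the memory manager handler.
struct MemoryHandler {
  EventLog& log;
  explicit MemoryHandler(EventLog& l) : log(l) { log.handler_alive = true; }
  ~MemoryHandler() {
    log.handler_alive = false;
    log.events.push_back("handler released");
  }
};

// Hypothetical stand-in for the index, assumed to hand memory back to the
// handler when it is destroyed.
struct Index {
  EventLog& log;
  explicit Index(EventLog& l) : log(l) {}
  ~Index() {
    log.events.push_back(log.handler_alive
                             ? "index released (handler alive)"
                             : "index released (handler already gone)");
  }
};

// Order before the change: handler first, then index.
EventLog release_handler_first() {
  EventLog log;
  std::unique_ptr<MemoryHandler> handler(new MemoryHandler(log));
  std::unique_ptr<Index> index(new Index(log));
  handler.reset();
  index.reset();  // the index is torn down without a live handler
  return log;
}

// Order after the change: index first, then handler.
EventLog release_index_first() {
  EventLog log;
  std::unique_ptr<MemoryHandler> handler(new MemoryHandler(log));
  std::unique_ptr<Index> index(new Index(log));
  index.reset();  // the handler is still alive at this point
  handler.reset();
  return log;
}
```

In the first function the index is destroyed after the object it depends on, which is the kind of ordering that can lead to use-after-free errors in real code. The second function mirrors the corrected order.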
@viclafargue
Contributor

Answered by #3391

@wphicks
Contributor Author

wphicks commented Jan 25, 2021

Unfortunately, even with this fix I still saw the memory access error in CI for #3304. I'm going to reopen this for now to facilitate further discussion.

@JohnZed
Contributor

JohnZed commented Jan 28, 2021

  • Observed failures on V100 on CentOS with CUDA 10.2
  • Also on A100 with CUDA 11 on Ubuntu 18.04
    Believed to happen on other configurations as well

@wphicks
Contributor Author

wphicks commented Jan 28, 2021

Also observed on CUDA 11 on #3409

@viclafargue
Contributor

Here is the statement causing the invalid read. While looking through the FAISS issues, I came across this issue mentioning the exact same statement. The problem had already been identified (even though it had not been observed in practice), and a fix is available as of FAISS 1.6.4. Basically, the bounds weren't checked.
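
As a generic illustration of what a missing bounds check looks like, here is a minimal sketch. It is not the FAISS statement referred to above; `checked_read` and its parameters are hypothetical names chosen for this example, and the FAISS fix itself may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lookup (not FAISS code): copies the element at `offset` into
// `out` only when the offset lies inside the list, and reports success.
bool checked_read(const std::vector<int>& list, std::size_t offset, int& out) {
  if (offset >= list.size()) {
    return false;  // out of range: refuse instead of reading past the end
  }
  out = list[offset];
  return true;
}
```

Without the `offset >= list.size()` guard, an offset equal to or larger than the list size would read memory outside the buffer, which is the general class of invalid read a memory checker reports.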

rapids-bot bot pushed a commit that referenced this issue Feb 10, 2021
Following observations in #3318

Authors:
  - Victor Lafargue (@viclafargue)

Approvers:
  - John Zedlewski (@JohnZed)

URL: #3472
@viclafargue
Contributor

A warning was added in #3472. FAISS should be updated to a version that contains a fix for this.

@wphicks
Contributor Author

wphicks commented Feb 11, 2021

For clarity, #3459 tracks the underlying problem

rapids-bot bot pushed a commit that referenced this issue Jul 28, 2021
With the update to FAISS 1.7, the [previously observed issue with IVFPQ](#3318) has disappeared. However, some other issues were recently identified in ANN methods. This PR updates the relevant warnings and pytests accordingly.

Authors:
  - Victor Lafargue (https://github.com/viclafargue)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4101