Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AdjacencyList offset array can overflow #2624

Closed
chrisrichardson opened this issue Apr 19, 2023 · 5 comments · Fixed by #2641
Closed

AdjacencyList offset array can overflow #2624

chrisrichardson opened this issue Apr 19, 2023 · 5 comments · Fixed by #2641
Labels
bug Something isn't working

Comments

@chrisrichardson
Copy link
Contributor

How to reproduce the bug

Running a problem of sufficient size on one process (e.g. Q2 problem on mesh with 60 million dofs) can overwhelm the range of AdjacencyList.

  • When running e.g. on a GPU system, we may have many degrees of freedom on one process.
  • If the number of non-zeros in the local sparsity pattern exceeds 2^31, the AdjacencyList offsets used when building the sparsity pattern will overflow.

Possible solution: use 64-bit indexing for offsets in AdjacencyList or template it.

Minimal Example (Python)

No response

Output (Python)

No response

Version

main branch

DOLFINx git commit

No response

Installation

No response

Additional information

No response

@chrisrichardson chrisrichardson added the bug Something isn't working label Apr 19, 2023
@jorgensd
Copy link
Sponsor Member

Oh dear.
I would say change it to int64, as I think we have introduce too many templates already (and I do not see a large benefit of this being templated).

@chrisrichardson
Copy link
Contributor Author

I'm going to think about other solutions: we only really need it in a few places (sparsity pattern and MatrixCSR) so I don't think we want to make it int64 everywhere...

@jorgensd
Copy link
Sponsor Member

I'm going to think about other solutions: we only really need it in a few places (sparsity pattern and MatrixCSR) so I don't think we want to make it int64 everywhere...

Then I guess we want to template it, but maybe make some wrappers already at the C++ level to avoid everything being moved into header files.

@chrisrichardson
Copy link
Contributor Author

I think I'm going to close this. On reflection, it is only useful when we have >50M dofs, and the associated matrix will occupy around 64Gb of memory, which is close to the limit, at the moment. We should however, have some kind of test to make sure we are not overflowing when creating sparsity patterns or matrices.

@garth-wells
Copy link
Member

This should be fixed at some point. The simplest approach might be (1) avoid AdjacencyList when we don't really need it (see #2632), and then (2) store the offsets in a std::vector<std::size_t>.

@garth-wells garth-wells reopened this Apr 21, 2023
@garth-wells garth-wells changed the title [BUG]: AdjacencyList uses 32-bit indexing, but sometimes we need 64-bit AdjacencyList offset array can overflow Apr 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants