Bcast sending wrong buffer #784

Closed
ClaudiaComito opened this issue Jun 2, 2021 · 2 comments
Labels: bug (Something isn't working), MPI (Anything related to MPI communication)


ClaudiaComito commented Jun 2, 2021

Description
ht.communication.Bcast seems to send the wrong portion of a buffer when broadcasting a slice of an array that does not start at index 0.

To Reproduce

import heat as ht
import torch

a = ht.arange(13*5, split=0).reshape((13, 5))
active_rank = 1
rank = a.comm.rank
if rank == active_rank:
    # slice of the local tensor that does not start at row 0
    arr = a.larray[1:2]
else:
    # receive buffer with the same shape and dtype
    arr = torch.empty((1,5), dtype=a.larray.dtype)
print("before Bcast on rank ", rank, ", arr = ", arr)
a.comm.Bcast(arr, root=active_rank)
print("after Bcast on rank ", rank, ", arr = ", arr)

mpirun -n 2

before Bcast on rank  1 , arr =  tensor([[40, 41, 42, 43, 44]], dtype=torch.int32)
before Bcast on rank  0 , arr =  tensor([[        1, 423460512,         1,         1,         0]], dtype=torch.int32)
after Bcast on rank  0 , arr =  tensor([[704643072, 721420288, 738197504, 754974720, 771751936]], dtype=torch.int32)
after Bcast on rank  1 , arr =  tensor([[40, 41, 42, 43, 44]], dtype=torch.int32)
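
A note on the numbers rank 0 received: each one equals 42, 43, 44, 45, 46 shifted left by 24 bits. That is what comes out when little-endian int32 data is read starting 5 bytes past the first element of the slice. The sketch below checks this arithmetic with the standard library only. It assumes the values 45 and 46 follow the slice contiguously in rank 1's memory, and the helper name `read_int32_at_byte_offset` is made up for illustration. It is an observation about the printed output, not a confirmed diagnosis of Bcast.

```python
import struct


def read_int32_at_byte_offset(values, byte_offset, count):
    """Pack values as little-endian int32, then reread `count` int32s
    starting `byte_offset` bytes into the packed buffer."""
    raw = struct.pack(f"<{len(values)}i", *values)
    return list(struct.unpack_from(f"<{count}i", raw, byte_offset))


# Assumed memory contents on rank 1 from the start of a.larray[1:2] onward
# (row-major and contiguous; 45 and 46 belong to the next row).
sender_memory = [40, 41, 42, 43, 44, 45, 46]

# What rank 0 printed after Bcast in the report above.
observed = [704643072, 721420288, 738197504, 754974720, 771751936]

# Reread the same bytes, but start 5 bytes too late.
misread = read_int32_at_byte_offset(sender_memory, byte_offset=5, count=5)

print(misread == observed)
print([v >> 24 for v in observed])
```

The first print gives True and the second gives [42, 43, 44, 45, 46]. The slice a.larray[1:2] also starts 5 elements into the local tensor (row 1 of 5 columns), so the matching "5" may be more than a coincidence, but that link is a guess.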

Expected behavior

import heat as ht
import torch

a = ht.arange(13*5, split=0).reshape((13, 5))
active_rank = 1
rank = a.comm.rank
if rank == active_rank:
    # slice of the local tensor that starts at row 0
    arr = a.larray[0:1]
else:
    # receive buffer with the same shape and dtype
    arr = torch.empty((1,5), dtype=a.larray.dtype)
print("before Bcast on rank ", rank, ", arr = ", arr)
a.comm.Bcast(arr, root=active_rank)
print("after Bcast on rank ", rank, ", arr = ", arr)

mpirun -n 2

before Bcast on rank  0 , arr =  tensor([[         0, 1073741824,          0, 1073741824,  465895430]], dtype=torch.int32)
before Bcast on rank  1 , arr =  tensor([[40, 41, 42, 43, 44]], dtype=torch.int32)
after Bcast on rank  0 , arr =  tensor([[40, 41, 42, 43, 44]], dtype=torch.int32)
after Bcast on rank  1 , arr =  tensor([[40, 41, 42, 43, 44]], dtype=torch.int32)

Version Info
Main branch

ClaudiaComito added the bug and MPI labels on Jun 2, 2021
ClaudiaComito commented

Might be similar to, or the same issue as, #769.

ClaudiaComito commented

This has been addressed. Closing.
