AtomGroup doesn't rebind with each timestep after dask update #3931
Comments
@yuxuanzhuang have you encountered this problem? |
I don't have an answer, but we emphatically do not want to execute a selection such as `protein = u.select_atoms('protein')  # This works` in an inner loop. That's terrible for performance. |
I am not able to pinpoint the culprit from dozens of dask changes (it boils down to how dask serializes stuff e.g. if I set

    @dask.delayed
    def analyze_block(blockslice, universe, func, *args, **kwargs):
        result = []
        for ts in universe.trajectory[blockslice.start:blockslice.stop]:
            A = func(*args, **kwargs)
            result.append(A)
        return result

    jobs = []
    for bs in blocks:
        jobs.append(analyze_block(bs,
                                  u,
                                  radgyr,
                                  protein,
                                  protein.masses,
                                  total_mass=np.sum(protein.masses)))
    jobs = dask.delayed(jobs)

IMHO it's probably even clearer than before by explicitly setting the |
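One plausible reading of the serialization point above can be sketched with toy classes, without MDAnalysis or dask. If a group and its universe are copied in the same pass, the copied group still points at the copied universe. If each is copied on its own, the group ends up attached to a private universe that never advances. `ToyUniverse`, `ToyAtomGroup` and `frames_seen` are hypothetical stand-ins, and `copy.deepcopy` stands in for a serialize/deserialize round trip; whether dask actually behaves this way in the affected versions is an assumption, not something confirmed in this thread.

```python
import copy


class ToyUniverse:
    """Hypothetical stand-in for a Universe: only tracks the current frame."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self.frame = 0

    def iter_frames(self):
        # Advance the frame index in place, like iterating a trajectory.
        for i in range(self.n_frames):
            self.frame = i
            yield i


class ToyAtomGroup:
    """Hypothetical stand-in for an AtomGroup: reads state via its universe."""

    def __init__(self, universe):
        self.universe = universe

    def current_frame(self):
        return self.universe.frame


def frames_seen(universe, group):
    """Iterate `universe` and record which frame `group` reports each step."""
    return [group.current_frame() for _ in universe.iter_frames()]


u = ToyUniverse(n_frames=3)
protein = ToyAtomGroup(u)

# Copied together: the shared reference survives, so the group follows along.
u_joint, protein_joint = copy.deepcopy((u, protein))
seen_joint = frames_seen(u_joint, protein_joint)

# Copied separately: the group carries its own private universe copy,
# which stays on frame 0 while the other copy is iterated.
u_split = copy.deepcopy(u)
protein_split = copy.deepcopy(protein)
seen_split = frames_seen(u_split, protein_split)

print(seen_joint)
print(seen_split)
```

In this toy model the second case reproduces the symptom reported in the issue: every value comes from frame 0 even though the iterated universe does advance.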
This means we need to change docs, right? |
I am going to remove the defect label because it seems a downstream dask issue (maybe they don't pull in objects from the context anymore?). It looks as if this (admittedly annoying and code-breaking) problem can be "fixed" by documentation changes. Is there anything else we ought to be doing? |
Including the |
Updating the userguide and testing on dask |
Confirming the issue has been resolved in versions from v2023.1.1 onward. The versions affected by the problem range from v2021.4.1 to v2023.1.0. |
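As a rough illustration of the range reported above, a small helper could flag whether an installed dask version falls inside it. The names `parse_calver` and `dask_version_is_affected` are hypothetical, the bounds are simply the ones quoted in this thread and are treated as inclusive, and the parser only looks at the leading `year.month.patch` part of the string.

```python
import re

# Range reported in this thread; treated here as inclusive on both ends.
FIRST_AFFECTED = (2021, 4, 1)
LAST_AFFECTED = (2023, 1, 0)


def parse_calver(version):
    """Return (year, month, patch) from a string such as '2022.6.0'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def dask_version_is_affected(version):
    """True if `version` lies within the range reported as affected."""
    return FIRST_AFFECTED <= parse_calver(version) <= LAST_AFFECTED


for candidate in ("2021.4.0", "2021.4.1", "2022.6.0", "2023.1.0", "2023.1.1"):
    print(candidate, dask_version_is_affected(candidate))
```

In a real session the argument would typically be `dask.__version__`.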
I suspect the fix might be related to the changes seen here: dask/dask@2023.1.0...2023.1.1#diff-b2b064ba4d14c2c4d3c14bb57d0cda3849f084625015bb61f2ec1f7ffe8195ccR1166 (though I'm not entirely certain). Perhaps we can implement a solution downstream to prevent this issue in the future. |
I can't reproduce this section of the user guide currently, apparently due to some breaking change in Dask.
I initially couldn't point to the exact Dask version in which this broke (it works on 2021.1.0 but is broken from 2022.6.0 onwards). After narrowing it down: it works up to 2021.4.0, and the breaking version is 2021.4.1. Here's the changelog.
Before
Now
That is, the radii of gyration are all the same. They all come from the timestep 0.
Code
Given this:
Where protein is an AtomGroup:

    protein = u.select_atoms('protein')

While the timestep does update, the AtomGroup stays bound to the first timestep.
A way to work around this is to update the protein selection and call the radgyr() function with this new selection.
MDA: 2.4.0-dev0
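The workaround described above re-creates the selection so that it is bound to the universe actually being iterated. A toy sketch of one possible compromise, which is not something proposed in this thread, is to select once per block rather than once per frame, so the cost of the selection stays out of the inner loop. All names here (`ToyUniverse`, `ToyAtomGroup`, `frame_dependent_value`, and this `analyze_block` signature) are hypothetical stand-ins; `copy.deepcopy` imitates each task receiving its own copy of the universe, and `frame_dependent_value` stands in for a per-frame quantity such as the radius of gyration.

```python
import copy


class ToyUniverse:
    """Hypothetical stand-in for a Universe with a countable selection call."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self.frame = 0
        self.n_selections = 0

    def select_atoms(self, selection):
        # Count calls so the cost of re-selecting is visible.
        self.n_selections += 1
        return ToyAtomGroup(self, selection)

    def iter_frames(self, start, stop):
        # Advance the frame index in place over the requested block.
        for i in range(start, min(stop, self.n_frames)):
            self.frame = i
            yield i


class ToyAtomGroup:
    """Hypothetical stand-in for an AtomGroup bound to one universe."""

    def __init__(self, universe, selection):
        self.universe = universe
        self.selection = selection

    def frame_dependent_value(self):
        # Stand-in for a per-frame quantity; here it is just the frame index.
        return float(self.universe.frame)


def analyze_block(start, stop, universe, selection, func):
    """Select once per block, then evaluate `func` on every frame."""
    group = universe.select_atoms(selection)  # bound to *this* universe
    return [func(group) for _ in universe.iter_frames(start, stop)]


u = ToyUniverse(n_frames=6)
blocks = [(0, 3), (3, 6)]

workers = []
results = []
for start, stop in blocks:
    worker_universe = copy.deepcopy(u)  # each task works on its own copy
    workers.append(worker_universe)
    results.append(
        analyze_block(start, stop, worker_universe, "protein",
                      ToyAtomGroup.frame_dependent_value)
    )

print(results)
print([w.n_selections for w in workers])
```

Because the group is created from the universe handed to the task, it follows that universe frame by frame, and each block pays for only one selection.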