Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Virtual sites coordinates folding breaks with periodic boundary conditions #4703

Closed
christophlohrmann opened this issue Apr 3, 2023 · 9 comments · Fixed by #4707
Closed

Comments

@christophlohrmann
Copy link
Contributor

christophlohrmann commented Apr 3, 2023

import espressomd
import espressomd.virtual_sites 
import numpy as np


vs_distance = 2
system = espressomd.System(box_l =3*[100])
system.cell_system.skin = 0.1
system.time_step = 0.01
system.min_global_cut = 1.5 * vs_distance
system.virtual_sites = espressomd.virtual_sites.VirtualSitesRelative()

real_part = system.part.add(pos=[10.,10.,vs_distance/2.],
                            director=[0,0,1])
virt_part = system.part.add(pos=[10.,10.,-vs_distance/2.],
                            virtual=True)
virt_part.vs_auto_relate_to(real_part)
np.testing.assert_allclose(np.linalg.norm(real_part.pos-virt_part.pos), vs_distance)

system.integrator.run(0)
np.testing.assert_allclose(np.linalg.norm(real_part.pos-virt_part.pos), vs_distance)

real_part.director = [0,0,-1]
system.integrator.run(0)
np.testing.assert_allclose(np.linalg.norm(real_part.pos-virt_part.pos), vs_distance)

This has a similar feel to the virtual sites that teleported through the boxes in Simon K's issue #4604 from a while back.

@jngrad
Copy link
Member

jngrad commented Apr 6, 2023

Not sure I understand the problem statement. With periodic boundary conditions, particle coordinates are folded when negative. The virtual site is initially at position [10, 10, -1], which after one integration with 0 time steps becomes equal to position [10, 10, 99] because box_l=100. Replacing all the np.linalg.norm(real_part.pos-virt_part.pos) by system.distance(real_part, virt_part) fixes the MWE.

What is unclear to me though, is why the image count wildly varies during integration:

import espressomd
import espressomd.virtual_sites 

vs_distance = 2
system = espressomd.System(box_l =3*[100])
system.cell_system.skin = 0.1
system.time_step = 0.01
system.min_global_cut = 1.5 * vs_distance
system.virtual_sites = espressomd.virtual_sites.VirtualSitesRelative()

real_part = system.part.add(pos=[10.,10.,vs_distance/2.], director=[0,0,1])
virt_part = system.part.add(pos=[10.,10.,-vs_distance/2.], virtual=True)
virt_part.vs_auto_relate_to(real_part)

print("pos unfolded:", system.part.all().pos.tolist())
print("pos folded:  ", system.part.all().pos_folded.tolist())
print("image box:   ", system.part.all().image_box.tolist())
print("distance:    ", system.distance(real_part, virt_part))

print("system.integrator.run(0)")
system.integrator.run(0)

print("pos unfolded:", system.part.all().pos.tolist())
print("pos folded:  ", system.part.all().pos_folded.tolist())
print("image box:   ", system.part.all().image_box.tolist())
print("distance:    ", system.distance(real_part, virt_part))

print("real_part.director = [0,0,-1]")
real_part.director = [0,0,-1]
print("system.integrator.run(0)")
system.integrator.run(0)

print("pos unfolded:", system.part.all().pos.tolist())
print("pos folded:  ", system.part.all().pos_folded.tolist())
print("image box:   ", system.part.all().image_box.tolist())
print("distance:    ", system.distance(real_part, virt_part))

Output on the walberla branch and on the 4.2.1 bugfix release candidate:

pos unfolded: [[10.0, 10.0, 1.0], [10.0, 10.0, -1.0]]
pos folded:   [[10.0, 10.0, 1.0], [10.0, 10.0, 99.0]]
image box:    [[0, 0, 0], [0, 0, -1]]
distance:     2.0
system.integrator.run(0)
pos unfolded: [[10.0, 10.0, 1.0], [10.0, 10.0, -301.0]]
pos folded:   [[10.0, 10.0, 1.0], [10.0, 10.0, 99.0]]
image box:    [[0, 0, 0], [0, 0, -3]]
distance:     2.0
real_part.director = [0,0,-1]
system.integrator.run(0)
pos unfolded: [[10.0, 10.0, 1.0], [10.0, 10.0, -397.0]]
pos folded:   [[10.0, 10.0, 1.0], [10.0, 10.0, 3.0]]
image box:    [[0, 0, 0], [0, 0, -4]]
distance:     2.0

This looks suspicious to me.

Output on the python branch:

pos unfolded: [[10.0, 10.0, 1.0], [10.0, 10.0, -1.0]]
pos folded:   [[10.0, 10.0, 1.0], [10.0, 10.0, 99.0]]
image box:    [[0, 0, 0], [0, 0, -1]]
distance:     2.0
system.integrator.run(0)
pos unfolded: [[10.0, 10.0, 1.0], [10.0, 10.0, 99.0]]
pos folded:   [[10.0, 10.0, 1.0], [10.0, 10.0, 99.0]]
image box:    [[0, 0, 0], [0, 0, 0]]
distance:     2.0
real_part.director = [0,0,-1]
system.integrator.run(0)
pos unfolded: [[10.0, 10.0, 1.0], [10.0, 10.0, 3.0]]
pos folded:   [[10.0, 10.0, 1.0], [10.0, 10.0, 3.0]]
image box:    [[0, 0, 0], [0, 0, 0]]
distance:     2.0

This looks much better.

@RudolfWeeber @pkreissl do you have an opinion on this?

@christophlohrmann
Copy link
Contributor Author

regarding the -1 to 99 change: I know that the folded position comes out correctly, but I would like the unfolded position to remain -1. Are positions folded into the original box when I add particles to the system?
It seems like the the connection to the real particle was done correctly because the particle moves to the correct (folded) position when flipped. That means that simulations probably produce correct physics as long as one does not need the actual, unfolded trajectory of the virtual site.

@jngrad
Copy link
Member

jngrad commented Apr 11, 2023

The regression was introduced in 4.2.1, python and walberla by #4564. The VirtualSitesRelative::update() method no longer updates the image_box, which monotonically increases or decreases with the number of integration steps, and the unfolded coordinates grow rapidly. We need to figure out what should happen when Lees-Edwards boundary conditions are used. When updating the image box counter on one direction, the position in another direction may have to change by a time-dependent offset, in which case the image box of that direction must be updated too.

@jngrad
Copy link
Member

jngrad commented Apr 11, 2023

I wrote a unit test in python to check corner cases in aec1756, but I've barely scratched the surface. It doesn't consider e.g. the case where the LE shear offset makes either one of the particles, or both particles, go through an image box in the shear direction after application of the offset (we would need to write 8 cases), or the case where particles are initially at [0, 0, 0] instead of [1e-7, 0, 0], i.e. to cover the case of the LE offset being automatically applied at particle creation (due to inconsistent folding rules with the regular particle folding mechanism). Fully testing all corner cases would require writing at least 32 cases per direction, with 20 lines per case. To check the folding code after the real particle director gets flipped, one would need to multiply that number by 4.

I don't think it is feasible. The current unit test only has 8 cases for LEbc, but it is already quite hard to read. Also, when running the unit test with multiple MPI ranks, the image box values are different, and I'm not sure why.

In addition to that, the behavior requested in the OP would require not storing the minimal distance between the real particle and virtual site, but rather storing the distance in unfolded coordinates. This leads to new issues:

  • if the box size changes (NpT or user action), the distance gets invalidated, so no more box size change as long as there is at least 1 vs_rel in the system
  • if the LE offset changes (time-dependent shear or user action), the distance gets invalidated too
  • it is unclear, what the initial distance is supposed to be when LEbc are active, because particle coordinates folding happens in 2 different places when integrating with 0 steps
  • if the unfolded distance is large, the numerical imprecision of the quaternion product code will be magnified

Yet, the current code cannot stay, because the image box counter keeps increasing at every time step.

@christophlohrmann
Copy link
Contributor Author

the way I read this is that LEbc and virtual sites do not play nice together. Would it be too bad if the two features were mutually exclusive?

@jngrad
Copy link
Member

jngrad commented Apr 11, 2023

@bindgens1 @RudolfWeeber do you have any plans for virtual sites + LEbc?

@jngrad
Copy link
Member

jngrad commented Apr 12, 2023

@christophlohrmann resp. @bindgens1 could you please confirm the bugfix in branches jngrad/vs_rel_folding_421 and jngrad/vs_rel_folding_walberla fixed the virtual sites relative folding bug in 4.2.1 and walberla, and doesn't break bacteria resp. Lees-Edwards simulations? We need your feedback quickly, so that we can move forward with the release.

@christophlohrmann
Copy link
Contributor Author

christophlohrmann commented Apr 12, 2023

@jngrad your commit fixes the flip bug and passes all my bacteria-tests
EDIT: on one core, with multiple mpi-ranks it still fails

@jngrad
Copy link
Member

jngrad commented Apr 12, 2023

EDIT: on one core, with multiple mpi-ranks it still fails

The ghost communication code seems to be broken. The image_box is not carried over MPI domains, hence we silently end up with a default-constructed image_box = [0, 0, 0] in a ghost particle when the corresponding real particle goes through a boundary, and then the virtual sites uses the default-constructed value instead of the correct value.

I've updated both branches with a bugfix.

@jngrad jngrad added this to the Espresso 4.2.1 milestone Apr 13, 2023
@jngrad jngrad changed the title walberla-branch: virtual sites across periodic boundaries are broken Virtual sites coordinates folding breaks with periodic boundary conditions Apr 13, 2023
@kodiakhq kodiakhq bot closed this as completed in #4707 Apr 14, 2023
kodiakhq bot added a commit that referenced this issue Apr 14, 2023
Fixes #4703

Description of changes:
- bugfix: virtual sites relative are now properly folded again (the regression was introduced by #4564)
- bugfix: uninitialized virtual sites now throw a runtime error instead of implicitly tracking the particle with pid=0
- write more thorough tests for virtual sites relative: integration through periodic boundaries, checkpointing
jngrad pushed a commit to jngrad/espresso that referenced this issue Apr 14, 2023
Fixes espressomd#4703

Description of changes:
- bugfix: virtual sites relative are now properly folded again (the regression was introduced by espressomd#4564)
- bugfix: uninitialized virtual sites now throw a runtime error instead of implicitly tracking the particle with pid=0
- write more thorough tests for virtual sites relative: integration through periodic boundaries, checkpointing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants