Bisection bugs in dipolar P3M #3915

Open
jngrad opened this issue Sep 27, 2020 · 3 comments

@jngrad

jngrad commented Sep 27, 2020

The dipolar P3M algorithm fails for specific particle configurations. The bug depends on the box size and operating system. MWE:

import espressomd
import espressomd.magnetostatics
BOX_L = 1
system = espressomd.System(box_l=3*[BOX_L])
system.time_step = 0.01
system.part.add(pos=[[0, 0, 0], [0.5, 0.5, 0.5]], rotation=2 * [(1, 1, 1)], dip=2 * [(1, 0, 0)])
solver = espressomd.magnetostatics.DipolarP3M(prefactor=1, accuracy=1e-2)
system.actors.add(solver)
  • with box size 1: ./pypresso mwe.py triggers the assertion "Root must be bracketed for bisection in dp3m_rtbisection" in function double dp3m_rtbisection(...), initially reported in Refactor exception mechanism in P3M/DP3M tuning functions #3869 (comment)
    • this assertion was introduced to prevent an infinite loop
  • with box size 10: mpiexec -n 4 ./pypresso mwe.py intermittently triggers an assertion and generates a stack trace (mwe.log); whether this happens depends on the operating system, as reported in ROCm 3.8 docker#189 (comment)
    • with dip = 2 * [(0.1, 0, 0)], it instead triggers the bisection assertion

This issue makes the test_09_no_errors_dp3m_cpu() check in testsuite/python/p3m_tuning_exceptions.py fragile in CI, which prevents us from updating the ROCm image to Ubuntu 20.04.

@RudolfWeeber

RudolfWeeber commented Sep 27, 2020 via email

pkreissl added a commit to pkreissl/espresso that referenced this issue Sep 29, 2020
pstaerk added a commit to pstaerk/espresso that referenced this issue Sep 29, 2020
@jngrad

jngrad commented Oct 2, 2020

It now fails at random on the ICP computer sheep. This could be a problem during the summer school.

@RudolfWeeber

My understanding is the following:

  • The error occurs in the bisection that finds the best alpha for a given real-space cutoff.
  • The error message means "You can't bisect if the evaluated function is not <0 at one end and >0 at the other end of the bisection interval."
  • In the MWE, this is the case because the real-space error for the cutoff that comes out of the previous tuning steps is never small enough.
  • The maximum real-space cutoff is limited by the size of the (MPI-local) box + skin. Otherwise, pairs would be missed in the short-range loop for the real-space part.
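
The bracketing precondition described above can be illustrated with a minimal Python sketch. This is not the dp3m_rtbisection implementation: the function name bisect_root, its parameters, and the toy functions in the usage lines are hypothetical, chosen only to show why a function that never changes sign on the interval cannot be bisected.

```python
import math


def bisect_root(f, lo, hi, tol=1e-9, max_iter=200):
    """Find a root of f in [lo, hi] by bisection (hypothetical helper).

    Precondition: f(lo) and f(hi) have opposite signs, or one of them is zero.
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0.0:
        # Same sign at both ends: no sign change to home in on.
        # Without this check the loop below would converge to an
        # arbitrary endpoint or never terminate meaningfully.
        raise ValueError("Root must be bracketed for bisection")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or 0.5 * (hi - lo) < tol:
            return mid
        if f_lo * f_mid < 0.0:
            hi = mid  # sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # sign change lies in the upper half
    return 0.5 * (lo + hi)


# Bracketed case: x^2 - 2 changes sign on [0, 2].
root = bisect_root(lambda x: x * x - 2.0, 0.0, 2.0)

# Unbracketed case: a toy "error minus target" that stays positive,
# analogous to a real-space error that is never small enough.
try:
    bisect_root(lambda x: x * x + 1.0, 0.0, 2.0)
except ValueError as err:
    message = str(err)
```

In this sketch the unbracketed call raises instead of looping, which mirrors the purpose of the assertion: the failure is reported at the point where the precondition is violated, while the actual cause lies in the earlier step that produced the interval.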

I assume that the previous tuning steps, which determine the number of mesh points, the assignment order and r_cut, don't properly handle the case where r_cut becomes larger than half the local box.
Unfortunately, this code is really hard to understand.
Rather than digging further, we should probably rewrite this in a more functional style and clearly write down the pre- and postconditions for each step.
While we are at it, we can introduce the possibility to inject artificial timings, so the entire thing can be tested.
And include skin tuning for that matter.
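
As a rough illustration of what such a rewrite could look like, here is a minimal sketch of one tuning step written as a pure function with explicit pre- and postconditions and an injectable timing callback. All names (Candidate, pick_fastest, measure_time, max_r_cut) are hypothetical and not part of the ESPResSo code base; the real tuning involves more parameters than shown here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One hypothetical parameter set considered during tuning."""
    mesh: int
    cao: int
    r_cut: float


def pick_fastest(candidates: Iterable[Candidate],
                 max_r_cut: float,
                 measure_time: Callable[[Candidate], float]) -> Optional[Candidate]:
    """Return the fastest candidate whose cutoff fits the local box.

    Precondition: max_r_cut > 0.
    Postcondition: the result is None, or 0 < result.r_cut <= max_r_cut.
    measure_time is injected, so tests can supply artificial timings
    instead of running a real force calculation.
    """
    if max_r_cut <= 0.0:
        raise ValueError("max_r_cut must be positive")
    feasible = [c for c in candidates if 0.0 < c.r_cut <= max_r_cut]
    if not feasible:
        return None  # caller decides how to report "no valid parameters"
    return min(feasible, key=measure_time)


# Artificial timings keyed by mesh size make the outcome deterministic.
fake_timings = {32: 2.0, 16: 0.5, 64: 1.0}
candidates = [Candidate(32, 3, 0.4), Candidate(16, 5, 0.9), Candidate(64, 2, 0.3)]
best = pick_fastest(candidates, 0.5, lambda c: fake_timings[c.mesh])
```

With the timings injected, the candidate with mesh 16 is rejected for its oversized cutoff even though it is the fastest, and the selection can be checked in a unit test without depending on the machine it runs on.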

@pkreissl, @pstaerk, where/under what condition did the issue occur in the tutorial?

@KaiSzuttor KaiSzuttor added the Bug label Nov 2, 2020