ESPResSo benchmark tolerances need adjustments #161

Open
jngrad opened this issue Jun 28, 2024 · 0 comments

jngrad commented Jun 28, 2024

The ESPResSo benchmarks measure physical quantities, such as pressures and energies, and compare them against reference values. Some algorithms, like the FFT, involve a large number of floating-point operations that inevitably lead to precision loss. On x86 architectures that implement the extended precision format, ESPResSo benefits from the 80-bit representation when computing reductions that involve a long sequence of basic arithmetic operations (trilinear interpolation, Taylor series) or exponentials: intermediate values are stored in 80-bit wide registers and are only truncated to 64-bit floats when they are written to the stack or heap memory, typically at the end of the calculation.
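To illustrate how a long reduction loses precision when every intermediate value is rounded to 64 bits, here is a minimal sketch in plain Python. It is not ESPResSo code. Python cannot access 80-bit registers directly, so `math.fsum`, which tracks partial sums without intermediate rounding, is used as a stand-in for a higher-precision accumulator; the input data and the helper name `naive_sum` are made up for the example.

```python
import math


def naive_sum(values):
    """Accumulate in a 64-bit float, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total


# Made-up input: one million copies of a value with no exact binary representation.
values = [0.1] * 1_000_000

naive = naive_sum(values)
# Stand-in for a wider accumulator: fsum rounds only once, at the end.
reference = math.fsum(values)

rel_dev = abs(naive - reference) / abs(reference)
print(f"naive={naive!r} reference={reference!r} relative deviation={rel_dev:.3e}")
```

The relative deviation grows with the number of accumulated terms, which is the effect that wider intermediate registers reduce.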

The ESPResSo testsuite uses different tolerances depending on whether the algorithm is implemented for 64-bit or 32-bit floating-point values; the latter is typically used when offloading to the GPU. However, we do not have a mechanism in place to detect whether the FPU uses 80-bit or 64-bit wide registers. For this reason, the ESPResSo team needs to periodically adjust tolerances by running the testsuite on RISC architectures, which don't have 80-bit wide registers.
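As a rough sketch of what a precision-dependent tolerance looks like, the snippet below selects a relative tolerance from the floating-point width of the implementation. The table `RTOL_BY_PRECISION`, its values, and the helper `assert_close` are hypothetical and do not reflect the actual ESPResSo testsuite.

```python
import math

# Hypothetical relative tolerances; placeholder values, not taken from ESPResSo.
RTOL_BY_PRECISION = {
    "double": 1e-10,  # 64-bit implementation (CPU)
    "single": 1e-5,   # 32-bit implementation (typically GPU)
}


def assert_close(value, reference, precision):
    """Raise AssertionError if value deviates from reference beyond the tolerance."""
    rtol = RTOL_BY_PRECISION[precision]
    if not math.isclose(value, reference, rel_tol=rtol):
        raise AssertionError(
            f"{value!r} differs from {reference!r} by more than rtol={rtol:g}"
        )


# A deviation of 1e-7 is acceptable for a single-precision kernel ...
assert_close(1.0 + 1e-7, 1.0, "single")
# ... but would be rejected for a double-precision kernel.
try:
    assert_close(1.0 + 1e-7, 1.0, "double")
except AssertionError as err:
    print("rejected:", err)
```

Such a table has no entry for the 80-bit versus 64-bit register distinction, which is the gap described above.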

The recently merged P3M ionic crystal benchmark for ESPResSo revealed relatively large deviations from the expected solutions. While the calculated energy of the crystal is correct, its relative deviation from the reference value is 200 times larger on Deucalion's ARMv8.2-A (Fujitsu A64FX) than on Vega's Zen2 (AMD EPYC Rome 7H12). We could update the test tolerance accordingly, but this would prevent us from detecting unexpected accuracy losses on hardware where ESPResSo is known to leverage the extended precision format.
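The trade-off can be made concrete with a small sketch. Only the factor of 200 comes from the observation above; the base tolerance, the reference energy, and the size of the simulated regression are made-up placeholders, not measured values.

```python
def relative_deviation(value, reference):
    """Relative deviation of a measured value from its reference."""
    return abs(value - reference) / abs(reference)


# Placeholder numbers, chosen only to illustrate the argument.
base_rtol = 1e-9                # hypothetical tolerance tuned on x86
widened_rtol = 200 * base_rtol  # tolerance loosened to accommodate the ARM result
reference_energy = -1.0         # placeholder reference value

# Simulated accuracy regression on x86: 50 times worse than the tuned tolerance.
regressed_energy = reference_energy * (1 + 50 * base_rtol)
deviation = relative_deviation(regressed_energy, reference_energy)

print("caught by base tolerance:   ", deviation > base_rtol)
print("caught by widened tolerance:", deviation > widened_rtol)
```

With a single widened tolerance, the simulated regression passes unnoticed on the platform that should have been held to the stricter bound.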

If there is a portable way of detecting hardware precision from the Python interface, we could tailor the test tolerances using a lookup table. This might not be trivial, because precision also depends on compiler flags, such as architecture-dependent optimizations and "fast math" optimizations. In addition, when using long-range solvers like P3M, the number of mathematical operations increases with the system size, which also affects precision loss.
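One possible starting point, sketched below under several assumptions, is to combine `platform.machine()` with the significand width that NumPy reports for `long double`. The table `TOLERANCES`, its values, the set `X86_MACHINES`, and both helper functions are hypothetical. The probe only indicates that the platform offers an 80-bit `long double` type; it cannot tell whether the compiled ESPResSo code keeps 64-bit intermediates in 80-bit registers, since that depends on the compiler flags mentioned above.

```python
import platform

# Hypothetical lookup table; placeholder values, not ESPResSo tolerances.
TOLERANCES = {
    "extended": 1e-12,  # platform offers the x87 80-bit format
    "double": 1e-10,    # everything else
}

# Assumed set of machine strings that identify x86 platforms.
X86_MACHINES = {"x86_64", "amd64", "i386", "i686", "x86"}


def long_double_mantissa_bits():
    """Return the stored mantissa width of C long double, or None without NumPy."""
    try:
        import numpy as np
    except ImportError:
        return None
    return int(np.finfo(np.longdouble).nmant)


def precision_class(machine=None, nmant=None):
    """Coarse guess: 'extended' only for x86 with a 63-bit stored mantissa."""
    machine = (machine or platform.machine()).lower()
    if nmant is None:
        nmant = long_double_mantissa_bits()
    if machine in X86_MACHINES and nmant == 63:
        return "extended"
    return "double"


print(precision_class(), TOLERANCES[precision_class()])
```

The machine check is needed because some non-x86 platforms report a wider `long double` (for example a 112-bit mantissa for the quadruple format) that is implemented in software rather than in FPU registers. A runtime probe inside the compiled core would be more reliable than this Python-side guess, but it is outside the scope of this sketch.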
