cavity function used directly instead of projected first #458

gitpeterwind · 2023-07-24T12:40:57Z

Avoids memory-consuming function trees by using the analytical function directly.
Note: requires the latest version of MRCPP.
Gabriel will test/polish before the merge into master.
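
To illustrate the idea, here is a minimal sketch on a uniform 1D grid rather than on MRCPP's multiwavelet trees. The names `Grid`, `multiply_projected`, `multiply_direct` and the toy `cavity` function are hypothetical and are not part of MRCPP or MRChem. The only point is that the second variant never stores a grid-sized representation of the analytic function.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Toy stand-in for a function representation: samples on a uniform grid.
// (Assumption: the real code uses adaptive function trees, not a flat vector.)
struct Grid {
    double x0;      // left edge
    double dx;      // grid spacing
    std::size_t n;  // number of points
    double point(std::size_t i) const { return x0 + dx * static_cast<double>(i); }
};

using Analytic = std::function<double(double)>;

// Variant 1: project the analytic function first, then multiply.
// This allocates an extra grid-sized buffer for the projected function.
std::vector<double> multiply_projected(const Grid &g, const std::vector<double> &orbital, const Analytic &f) {
    std::vector<double> projected(g.n);
    for (std::size_t i = 0; i < g.n; ++i) projected[i] = f(g.point(i));
    std::vector<double> out(g.n);
    for (std::size_t i = 0; i < g.n; ++i) out[i] = orbital[i] * projected[i];
    return out;
}

// Variant 2: evaluate the analytic function on the fly while multiplying.
// No intermediate representation of f is stored.
std::vector<double> multiply_direct(const Grid &g, const std::vector<double> &orbital, const Analytic &f) {
    std::vector<double> out(g.n);
    for (std::size_t i = 0; i < g.n; ++i) out[i] = orbital[i] * f(g.point(i));
    return out;
}

// Hypothetical smooth switching function used as a toy "cavity".
double cavity(double x) { return 0.5 * (1.0 + std::erf(1.0 - std::abs(x))); }
```

Both variants return the same product; they differ only in whether the intermediate `projected` buffer exists.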

robertodr · 2023-07-25T06:40:54Z

Thanks Peter!

gitpeterwind · 2023-10-22T18:45:37Z

How do we force it to use the latest MRCPP, again?

robertodr · 2023-10-23T06:35:19Z

src/environment/SCRF.cpp

+    Density rho_tot(false);
+    computeDensities(Phi, rho_tot);
+
+    mrcpp::cplxfunc::multiply(first_term, rho_tot, this->epsilon, this->apply_prec);


Maybe it's enough to use 1.0 / this->epsilon to avoid using flipFunction?

It does not work right off the bat. I could overload the / operator for the permittivity function, but that would do the same job as the flipFunction method and would not necessarily work here.

I agree it would serve the same purpose, but it's at least clearer what is happening. I'm not the only one who's been confused by flipFunction(true) and flipFunction(false) :)

What's the conclusion here?

Oh yeah. Since this won't save any memory (as far as I can see) and is meant more to make the code flow easier to understand, it can probably be incorporated in another PR.
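
For reference, the `1.0 / epsilon` spelling discussed in this thread could look roughly like the sketch below. The `Permittivity` class here is a simplified, hypothetical stand-in (a 1D callable with an assumed exponential interpolation between `eps_in` and `eps_out`), not the MRChem class, and as noted above it does the same job as a flip flag and saves no memory. It only makes the call site say what it means.

```cpp
#include <cmath>
#include <functional>
#include <utility>

// Hypothetical, simplified permittivity: interpolates between eps_in
// (inside the cavity, C = 1) and eps_out (outside the cavity, C = 0).
class Permittivity {
public:
    Permittivity(std::function<double(double)> cavity, double eps_in, double eps_out)
            : cavity_(std::move(cavity)), eps_in_(eps_in), eps_out_(eps_out) {}

    // Evaluate epsilon(r).
    double operator()(double r) const {
        return eps_in_ * std::exp(std::log(eps_out_ / eps_in_) * (1.0 - cavity_(r)));
    }

private:
    std::function<double(double)> cavity_;
    double eps_in_;
    double eps_out_;
};

// 'scalar / epsilon' yields a new callable. The original object is not
// mutated, so there is no flag state to keep track of at the call site.
std::function<double(double)> operator/(double scalar, const Permittivity &eps) {
    return [scalar, eps](double r) { return scalar / eps(r); };
}
```

With this in place, `auto inv_eps = 1.0 / eps;` gives a callable that evaluates the inverse permittivity, and `eps` itself stays unchanged.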

robertodr

This looks great @gitpeterwind! You made a lot of simplifications that I've also been making in #456 😄 For which system(s) have you been testing the memory footprint reductions? And what is the memory consumption before and after?

I've left some comments, especially about argument passing. Please have a look.

robertodr · 2023-10-23T06:54:36Z

Note that the solvation tests fail now:

         49 - reaction_operator (Failed)
         60 - Li_solvent_effect (Failed)

The former is a unit test, the latter an integration test. For the latter it might be enough to regenerate the reference output:

60:     JSON['output']['properties']['scf_energy']['E_next']....................................................................PASSED
60:     JSON['output']['properties']['scf_energy']['E_el']: computed value (-7.09457290) does not match (-7.09458382) to atol=1e-06, rtol=1e-06 by difference (0.00001092).
60:     JSON['output']['properties']['scf_energy']['Er_tot']: computed value (-0.06409434) does not match (-0.06408885) to atol=1e-06, rtol=1e-06 by difference (-0.00000550).
60:     JSON['output']['properties']['scf_energy']['Er_el']: computed value (0.12807619) does not match (0.12806527) to atol=1e-06, rtol=1e-06 by difference (0.00001092).
60:     JSON['output']['properties']['scf_energy']['Er_nuc']: computed value (-0.19217053) does not match (-0.19215412) to atol=1e-06, rtol=1e-06 by difference (-0.00001641).
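
To make the failure messages concrete, here is a small sketch of a combined absolute/relative tolerance check. It assumes the test harness combines `atol` and `rtol` in the style of `numpy.isclose`, which the log does not confirm; the function name `matches` is hypothetical.

```cpp
#include <cmath>

// Combined absolute/relative tolerance check in the style of numpy.isclose:
//   |computed - expected| <= atol + rtol * |expected|
// Assumption: the test harness combines atol and rtol this way.
bool matches(double computed, double expected, double atol = 1e-6, double rtol = 1e-6) {
    return std::abs(computed - expected) <= atol + rtol * std::abs(expected);
}
```

Under that assumption, all four reported energies fall outside the tolerance: for E_el, for example, the difference of about 1.1e-5 exceeds the allowed 1e-6 + 1e-6 * 7.09, which is about 8.1e-6.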

gitpeterwind · 2023-10-23T07:57:42Z

Actually, @Gabrielgerez did most of the work. I mostly provided the function that multiplies the orbitals directly with a representable function. (PS: I am in Oslo and busy this week.)

Gabrielgerez · 2023-10-30T15:19:51Z

@robertodr @gitpeterwind I have now fixed what was needed and updated the input parsing (since I removed a keyword) and the tests. All tests should be passing now.
Sorry for the wait as well. I attempted a big refactor to try to shave off the last few GB, and it took longer than expected, so I had to branch it off.

codecov · 2023-10-30T15:52:41Z

Codecov Report

Attention: 12 lines in your changes are missing coverage. Please review.

Comparison is base (70e9fbd) 69.58% compared to head (516cc21) 69.25%.
Report is 20 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #458      +/-   ##
==========================================
- Coverage   69.58%   69.25%   -0.34%     
==========================================
  Files         180      179       -1     
  Lines       15027    14975      -52     
==========================================
- Hits        10457    10371      -86     
- Misses       4570     4604      +34

Files	Coverage Δ
src/driver.cpp	`72.68% <100.00%> (-0.04%)`	⬇️
src/initial_guess/core.cpp	`99.21% <100.00%> (ø)`
src/initial_guess/cube.cpp	`95.94% <100.00%> (ø)`
src/initial_guess/gto.cpp	`100.00% <100.00%> (ø)`
src/initial_guess/sad.cpp	`98.92% <100.00%> (-1.08%)`	⬇️
src/qmoperators/two_electron/FockBuilder.cpp	`96.96% <100.00%> (+0.04%)`	⬆️
src/qmoperators/two_electron/ReactionOperator.h	`85.71% <ø> (-3.18%)`	⬇️
src/qmoperators/two_electron/ReactionPotential.h	`66.66% <ø> (-13.34%)`	⬇️
tests/solventeffect/reaction_operator.cpp	`100.00% <100.00%> (ø)`
src/environment/SCRF.cpp	`95.18% <96.42%> (+1.96%)`	⬆️
... and 2 more

... and 11 files with indirect coverage changes


robertodr · 2023-11-01T09:53:44Z

src/environment/SCRF.h

-    Density rho_ext;
-    Density rho_tot;
+    Density rho_nuc; // As of right now, this is the biggest memory hog.
+    // Alternative could be to precompute its contributions, as a potential is not as heavy as a density (maybe)


I think potentials are heavier than densities, as they are less local (@gitpeterwind, correct me if I'm wrong). Also, the nuclei are not point charges, so the analytical form of the potential either doesn't exist in closed form (e.g. Gaussian charge densities will have the error function as potential) or is not trivial to represent. It could be possible to represent the density analytically, though. Other options:

- not store rho_nuc inside the class and instead have it passed from the SCF object at each iteration;
- compute Vr_n once (during initialization or at the first iteration) and store only that in this class. Memory-wise, though, this is probably more expensive.

For linear response this part of the density (and corresponding reaction potential) is not needed at all, so if we figure out a nice way to not store it, those kinds of calculations would be feasible for larger systems.

I think potentials are heavier than densities, as they are less local

It is the smoothness rather than the extent that matters. I think that in practice potentials are normally smaller (fewer MWnodes) than densities.

So storing Vr_nuc only could lead to further savings?
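
The first option raised in this thread (not storing rho_nuc in the class, but having the caller pass it in at each iteration) is sketched below. All types and names are hypothetical stand-ins, not the MRChem classes, and `integrate` is only a placeholder for real work. Whether this lowers peak memory in practice depends on whether the caller has to keep the density alive anyway.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a heavy function representation.
struct Density {
    std::vector<double> coefs;
};

// Placeholder for real work done with a density.
inline double integrate(const Density &rho) {
    return std::accumulate(rho.coefs.begin(), rho.coefs.end(), 0.0);
}

// Option A: the solver owns rho_nuc for its whole lifetime.
class OwningSolver {
public:
    explicit OwningSolver(Density rho_nuc) : rho_nuc_(std::move(rho_nuc)) {}
    double total_charge(const Density &rho_el) const { return integrate(rho_el) + integrate(rho_nuc_); }

private:
    Density rho_nuc_;  // kept in memory as long as the solver exists
};

// Option B: the caller (e.g. an SCF driver) lends rho_nuc per iteration.
// The solver holds no copy, so a calculation that never needs rho_nuc
// never has to allocate it on the solver's behalf.
class BorrowingSolver {
public:
    double total_charge(const Density &rho_el, const Density &rho_nuc) const {
        return integrate(rho_el) + integrate(rho_nuc);
    }
};
```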

robertodr

LGTM, but there are two comments @Gabrielgerez should chime in on before this can be merged. Also:

One curiosity: How are you measuring the memory consumption?
What are the final before-and-after ballpark numbers on how much memory is needed for a solvent calculation?

Gabrielgerez · 2023-11-01T15:16:54Z

I'm using the log file output from Slurm on Betzy. Here is an example for an HF water calculation, before and after.
Before:

Memory statistics, in GiB:
ID             Alloc   Usage
732516         968.0        
732516.batch   242.0    20.0
732516.orted   726.0    57.5

After:

Memory statistics, in GiB:
ID             Alloc   Usage
732519         968.0        
732519.batch   242.0     9.1
732519.orted   726.0    22.7

When testing, we were using a PBE0 calculation, where the gain was somewhat larger. I am now running proper scaling calculations on systems of different sizes.

Gabrielgerez · 2023-11-01T15:32:07Z

As for the memory needed for a solvent calculation, here is again an example for HF water.
Gas phase:

Memory statistics, in GiB:
ID             Alloc   Usage
732514         968.0        
732514.batch   242.0     6.9
732514.orted   726.0    12.9

Solvent (eps=2.0, defaults for everything else):

Memory statistics, in GiB:
ID             Alloc   Usage
732519         968.0        
732519.batch   242.0     9.1
732519.orted   726.0    22.7

robertodr · 2023-11-01T17:54:29Z

Looks very good! What is the difference between batch and orted rows?

Gabrielgerez · 2023-11-02T09:36:22Z

I was hoping @gitpeterwind could answer. I assume one is the memory used by the main process while the other is the shared one. The machine docs do not have much documentation on this that I could easily find.

gitpeterwind · 2023-11-02T09:50:14Z

"Batch" is for the master compute node, while "orted" (Open Run-Time Environment daemon) is for the sum of all the other compute nodes. Your run used 4 compute nodes, each with 242 GiB available (3*242=726).

robertodr · 2023-11-02T10:11:06Z

So the sum of the two is the estimate of the memory consumption, right? If yes, the cost of the solvent went from ~4x to ~1.6x the cost of the vacuum calculation. Impressive!

gitpeterwind · 2023-11-02T10:21:15Z

So the sum of the two is the estimate of the memory consumption, right?

Yes, but the most relevant number is the memory usage of the most memory-consuming compute node (normally the master). It does not help that some compute nodes consume little memory if one of them needs more memory than it has available, since compute nodes cannot use memory from other compute nodes.

* cavity function used directly instead of projected first
* clear d_V
* Update SCRF.cpp
  define ranks in accelerator, so that all mpi ranks use kain
* cavity function used directly instead of projected first
* clear d_V
* Update SCRF.cpp
  define ranks in accelerator, so that all mpi ranks use kain
* manage to pull down to 12.2 and 30.0 gb
* reduced memory usage to ca. 3 gb from 8 gb
* Remove print statement for memory
* Small changes to desctructor and clear
* Use latest MRCPP
* Use default initialization in header file
* Do some cleaning after feedback
* Remove ´optimizer´ option from inputs
* Fix tests

---------

Co-authored-by: Roberto Di Remigio Eikås <[email protected]>
Co-authored-by: Gabrielgerez <[email protected]>

cavity function used directly instead of projected first

02cb7ea

gitpeterwind added the WIP Work in progress label Jul 24, 2023

gitpeterwind requested a review from Gabrielgerez July 24, 2023 12:41

Gabrielgerez mentioned this pull request Sep 20, 2023

Running solvent in MPI doesn't change the update, and cant converge #462

Open

gitpeterwind and others added 11 commits October 13, 2023 10:46

clear d_V

0fd90c5

Update SCRF.cpp

0e04bcf

define ranks in accelerator, so that all mpi ranks use kain

cavity function used directly instead of projected first

9c0c0b9

clear d_V

c1af3c5

Update SCRF.cpp

b866818

define ranks in accelerator, so that all mpi ranks use kain

manage to pull down to 12.2 and 30.0 gb

ffe9daf

reduced memory usage to ca. 3 gb from 8 gb

2f27ea5

Remove print statement for memory

ec49637

Small changes to desctructor and clear

3990672

Merge branch 'vecmult' into vecmult

76baa70

Merge pull request #3 from Gabrielgerez/vecmult

a160fe2

Rebases and cleaning up

gitpeterwind removed the WIP Work in progress label Oct 22, 2023

gitpeterwind requested review from stigrj and robertodr October 22, 2023 18:44

Use latest MRCPP

8c9083a