Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cavity function used directly instead of projected first #458

Merged
merged 18 commits into from
Nov 2, 2023

Conversation

gitpeterwind
Copy link
Member

@gitpeterwind gitpeterwind commented Jul 24, 2023

Avoids memory consuming function trees, by using directly the analytical function.
Note: requires the latest version of MRCPP
Gabriel will test/polish before merge into master

@gitpeterwind gitpeterwind added the WIP Work in progress label Jul 24, 2023
@robertodr
Copy link
Contributor

Thanks Peter!

@gitpeterwind gitpeterwind removed the WIP Work in progress label Oct 22, 2023
@gitpeterwind
Copy link
Member Author

How was it that we force it to use the latest mrcpp?

src/driver.cpp Outdated Show resolved Hide resolved
src/environment/SCRF.cpp Outdated Show resolved Hide resolved
src/environment/SCRF.cpp Outdated Show resolved Hide resolved
Density rho_tot(false);
computeDensities(Phi, rho_tot);

mrcpp::cplxfunc::multiply(first_term, rho_tot, this->epsilon, this->apply_prec);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's enough to use 1.0 / this->epsilon to avoid using flipFunction?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not work off the bat. I could do an operator overload of the / operator for the permittivity function, but that would just have the same function as the flipFunction method and would not necessarily work here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it would serve the same purpose, but it's at least clearer what is happening. I'm not the only one who's been confused by flipFunction(true) and flipFunction(false) :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the conclusion here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeah, I think since this wont save any memory (as far as I can see) and is meant more for ease of understanding the code flow, it can probably be incorporated in another PR.

src/environment/SCRF.cpp Outdated Show resolved Hide resolved
src/environment/SCRF.cpp Outdated Show resolved Hide resolved
src/environment/SCRF.cpp Outdated Show resolved Hide resolved
src/environment/SCRF.h Outdated Show resolved Hide resolved
src/environment/SCRF.h Outdated Show resolved Hide resolved
src/environment/SCRF.h Outdated Show resolved Hide resolved
src/environment/SCRF.h Outdated Show resolved Hide resolved
Copy link
Contributor

@robertodr robertodr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @gitpeterwind! You made a lot of simplifications that I've been also making in #456 😄 For which system(s) have you been testing the memory footprint reductions? And what is the before and after consumption?

I've left some comments, especially about argument passing. Please have a look.

@robertodr
Copy link
Contributor

robertodr commented Oct 23, 2023

Note that the solvation tests fail now:

         49 - reaction_operator (Failed)
         60 - Li_solvent_effect (Failed)

The former is a unit test, the latter an integration test. For the latter it might be enough to regenerate the reference output:

60:     JSON['output']['properties']['scf_energy']['E_next']....................................................................PASSED
60:     JSON['output']['properties']['scf_energy']['E_el']: computed value (-7.09457290) does not match (-7.09458382) to atol=1e-06, rtol=1e-06 by difference (0.00001092).
60:     JSON['output']['properties']['scf_energy']['Er_tot']: computed value (-0.06409434) does not match (-0.06408885) to atol=1e-06, rtol=1e-06 by difference (-0.00000550).
60:     JSON['output']['properties']['scf_energy']['Er_el']: computed value (0.12807619) does not match (0.12806527) to atol=1e-06, rtol=1e-06 by difference (0.00001092).
60:     JSON['output']['properties']['scf_energy']['Er_nuc']: computed value (-0.19217053) does not match (-0.19215412) to atol=1e-06, rtol=1e-06 by difference (-0.00001641).

@gitpeterwind
Copy link
Member Author

This looks great @gitpeterwind! You made a lot of simplifications that I've been also making in #456 😄 For which system(s) have you been testing the memory footprint reductions? And what is the before and after consumption?

I've left some comments, especially about argument passing. Please have a look.

Actually @Gabrielgerez did most of the work. I mostly provided the function to multiply directly the orbitals with a representable function. (PS: I am in Oslo, and busy this week)

src/environment/SCRF.cpp Outdated Show resolved Hide resolved
@Gabrielgerez
Copy link
Contributor

@robertodr @gitpeterwind Now I have fixed what was needed and updated the input parsing (since i removed a keyword) and tests. All tests should be passing now.
Sorry for the wait as well, attempted a big refactor to try and shave off the last few gb and it took longer than expected, so i had to branch it off

@codecov
Copy link

codecov bot commented Oct 30, 2023

Codecov Report

Attention: 12 lines in your changes are missing coverage. Please review.

Comparison is base (70e9fbd) 69.58% compared to head (516cc21) 69.25%.
Report is 20 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #458      +/-   ##
==========================================
- Coverage   69.58%   69.25%   -0.34%     
==========================================
  Files         180      179       -1     
  Lines       15027    14975      -52     
==========================================
- Hits        10457    10371      -86     
- Misses       4570     4604      +34     
Files Coverage Δ
src/driver.cpp 72.68% <100.00%> (-0.04%) ⬇️
src/initial_guess/core.cpp 99.21% <100.00%> (ø)
src/initial_guess/cube.cpp 95.94% <100.00%> (ø)
src/initial_guess/gto.cpp 100.00% <100.00%> (ø)
src/initial_guess/sad.cpp 98.92% <100.00%> (-1.08%) ⬇️
src/qmoperators/two_electron/FockBuilder.cpp 96.96% <100.00%> (+0.04%) ⬆️
src/qmoperators/two_electron/ReactionOperator.h 85.71% <ø> (-3.18%) ⬇️
src/qmoperators/two_electron/ReactionPotential.h 66.66% <ø> (-13.34%) ⬇️
tests/solventeffect/reaction_operator.cpp 100.00% <100.00%> (ø)
src/environment/SCRF.cpp 95.18% <96.42%> (+1.96%) ⬆️
... and 2 more

... and 11 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

src/environment/SCRF.cpp Outdated Show resolved Hide resolved
src/environment/SCRF.h Outdated Show resolved Hide resolved
Density rho_ext;
Density rho_tot;
Density rho_nuc; // As of right now, this is the biggest memory hog.
// Alternative could be to precompute its contributions, as a potential is not as heavy as a density (maybe)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think potentials are heavier than densities, as they are less local (@gitpeterwind correct me if I'm wrong) Also, the nuclei are not point charges, so the analytical form of the potential either doesn't exist in closed form (e.g. Gaussian charge densities will have the error function as potential) or it's not trivial to represent. It could be possible to represent the density analytically though... Other options:

  • not store rho_nuc inside the class and have it passed from the SCF object at each iteration,
  • compute Vr_n once (during initialization or at the first iteration) and only store that in this class. Memory-wise though this is probably more expensive...

For linear response this part of the density (and corresponding reaction potential) is not needed at all, so if we figure out a nice way to not store it, those kinds of calculations would be feasible for larger systems.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think potentials are heavier than densities, as they are less local

It is more the smoothness than the extension that matters. I think in practice potentials are smaller (less MWnodes) than density normally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So storing Vr_nuc only could lead to further savings?

Copy link
Contributor

@robertodr robertodr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but there's two comments @Gabrielgerez should chime in on before this can be merged. Also:

  • One curiosity: How are you measuring the memory consumption?
  • What are the final before-and-after ballpark numbers on how much memory is needed for a solvent calculation?

@Gabrielgerez
Copy link
Contributor

LGTM but there's two comments @Gabrielgerez should chime in on before this can be merged. Also:

  • One curiosity: How are you measuring the memory consumption?
  • What are the final before-and-after ballpark numbers on how much memory is needed for a solvent calculation?

Im using the log file output from slurm in betzy an example for a HF water calculation before and after:
before

Memory statistics, in GiB:
ID             Alloc   Usage
732516         968.0        
732516.batch   242.0    20.0
732516.orted   726.0    57.5

after:

Memory statistics, in GiB:
ID             Alloc   Usage
732519         968.0        
732519.batch   242.0     9.1
732519.orted   726.0    22.7

When testing we were using a PBE0 calculation, where the gain showed a bit more, but now I am doing some proper scaling calculations with different sizes.

@Gabrielgerez
Copy link
Contributor

LGTM but there's two comments @Gabrielgerez should chime in on before this can be merged. Also:

  • One curiosity: How are you measuring the memory consumption?
  • What are the final before-and-after ballpark numbers on how much memory is needed for a solvent calculation?

For the memory necessary for a solvent calculation, again an example of HF water
gas phase

Memory statistics, in GiB:
ID             Alloc   Usage
732514         968.0        
732514.batch   242.0     6.9
732514.orted   726.0    12.9

solvent (eps=2.0, default for everything else)

Memory statistics, in GiB:
ID             Alloc   Usage
732519         968.0        
732519.batch   242.0     9.1
732519.orted   726.0    22.7

@robertodr
Copy link
Contributor

Looks very good! What is the difference between batch and orted rows?

@Gabrielgerez
Copy link
Contributor

Looks very good! What is the difference between batch and orted rows?

I was hoping @gitpeterwind could answer, I am assuming one is the memory used by the main process while the other is the shared one. The machine docs do not have much documentation on this, that I could easily find

@gitpeterwind
Copy link
Member Author

gitpeterwind commented Nov 2, 2023

Looks very good! What is the difference between batch and orted rows?

"Batch" is for the master compute node, while "orted" (Open Runtime Environment Daemon) is for the sum of all others compute nodes. Your run on 4 compute nodes , each of them has 242 GB available (3*242=726).

@robertodr
Copy link
Contributor

So the sum of the two is the estimate of the memory consumption, right? If yes, the cost of the solvent went from ~4x to ~1.6x the cost of the vacuum calculation. Impressive!

@gitpeterwind
Copy link
Member Author

gitpeterwind commented Nov 2, 2023

So the sum of the two is the estimate of the memory consumption, right?

Yes, but the most relevant is the memory usage of the most memory consuming compute-node (normally master): it does not help to have compute nodes that consume little memory, if one of the compute nodes needs more memory than available. (compute nodes can not use memory from other compute nodes)

@robertodr robertodr merged commit f0d0560 into MRChemSoft:master Nov 2, 2023
10 checks passed
gitpeterwind added a commit to gitpeterwind/mrchem that referenced this pull request Aug 5, 2024
)

* cavity function used directly instead of projected first

* clear d_V

* Update SCRF.cpp 

define ranks in accelerator, so that all mpi ranks use kain

* cavity function used directly instead of projected first

* clear d_V

* Update SCRF.cpp 

define ranks in accelerator, so that all mpi ranks use kain

* manage to pull down to 12.2 and 30.0 gb

* reduced memory usage to ca. 3 gb from 8 gb

* Remove print statement for memory

* Small changes to desctructor and clear

* Use latest MRCPP

* Use default initialization in header file

* Do some cleaning after feedback

* Remove ´optimizer´ option from inputs

* Fix tests

---------

Co-authored-by: Roberto Di Remigio Eikås <[email protected]>
Co-authored-by: Gabrielgerez <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants