Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use sundials kinsol for diffusion dislocation creep #5698

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

bobmyhill
Copy link
Member

This PR replaces the handrolled Newton scheme in diffusion-dislocation creep with KINSOL routines.

@bobmyhill
Copy link
Member Author

/rebuild

@bobmyhill
Copy link
Member Author

@bangerth, would you mind having a look at this? Total average runtime for tests/diffusion_dislocation_fixed_strain_rate.prm is about 10% slower than with our old handrolled Newton solver, so presumably the internal iterations are much slower.

@bangerth
Copy link
Contributor

bangerth commented Jun 1, 2024

Yes, will look at this. This is exciting!

Comment on lines -95 to -96
while (std::abs(log_strain_rate_residual) > log_strain_rate_residual_threshold
&& stress_iteration < stress_max_iteration_number)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is your current stopping criterion. Could you check how many iterations you are using in this loop (in total or on average) in one of your benchmarks, and compare that against the number of iterations you're spending in KINSOLS? I think that might clarify why the code is so much slower.

You can also directly time individual sections by querying simulator_access.get_computing_timer(). You can take what's in particles/world.cc as an example of how you can put a timing section around a piece of code.

source/material_model/rheology/diffusion_dislocation.cc Outdated Show resolved Hide resolved
source/material_model/rheology/diffusion_dislocation.cc Outdated Show resolved Hide resolved
source/material_model/rheology/diffusion_dislocation.cc Outdated Show resolved Hide resolved
source/material_model/rheology/diffusion_dislocation.cc Outdated Show resolved Hide resolved
source/material_model/rheology/diffusion_dislocation.cc Outdated Show resolved Hide resolved
Comment on lines -16 to +18
Solving Stokes system... 16+0 iterations.
Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.68721e-16, 1.05609e-16, 0.000715698
Relative nonlinear residual (total system) after nonlinear iteration 2: 0.000715698
Solving Stokes system... 21+0 iterations.
Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.68721e-16, 1.05609e-16, 0.000716824
Relative nonlinear residual (total system) after nonlinear iteration 2: 0.000716824
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bummer that (at the moment) you need more iterations, but a good sign that the results are essentially identical.

@bobmyhill
Copy link
Member Author

@bangerth As requested, I've created a little standalone file by modifying step-77.cc. It exhibits the same behaviour as seen in traction_ascii_data.prm on this branch.
step-77.txt

@bangerth
Copy link
Contributor

bangerth commented Jun 2, 2024

At the very least, I can confirm the issue with the two nearby points:

     -176.46700446 -69.77069997 1
     -176.4670055 -69.77070101 1
     203.06842599 106.69630449 3.5
     203.06843156 106.69630608 3.5
     1.4210854715e-14 48.676754204 3.5
5

@bangerth
Copy link
Contributor

bangerth commented Jun 2, 2024

For reference, in the first call, this is the call trace:

#8  0x00007fffd03caafc in KINSolInit () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#9  0x00007fffd03c7f6e in KINSol () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5

whereas in the second (with the almost identical x value in the second column) I have

#8  0x00007fffd03d4680 in kinLsDQJtimes () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#9  0x00007fffd03d38da in kinLsATimes () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#10 0x00007fffd03d50ed in kinLsSolve () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#11 0x00007fffd03cdea5 in KINPicardFcnEval () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#12 0x00007fffd03cda38 in KINPicardAA () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#13 0x00007fffd03c813d in KINSol () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5

The third call is from here:

#8  0x00007fffd03cdb23 in KINPicardAA () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#9  0x00007fffd03c813d in KINSol () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5

Call number 4 is again from here:

#8  0x00007fffd03d4680 in kinLsDQJtimes () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#9  0x00007fffd03d38da in kinLsATimes () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#10 0x00007fffd03d50ed in kinLsSolve () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#11 0x00007fffd03cdea5 in KINPicardFcnEval () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#12 0x00007fffd03cda38 in KINPicardAA () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#13 0x00007fffd03c813d in KINSol () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5

And the last call is from

#8  0x00007fffd03cdb23 in KINPicardAA () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5
#9  0x00007fffd03c813d in KINSol () from /home/bangerth/bin/candi-install/sundials-5.7.0/lib/libsundials_kinsol.so.5

I will have to install the source code of SUNDIALS to make sense of this.

@bangerth
Copy link
Contributor

bangerth commented Jun 3, 2024

I decided to ask for help from my SUNDIALS friends at LLNL/sundials#500.

@bangerth
Copy link
Contributor

bangerth commented Jun 3, 2024

For completeness, this is the minimal testcase:

#include <deal.II/sundials/kinsol.h>


const double pressure = 59448242.437;
const double temperature = 327.2685405;
const double log_edot_ii = -40.053535387;
const double grain_size = 0.001;

const double gas_constant = 8.31446;

using namespace dealii;

namespace DiffusionCreepParameters
{
  const double prefactor = 1.5e-16;
  const double grain_size_exponent = 4;
  const double stress_exponent = 1;
  const double activation_energy = 375000;
  const double activation_volume = 6e-06;
};

namespace DislocationCreepParameters
{
  const double prefactor = 2e-15;
  const double stress_exponent = 3.5;
  const double activation_energy = 480000;
  const double activation_volume = 8e-06;
};


std::pair<double, double>
compute_diffusion_log_strain_rate_and_derivative (const double log_stress,
                                                  const double pressure,
                                                  const double temperature)
{
  const double log_strain_rate_diffusion = std::log(DiffusionCreepParameters::prefactor) +
                                           log_stress -
                                           DiffusionCreepParameters::grain_size_exponent * std::log(grain_size) -
                                           (DiffusionCreepParameters::activation_energy + pressure*DiffusionCreepParameters::activation_volume)/
                                           (gas_constant*temperature);

  const double dlog_strain_rate_dlog_stress_diffusion = 1.0;

  return std::make_pair(log_strain_rate_diffusion, dlog_strain_rate_dlog_stress_diffusion);
}



std::pair<double, double>
compute_dislocation_log_strain_rate_and_derivative (const double log_stress,
                                                    const double pressure,
                                                    const double temperature)
{
  const double log_strain_rate_dislocation = std::log(DislocationCreepParameters::prefactor) +
                                             DislocationCreepParameters::stress_exponent * log_stress -
                                             (DislocationCreepParameters::activation_energy + pressure*DislocationCreepParameters::activation_volume)/
                                             (gas_constant*temperature);

  const double dlog_strain_rate_dlog_stress_dislocation = DislocationCreepParameters::stress_exponent;

  return std::make_pair(log_strain_rate_dislocation, dlog_strain_rate_dlog_stress_dislocation);
}



std::pair<double, double>
compute_log_strain_rate_residual_and_derivative(const double current_log_stress_ii,
                                                const double pressure,
                                                const double temperature,
                                                const double log_edot_ii)
{
  const std::pair<double, double> log_diff_edot_and_deriv = compute_diffusion_log_strain_rate_and_derivative(current_log_stress_ii, pressure, temperature);
  const std::pair<double, double> log_disl_edot_and_deriv = compute_dislocation_log_strain_rate_and_derivative(current_log_stress_ii, pressure, temperature);

  const double strain_rate_diffusion = std::exp(log_diff_edot_and_deriv.first);
  const double strain_rate_dislocation = std::exp(log_disl_edot_and_deriv.first);
  double log_strain_rate_deriv = (strain_rate_diffusion * log_diff_edot_and_deriv.second + strain_rate_dislocation * log_disl_edot_and_deriv.second)/
                                 (strain_rate_diffusion + strain_rate_dislocation);
  const double log_strain_rate_iterate = std::log(strain_rate_diffusion + strain_rate_dislocation);
  return std::make_pair(log_strain_rate_iterate - log_edot_ii, log_strain_rate_deriv);
}



int main()
{
  const double maximum_viscosity = 1.e30;
  double log_strain_rate_deriv;

  // For diffusion creep, viscosity is grain size dependent
  const double prefactor_stress_diffusion = DiffusionCreepParameters::prefactor *
                                            std::pow(grain_size, -DiffusionCreepParameters::grain_size_exponent) *
                                            std::exp(-(std::max(DiffusionCreepParameters::activation_energy + pressure * DiffusionCreepParameters::activation_volume, 0.0)) /
                                                     (gas_constant * temperature));

  SUNDIALS::KINSOL<Vector<double>>::AdditionalData additional_data;
  additional_data.strategy = dealii::SUNDIALS::KINSOL<>::AdditionalData::newton;
  additional_data.function_tolerance = 1e-10;
  additional_data.maximum_non_linear_iterations = 200;
  additional_data.maximum_setup_calls = 10;

  double n_residual_evaluations = 0;
  SUNDIALS::KINSOL<Vector<double>> nonlinear_solver(additional_data);

  
  nonlinear_solver.reinit_vector = [&](Vector<double> &x)
    {
      x.reinit(1);
    };

  
  nonlinear_solver.residual =
    [&](const Vector<double> &current_log_stress_ii,
        Vector<double> &residual)
      {
        std::tie(residual(0), log_strain_rate_deriv) = compute_log_strain_rate_residual_and_derivative(current_log_stress_ii[0], pressure, temperature, log_edot_ii);

        std::cout << std::setprecision(11) << "     Computing residual at x=" << current_log_stress_ii[0] << ", f(x)=" << residual(0) << std::endl;
        n_residual_evaluations += 1;
      };

  
  nonlinear_solver.setup_jacobian =
    [&](const Vector<double> & current_log_stress_ii,
        const Vector<double> & /*current_f*/)
      {
        // Do nothing here, because we calculate the Jacobian in the residual function
        std::cout << "     Recomputing J at x=" << current_log_stress_ii[0] << std::endl;
      };

  
  nonlinear_solver.solve_with_jacobian = [&](const Vector<double> &residual,
                                             Vector<double> &solution,
                                             const double /*tolerance*/)
    {
      std::cout << "     Solving for dx with residual=" << residual[0] << std::endl;
          
      solution(0) = residual(0) / log_strain_rate_deriv;
    };



  // Start with the assumption that all strain is accommodated by diffusion creep:
  // If the diffusion creep prefactor is very small, that means that the diffusion viscosity is very large.
  // In this case, use the maximum viscosity instead to compute the starting guess.
  const double stress_ii = (prefactor_stress_diffusion > (0.5 / maximum_viscosity)
                            ? std::exp(log_edot_ii) / prefactor_stress_diffusion
                            : 0.5 / maximum_viscosity);

  // KINSOL works on vectors and so the scalar (log) stress is inserted into
  // a vector of length 1
  Vector<double> log_stress_ii(1);
  log_stress_ii[0] = std::log(stress_ii);

  nonlinear_solver.solve(log_stress_ii);
  std::cout << n_residual_evaluations << " residual evaluations" << std::endl;
}

@bangerth
Copy link
Contributor

bangerth commented Jun 4, 2024

@bobmyhill Is it expected that in your example, log_strain_rate_deriv happens to be exactly 1.0 in the first iteration, and exactly 3.5 in the next? These are values too nice to be coincidences.

@bobmyhill
Copy link
Member Author

Hi @bangerth, yes, the function is almost piecewise linear (with a smoothed kink).

@bobmyhill
Copy link
Member Author

@bangerth I guess this is ready from my side. I think the slowdown is probably just the number of iterations for this particular case.

One thought - would it be possible / worth it to move any of the KINSOL setup out of calculate_isostrain_viscosities? additional_data could be moved out, but I'm not sure anything else can.

Copy link
Member

@gassmoeller gassmoeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what will happen with this PR, but just in case I think I found a case of unnecessary copies in the residual function.

Comment on lines 145 to 146
const Rheology::DiffusionCreepParameters diffusion_creep_parameters,
const Rheology::DislocationCreepParameters dislocation_creep_parameters,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want to hand over these two structures by reference to avoid the copies.

Copy link
Contributor

@bangerth bangerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some (small) comments.

Comment on lines 83 to 88
// We use KINSOL to find the second invariant of the stress tensor that satisfies the total strain rate.
SUNDIALS::KINSOL<Vector<double>>::AdditionalData additional_data;
additional_data.strategy = dealii::SUNDIALS::KINSOL<>::AdditionalData::newton;
additional_data.function_tolerance = log_strain_rate_residual_threshold;
additional_data.maximum_non_linear_iterations = stress_max_iteration_number;
additional_data.maximum_setup_calls = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can definitely move this out of the loop over the chemical composition fields.

@@ -54,10 +55,10 @@ namespace aspect
const double edot_ii = std::max(std::sqrt(std::max(-second_invariant(deviator(strain_rate)), 0.)),
min_strain_rate);
const double log_edot_ii = std::log(edot_ii);

double log_strain_rate_deriv;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put that declaration right above the lambda functions that use it to minimize the scope of the variable.

// KINSOL works on vectors and so the scalar (log) stress is inserted into
// a vector of length 1
Vector<double> log_stress_ii(1);
log_stress_ii[0] = std::log(stress_ii);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is your starting guess. How is it chosen? Based on the current log stress? It might be worth trying the output of the previous loop iteration, under the assumption that what you found in the previous iteration is a good starting guess for the current iteration. Not sure whether that's true, since you're not looping over quadrature points, for example, but a thought.

[&](const Vector<double> &current_log_stress_ii,
Vector<double> &residual)
{
std::tie(residual(0), log_strain_rate_deriv) = compute_log_strain_rate_residual_and_derivative(current_log_stress_ii[0], pressure, temperature, diffusion_creep_parameters, dislocation_creep_parameters, log_edot_ii);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe break this line somewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants