
Observables: Particle selection and filtering should be independent of observable #3154

Closed
fweik opened this issue Sep 11, 2019 · 7 comments · Fixed by #4748

Comments

@fweik
Contributor

fweik commented Sep 11, 2019

For particle observables, the selection and filtering of particles should be independent
of what is calculated. These are the observables currently derived from Observables::PidObservable,
which already form a separate class hierarchy. One plan of action could be to change the
signature of virtual std::vector<double> Observables::PidObservable::evaluate(PartCfg &partCfg) const to virtual std::vector<double> Observables::PidObservable::evaluate(Utils::Span<const Particle *>) const, and to resolve the ids in the base class. Further, the actual implementation should potentially be a member rather than a derived class. This would allow, e.g., filtering out virtual particles in a central place and reusing the implementation of the observable function with other ways of selecting particles.

@fweik fweik added this to the Espresso 4.2 milestone Dec 10, 2019
kodiakhq bot added a commit that referenced this issue Jan 5, 2020
First part of #3154.

Description of changes:
 - Move id resolution to the base class of the particle observables
@RudolfWeeber RudolfWeeber removed this from the Espresso 4.2 milestone Oct 24, 2020
@jngrad
Member

jngrad commented Aug 9, 2022

As far as I understand it, this ticket was addressed in the 4.2.0 release. Particle selection is now carried out in a central place and no longer depends on the partCfg global variable, code duplication in particle observables was removed via particle traits, and weighted sum algorithms were introduced to nullify contributions from virtual sites in derived statistics, e.g. in the center of mass observable. Closing.

@jngrad jngrad closed this as completed Aug 9, 2022
@RudolfWeeber
Contributor

RudolfWeeber commented Aug 9, 2022 via email

@jngrad jngrad reopened this Aug 9, 2022
@jngrad
Member

jngrad commented May 15, 2023

Here is a benchmark for a Lennard-Jones simulation that shows the observables framework is much slower than MPI-IO due to communication overhead, even though observables are RAM-only and MPI-IO writes to disk:

Benchmarks for a Lennard-Jones simulation with three types of writer: parallel HDF5, MPI-IO, observables. For simulations with 8 cores or more, observables are twice as slow as MPI-IO, even though observables do not write to disk. Writing less frequently amortizes the slowdown, but MPI-IO remains more efficient than observables.

A bottleneck for particle-based observables is std::vector<Particle> fetch_particles(std::vector<int> const &ids) in src/particle_observables/include/particle_observables/observable.hpp, which communicates the entire particle structure of particle i from MPI rank j to MPI rank 0; MPI rank 0 then has to extract the trait (e.g. the particle force) from each particle copy. Instead, we would like each MPI rank to extract the trait from its local particles and consolidate the values into a local vector, then communicate that vector into a vector of vectors on MPI rank 0. In addition, one needs to send an extra vector of particle ids, so that MPI rank 0 knows in which order the data arrived. One can then re-order the particle data based on that order and the input pids list, which is not necessarily in ascending order. The communication can be achieved via an MPI gather operation using boost::mpi::gather().

@jngrad
Member

jngrad commented May 15, 2023

Here is a MWE that illustrates how particle properties can be gathered and resorted on MPI rank 0.

MWE (click to unroll)
#include <boost/mpi/collectives/gather.hpp>
#include <boost/serialization/vector.hpp>

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>
#include <memory>
#include <vector>

struct Particle {
  Particle(int id, double mass, double charge) : id(id), mass(mass), charge(charge) {
    boost::mpi::communicator comm;
    std::cout << "creating particle with id " << id << " on MPI rank " << comm.rank() << "\n";
  }
  int id;
  double mass;
  double charge;
};

class CellStructure {
  std::vector<Particle*> local_particles;

public:
  Particle *get_local_particle(int id) const {
    auto const index = static_cast<std::size_t>(id);
    if (index >= local_particles.size())
      return nullptr;
    return local_particles[id];
  }
  ~CellStructure() {
    for (auto const ptr : local_particles) {
      delete ptr; // deleting a nullptr is a no-op
    }
  }
  void create_particle(int id, double mass, double charge) {
    auto const index = static_cast<std::size_t>(id) + 1;
    if (index >= local_particles.size()) {
      local_particles.resize(index);
    }
    local_particles[id] = new Particle{id, mass, charge};
  }
};

static CellStructure cell_structure;

template <typename ParticleStruct>
struct ParticleTrait {
  int get_pid(ParticleStruct const &p) const { return p.id; }
  double get_mass(ParticleStruct const &p) const { return p.mass; }
  double get_charge(ParticleStruct const &p) const { return p.charge; }
};

void populate_particles() {
  boost::mpi::communicator comm;
  // create a system of particles, each MPI rank has a unique set of particles
  for (int i = 0, n_max = 2, rank = comm.rank(); i < n_max; ++i) {
    auto const unique_id = i + rank * n_max;
    auto const mass = 0.5 + static_cast<double>(unique_id);
    auto const charge = (i % 2 == 0) ? +1. : -1.;
    ::cell_structure.create_particle(unique_id, mass, charge);
  }
}

auto extract_particle_masses(std::vector<int> const &pids) {
  auto const destination_rank = 0;
  boost::mpi::communicator comm;
  ParticleTrait<Particle> trait_extractor{};
  std::vector<double> output;
  std::vector<double> local_traits;
  std::vector<int> local_pids;
  for (auto const pid : pids) {
    auto const ptr = ::cell_structure.get_local_particle(pid);
    if (ptr) {
      local_traits.emplace_back(trait_extractor.get_mass(*ptr));
      local_pids.emplace_back(trait_extractor.get_pid(*ptr));
    }
  }
  std::vector<std::vector<double>> global_traits;
  std::vector<std::vector<int>> global_pids;
  boost::mpi::gather(comm, local_traits, global_traits, destination_rank);
  boost::mpi::gather(comm, local_pids, global_pids, destination_rank);
  if (comm.rank() == destination_rank) {
    // flatten collected data
    std::vector<double> particle_traits;
    std::vector<int> particle_pids;
    for (auto const &vec : global_traits) {
      for (auto const value : vec) {
        particle_traits.emplace_back(value);
      }
    }
    for (auto const &vec : global_pids) {
      for (auto const value : vec) {
        particle_pids.emplace_back(value);
      }
    }
    // resort data
    auto const pid_begin = std::begin(particle_pids);
    auto const pid_end = std::end(particle_pids);
    for (auto const pid : pids) {
      auto const pid_pos = std::find(pid_begin, pid_end, pid);
      auto const i = static_cast<std::size_t>(std::distance(pid_begin, pid_pos));
      output.emplace_back(particle_traits[i]);
    }
  }
  return output;
}

void print_pids(std::vector<int> const &pids) {
  boost::mpi::communicator comm;
  if (comm.rank() == 0) {
    std::cout << "Looking for mass of particles with id: [";
    for (auto const value : pids) {
      std::cout << value << ", ";
    }
    std::cout << "\b\b]\n";
  }
}

void print_results(std::vector<double> const &output) {
  boost::mpi::communicator comm;
  if (comm.rank() == 0) {
    std::cout << "Gathered particle masses on MPI rank 0: [";
    for (auto const value : output) {
      std::cout << value << ", ";
    }
    std::cout << "\b\b]\n";
  }
}

int main(int argc, char **argv) {
  // create MPI environment
  auto mpi_env = std::make_shared<boost::mpi::environment>(argc, argv);
  boost::mpi::communicator comm;
  // create particles, each MPI rank has a unique set of particles
  populate_particles();
  comm.barrier();
  // list of particle ids for which we need to extract a property
  std::vector<int> pids = {5, 1, 4, 2};
  print_pids(pids);
  auto const output = extract_particle_masses(pids);
  print_results(output);
}

Compilation and execution:

$ mpic++ main.cpp -lboost_mpi -lboost_serialization -Wall -Wextra -g
$ mpiexec -n 4 ./a.out

Expected output:

creating particle with id 0 on MPI rank 0
creating particle with id 1 on MPI rank 0
creating particle with id 4 on MPI rank 2
creating particle with id 5 on MPI rank 2
creating particle with id 6 on MPI rank 3
creating particle with id 7 on MPI rank 3
creating particle with id 2 on MPI rank 1
creating particle with id 3 on MPI rank 1
Looking for mass of particles with id: [5, 1, 4, 2] 
Gathered particle masses on MPI rank 0: [5.5, 1.5, 4.5, 2.5]

@RudolfWeeber
Contributor

To allow for the flexibility I described in the comment above, it might be useful, long term, to split into separate functions

  • the trait gathering based on the selection criterion (e.g., a list of particle ids as in @jngrad's MWE above, but in the future also other criteria such as particle types)
  • the reduction operation which collects the results from all nodes and assembles them into a final result.

The reduction operations should be shared between observables, where possible.
To my understanding, there are the following:

  • none: just assemble the data from all MPI ranks into one big list, as done in the MWE
  • sum: e.g. for total charge, total current (sum of charge*velocity)
  • weighted averages, e.g. by mass:
    • center of mass = 1/sum(masses) * sum(pos*mass)
    • center of mass velocity = 1/sum(masses) * sum(vel*mass)

etc.
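The reduction kinds listed above could be sketched as follows, operating on already-gathered, flattened data (plain C++, no MPI; function names are illustrative, not ESPResSo API):

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// "none": the flattened gather result is already the final answer,
// so no extra reduction function is needed.

// "sum": e.g. total charge, or total current from per-particle charge*velocity.
double reduce_sum(std::vector<double> const &values) {
  return std::accumulate(values.begin(), values.end(), 0.);
}

// weighted average: e.g. center of mass = sum(pos*mass) / sum(mass),
// with the per-particle positions as values and the masses as weights.
double reduce_weighted_average(std::vector<double> const &values,
                               std::vector<double> const &weights) {
  double numerator = 0.;
  double denominator = 0.;
  for (std::size_t i = 0; i < values.size(); ++i) {
    numerator += weights[i] * values[i];
    denominator += weights[i];
  }
  return numerator / denominator;
}
```

Keeping these as free functions would let every observable pick its reduction without duplicating the arithmetic.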

@RudolfWeeber
Contributor

In src/python/espressomd/observables.py, the Observable base class still has _so_creation_policy = "LOCAL", so observables only exist on the head node. As far as I understand, that will have to be removed (global is the default).

@jngrad
Member

jngrad commented May 17, 2023

Here is a minimal working example in ESPResSo: jngrad/espresso@c56df25

@kodiakhq kodiakhq bot closed this as completed in #4748 Jul 24, 2023
kodiakhq bot added a commit that referenced this issue Jul 24, 2023
Fixes #3154

Description of changes:
- make observables run in parallel
   - only send the extracted data via MPI (the original implementation sent the entire particle data structure)
   - remove a major performance bottleneck in parallel simulations
- remove 8 callbacks from the deprecated `MpiCallbacks` framework
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants