Make Madam work in a MPI environment #204

Merged
merged 11 commits into from
Nov 1, 2022

ziotom78 commented Oct 17, 2022

Because of issue #201, Madam files created in an MPI environment do not contain all the TODs. This PR fixes the problem by properly iterating over all the MPI processes.

The PR is quite large because the task is complex: Madam requires each detector to have its data in distinct files that must be numbered with an increasing counter. Therefore, to make the code work, this PR implements an algorithm that walks over all the MPI processes and counts how many observations in each of them contribute to each detector.
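To illustrate the counting idea, here is a minimal pure-Python sketch. It is not the code in this PR: the function name `first_file_index` and the input layout (one list of observations per rank, each observation being the list of detector names it holds) are assumptions made for the example. In a real MPI run, every process would first need the same global picture, for instance by gathering the per-rank lists with something like mpi4py's `comm.allgather`.

```python
from collections import defaultdict


def first_file_index(obs_per_rank):
    """Return, for each rank, the index of the first file it should write
    for each detector it holds.

    obs_per_rank[rank] is a list of observations; each observation is the
    list of detector names stored in it (hypothetical layout).
    """
    running = defaultdict(int)  # files already assigned, per detector
    result = []
    for rank_obs in obs_per_rank:
        start = {}
        for det_names in rank_obs:
            for name in det_names:
                # The first time this rank meets a detector, remember how
                # many files the lower ranks already claimed for it
                start.setdefault(name, running[name])
                running[name] += 1
        result.append(start)
    return result


# Four processes with two observations each: the first two hold
# detector 0A, the last two hold detector 0B
layout = [
    [["0A"], ["0A"]],
    [["0A"], ["0A"]],
    [["0B"], ["0B"]],
    [["0B"], ["0B"]],
]
starts = first_file_index(layout)
print(starts)
```

With these offsets, each process can number its own files without talking to the others again, and the counter for each detector stays unique and increasing across ranks.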

To make the code clearer to read, and to make litebird_sim easier to debug, I have added a new method to Simulation: describe_mpi_distribution(). Its purpose is to build a «map» of all the observations in every MPI process. This map is defined using the new type MpiDistributionDescr, which can be printed to get a visual representation of the way the TOD was split across observations and processes; here is an example:

# MPI rank #1

## Observation #0
- Start time: 0.0
- Duration: 21600.0 s
- 1 detector(s) (0A)
- TOD shape: 1×216000

## Observation #1
- Start time: 43200.0
- Duration: 21600.0 s
- 1 detector(s) (0A)
- TOD shape: 1×216000

# MPI rank #2

## Observation #0
- Start time: 21600.0
- Duration: 21600.0 s
- 1 detector(s) (0A)
- TOD shape: 1×216000

## Observation #1
- Start time: 64800.0
- Duration: 21600.0 s
- 1 detector(s) (0A)
- TOD shape: 1×216000

# MPI rank #3

## Observation #0
- Start time: 0.0
- Duration: 21600.0 s
- 1 detector(s) (0B)
- TOD shape: 1×216000

## Observation #1
- Start time: 43200.0
- Duration: 21600.0 s
- 1 detector(s) (0B)
- TOD shape: 1×216000

# MPI rank #4

## Observation #0
- Start time: 21600.0
- Duration: 21600.0 s
- 1 detector(s) (0B)
- TOD shape: 1×216000

## Observation #1
- Start time: 64800.0
- Duration: 21600.0 s
- 1 detector(s) (0B)
- TOD shape: 1×216000
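
As a rough idea of how such a report can be produced, here is a small self-contained sketch. The classes `ObsSummary` and `RankSummary` and the function `format_distribution` are hypothetical names invented for this example; they are not the actual definitions of `MpiDistributionDescr` and its ancillary classes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ObsSummary:
    """Hypothetical summary of one observation."""
    start_time: float
    duration_s: float
    det_names: List[str]
    tod_shape: Tuple[int, int]


@dataclass
class RankSummary:
    """Hypothetical summary of the observations held by one MPI process."""
    mpi_rank: int
    observations: List[ObsSummary]


def format_distribution(ranks):
    """Render the summaries in the same layout as the report above."""
    lines = []
    for rank in ranks:
        lines.append(f"# MPI rank #{rank.mpi_rank}")
        lines.append("")
        for idx, obs in enumerate(rank.observations):
            shape = "×".join(str(n) for n in obs.tod_shape)
            lines += [
                f"## Observation #{idx}",
                f"- Start time: {obs.start_time}",
                f"- Duration: {obs.duration_s} s",
                f"- {len(obs.det_names)} detector(s) ({', '.join(obs.det_names)})",
                f"- TOD shape: {shape}",
                "",
            ]
    return "\n".join(lines).rstrip()


report = format_distribution(
    [RankSummary(1, [ObsSummary(0.0, 21600.0, ["0A"], (1, 216000))])]
)
print(report)
```

Keeping the description as plain data and formatting it separately means the same structure can drive both the printed report and the logic that walks over the processes.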

Things to do before merging this PR:

  • Implement MpiDistributionDescr and all ancillary classes
  • Implement describe_mpi_distribution
  • Modify save_simulation_for_madam so that it uses describe_mpi_distribution to properly walk over all the MPI processes
  • Document describe_mpi_distribution and MpiDistributionDescr in the manual

ziotom78 merged commit 597e23d into master Nov 1, 2022
ziotom78 deleted the fix201 branch November 1, 2022 05:19