Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Too many connections to MUSIC output proxies created when NEST runs MPI-parallel #1974

Closed
heplesser opened this issue Mar 10, 2021 · 4 comments · Fixed by #1983
Closed

Too many connections to MUSIC output proxies created when NEST runs MPI-parallel #1974

heplesser opened this issue Mar 10, 2021 · 4 comments · Fixed by #1983
Assignees
Labels
I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) S: High Should be handled next stale Automatic marker for inactivity, please have another look here T: Bug Wrong statements in the code or documentation

Comments

@heplesser
Copy link
Contributor

heplesser commented Mar 10, 2021

The following code should connect each parrot neuron exactly once to the MUSIC output proxy:

/nrns /parrot_neuron 2 Create def
/out /music_event_out_proxy << /port_name (out_spikes) >> Create def
nrns out Connect

Running the code on a single process, this works fine, and the following code

Rank 0 eq { (                         Src Tgt Prt Rcpt) = } if
(Rank: ) Rank cvs join ( --> Connections: ) join
<< >> GetConnections GetStatus { [[ /source /target /port /receptor ]] get } Map pcvs join =

prints this output:

                         Src Tgt Prt Rcpt
Rank: 0 --> Connections: [[1 3 0 0] [2 3 0 0]]

This is the same output we get if out were a spike recorder.

Running on two MPI processes, one would expect that each rank has one connection, from the neuron it holds to the local device. But we get two connections on each rank:

                         Src Tgt Prt Rcpt
Rank: 0 --> Connections: [[2 3 0 0] [2 3 1 0]]
Rank: 1 --> Connections: [[1 3 0 0] [1 3 1 0]]

If out is a spike recorder, we get one connection per rank as expected.

The consequence of this is that spikes are sent "double up". The example here leaves out the setting of music_channel to keep the reproducer minimal; it only affects the value of rport.

This bug must have existed for a while and most likely affects all MUSIC out proxies; I am not sure about MUSIC in proxies.

@heplesser heplesser added T: Bug Wrong statements in the code or documentation S: High Should be handled next I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) labels Mar 10, 2021
@heplesser
Copy link
Contributor Author

I believe the problem is located in ConnectionManager::connection_required(), specifically at

if ( target->local_receiver() )
{
// Connections to nodes with one node per process (MUSIC proxies
// or similar devices) have to be established by the thread of the
// target if the source is on the local process even though the
// source may be a proxy on tid.
if ( target->one_node_per_process() and source->has_proxies() )
{
return CONNECT_TO_DEVICE;
}

As a brute-force solution, I have experimented with the following fix:

  if ( target->local_receiver() )
  {
    if ( target->one_node_per_process() )
    {
      if ( kernel().node_manager.is_local_node( source ))
      {
        return CONNECT_TO_DEVICE;
      }
      else
      {
      return NO_CONNECTION;
      }
    }

It seems to solve the problem above, but I have not checked for side effects.

@heplesser
Copy link
Contributor Author

@mdjurfeldt Just an alert about this bug I found, in case MUSIC users contact you about it.

@jougs
Copy link
Contributor

jougs commented Mar 17, 2021

Your fix looks reasonable to me and I can't think of a bad side-effect. Could you please send a PR? Thanks!

heplesser added a commit to heplesser/nest-simulator that referenced this issue Mar 18, 2021
@heplesser heplesser assigned heplesser and unassigned jougs and stinebuu Jun 30, 2021
@heplesser heplesser added this to the NEST 3.2 milestone Jun 30, 2021
@github-actions
Copy link

github-actions bot commented Sep 3, 2021

Issue automatically marked stale!

@github-actions github-actions bot added the stale Automatic marker for inactivity, please have another look here label Sep 3, 2021
@terhorstd terhorstd removed this from the NEST 3.2 milestone Jan 17, 2022
heplesser pushed a commit to heplesser/nest-simulator that referenced this issue Apr 21, 2022
heplesser added a commit to heplesser/nest-simulator that referenced this issue Apr 22, 2022
@jessica-mitchell jessica-mitchell moved this to Done in Kernel Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) S: High Should be handled next stale Automatic marker for inactivity, please have another look here T: Bug Wrong statements in the code or documentation
Projects
Status: Done
4 participants