Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve get_connections performance #490

Merged
merged 6 commits into from
Oct 3, 2016

Conversation

alexeyshusharin
Copy link
Contributor

Hi All,

I found that GetConnections is quite slow and consumes unreasonable amount of memory. Quick investigation has shown that ConnectionManager::get_connections pre-allocates memory for all connections in the model even if user requests connections of a single node. It reserves space only for Token (16 bytes), but that become megabytes in large models. In addition the memory consumption is doubled by similar per-thread reservation. Obviously it also decreases the performance.

This PR improves get_connections by:

  • Using stl deque for collecting connection data. Only final result is converted to ArrayDatum. This allows to avoid memory pre-allocation.
  • Removing per-thread connection number calculation since reservation is not required anymore.
  • Skipping iteration through thread connections if there are no connections with specified syn_id.

The simple test shows significant improvement of get_connection performance. It's about 118 times faster for requsting connections of a single source node. 55 times faster for a single target node. And 50% faster for requesting all connections (BTW, most time is spent outside c++ get_connection method).

The Python script measures average time of GetConnection call duration. It was run on my desktop with enabled OpenMP (16 threads).

Configuration:
Num of source nodes = 2500
Num of target nodes = 2500
Num of connections per node = 701
Num of all connections = 1757571
Master
------------------------------------------
Average time for getting connection is
per source =  0:00:00.463445
per target =  0:00:00.586650
all        =  0:00:01.772015
deque based
------------------------------------------
Average time for getting connection is
per source =  0:00:00.003906
per target =  0:00:00.010477
all        =  0:00:01.171967

@heplesser
Copy link
Contributor

@jougs Could you take a look?

Copy link
Contributor

@jougs jougs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexeyshusharin: many thanks for this valuable contribution! I had a look at the code and it looks sound throughout.

Copy link
Contributor

@heplesser heplesser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexeyshusharin I agree with @jougs that your code is a very useful contribution to NEST. It is almost perfect, but would benefit from a minor refactoring, see my inline comment.

Before integrating your code, we would need a signed contributor license agreement, which you can find at http://nest.github.io/nest-simulator/NEST_Contributor_Agreement.pdf . For more information, see towards the bottom of http://nest.github.io/nest-simulator/

@@ -1122,7 +1116,13 @@ nest::ConnectionManager::get_connections( ArrayDatum& connectome,
#ifdef _OPENMP
#pragma omp critical( get_connections )
#endif
connectome.append_move( conns_in_thread );
{
while ( !conns_in_thread.empty() )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexeyshusharin This code repeats two more times below, so I would suggest to refactor it into a method.

Two more details:
In newly written NEST code, logical operators (!, &&, ÌI) should be written out asnot,and,or` for greater clarity.

I am also wondering if std::copy() wouldn't be faster and more expressive, although it would come at the price of temporarily having two copies of the connectivity data. But since the doubling is per-thread only, I think that would not be too bad.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heplesser, @jougs Thanks for review!

In newly written NEST code, logical operators (!, &&, ÌI) should be written out asnot,and,or` for greater clarity.

Sure, will change it.

I am also wondering if std::copy() wouldn't be faster and more expressive, although it would come at the price of temporarily having two copies of the connectivity data. But since the doubling is per-thread only, I think that would not be too bad.

Actually I did use std::copy() in the first version of the changes and I didn't notice any performance benefits of using it. Maybe it's faster but considering the whole method execution time it's not significant. So I decided that memory is more important than just nicely looking code. If you agree I will move this code to a separate method as you suggested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexeyshusharin Fine! Moving the code to a method with an informative name will make the code as readable as with std::copy() and save a bit of memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heplesser Done!

Copy link
Contributor

@heplesser heplesser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexeyshusharin Nice work, I just have two more small suggestions.

conns_in_thread.pop_front();
}
}
append_move_connections_( conns_in_thread, connectome );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine in principle, but the data flow is not clear to the reader. If the method returned the reference to the connectome, one could write

connectome = append_from_thread( connectome, conns_in_thread );

In my eyes that would be much easier to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heplesser Thanks for the comments!
Maybe you also like something like this. Looks good from my point of view. :)

static inline std::deque< nest::ConnectionID >&
operator<<( std::deque< nest::ConnectionID >& out,
  std::deque< nest::ConnectionID >& in )
{
  while ( not in.empty() )
  {
    out.push_back( in.front() );
    in.pop_front();
  }

  return out;
}

And then just write
connectome << conns_in_thread;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexeyshusharin Beautiful!

@@ -393,6 +400,17 @@ ConnectionManager::get_max_delay() const
return max_delay_;
}

inline void
ConnectionManager::append_move_connections_( std::deque< ConnectionID >& input,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inlining makes sense here, but the code should go into the cpp-file, since the method is used only there. In that way, changes to the code in the method won't require recompilation of all files including connection_manager.h.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I, personally, prefer to implement it as a static inline method in cpp file (local for cpp). But grep gave me just one place in the code where this approach is used. So I decided that this is not common practice for nest and made it as a class member.
Do you think it's ok to declare this method as static and place it just before get_connections in cpp file? Or do you think that moving implementation to cpp and leaving declaration in hpp is better approach?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexeyshusharin I suggest to replace the function with operator<< as you suggested, that is just so much more beautiful.

When it comes to where and how to define the operator, I am a bit torn between a plain function and a private static method. But since the operator does not access or modify any data members of ConnectionManager, I would suggest defining it as a plain function outside the ConnectionManager class and placing that definition in the cpp file, since the operator is only used there. The definition then serves as declaration as well, so nothing needs to go into the header file.

Copy link
Contributor

@heplesser heplesser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexeyshusharin Looks good!

@heplesser heplesser merged commit 9a4f3c4 into nest:master Oct 3, 2016
@alexeyshusharin alexeyshusharin deleted the fix_get_connection branch October 3, 2016 22:27
jakobj added a commit to jakobj/nest-simulator that referenced this pull request Mar 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants