Enhance device context pool #9293
  namespace paddle {
  namespace platform {

  DeviceContextPool* DeviceContextPool::pool = nullptr;

- const platform::DeviceContext* DeviceContextPool::Get(
-     const platform::Place& place) {
+ platform::DeviceContext* DeviceContextPool::Get(const platform::Place& place) {
platform::DeviceContext* DeviceContextPool::Get(const platform::Place& place) const
@@ -65,6 +65,18 @@ bool is_cpu_place(const Place &);
  bool places_are_same_class(const Place &, const Place &);
  bool is_same_place(const Place &, const Place &);

+ struct PlaceHash {
+   std::size_t operator()(const Place &p) const {
+     constexpr size_t num_dev_bits = 4;
4 bits are not enough: the GPU box product has 32 cards in one node, which would lead to overlapping values of dev_id << num_dev_bits | p.which().
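One way to check this concern is to enumerate the packed keys. The sketch below is not the PR's code: it assumes the key is exactly the expression quoted above, `(dev_id << num_dev_bits) | p.which()`, with `num_dev_bits = 4`, and `PackKey` and `CountDistinctKeys` are hypothetical helper names. Under that assumption the shift leaves the low 4 bits for the `which()` index, so keys stay distinct as long as that index is below 16, and they start to overlap once it reaches 16.

```cpp
#include <cstddef>
#include <unordered_set>

// Assumption: mirrors the packing quoted in the review comment,
// (dev_id << num_dev_bits) | which, with num_dev_bits = 4.
constexpr std::size_t kNumDevBits = 4;

// Hypothetical helper: packs a device id and a variant index into one key.
std::size_t PackKey(std::size_t dev_id, std::size_t which) {
  return (dev_id << kNumDevBits) | which;
}

// Hypothetical helper: counts the distinct keys produced by every
// (dev_id, which) pair with dev_id < num_devices and which < num_kinds.
std::size_t CountDistinctKeys(std::size_t num_devices, std::size_t num_kinds) {
  std::unordered_set<std::size_t> keys;
  for (std::size_t dev = 0; dev < num_devices; ++dev) {
    for (std::size_t which = 0; which < num_kinds; ++which) {
      keys.insert(PackKey(dev, which));
    }
  }
  return keys.size();
}
```

For example, `CountDistinctKeys(32, 2)` yields 64 distinct keys for 32 devices and two place kinds, while `PackKey(0, 16)` and `PackKey(1, 0)` both produce 16, which is the kind of overlap to guard against.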
@@ -159,7 +160,7 @@ class DeviceContextPool {
    }

    /*! \brief Return handle of single device context. */
-   const platform::DeviceContext* Get(const platform::Place& place);
+   platform::DeviceContext* Get(const platform::Place& place);
Should this have a const suffix?
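To illustrate the distinction being discussed, here is a minimal sketch of a lookup that carries a `const` suffix (the lookup does not modify the pool) while returning a pointer without a `const` prefix (the caller may mutate the context). `Context` and `ContextPool` are hypothetical stand-ins keyed by an integer device id, not the Paddle classes.

```cpp
#include <map>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a device context; not the Paddle type.
struct Context {
  int launches = 0;
  void Launch() { ++launches; }  // non-const: mutates the context
};

// Hypothetical stand-in for a context pool keyed by device id.
class ContextPool {
 public:
  explicit ContextPool(int num_devices) {
    for (int id = 0; id < num_devices; ++id) {
      contexts_.emplace(id, std::make_unique<Context>());
    }
  }

  // const suffix: the lookup leaves the pool itself unchanged.
  // No const prefix: the caller receives a mutable context.
  Context* Get(int device_id) const {
    auto it = contexts_.find(device_id);
    if (it == contexts_.end()) {
      throw std::out_of_range("no context for this device id");
    }
    return it->second.get();
  }

 private:
  std::map<int, std::unique_ptr<Context>> contexts_;
};
```

With this shape, `Get` can be called on a `const ContextPool`, and the returned pointer can still be used to call non-const members such as `Launch()`.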
@@ -159,7 +160,7 @@ class DeviceContextPool {
    }

    /*! \brief Return handle of single device context. */
-   const platform::DeviceContext* Get(const platform::Place& place);
+   platform::DeviceContext* Get(const platform::Place& place);

    template <typename Place>
    const typename DefaultDeviceContextType<Place>::TYPE* GetByPlace(
Remove the const prefix.
LGTM.
* commit '9c35b0dc1ba0ace5acf721685802a21045ea1249': (36 commits) Fix dist compile error (PaddlePaddle#9320) Fix bug for backward tanspiler when using parallel_do operator. (PaddlePaddle#9282) update fix transpiler bug Update index_en.rst (PaddlePaddle#9286) "fix mixed_vector bug" (PaddlePaddle#9319) Update index_en.rst (PaddlePaddle#9280) Adjust some contents in write_docs_en.rst for Contribue Documentation (PaddlePaddle#9147) CMake refine for HIP support. Fix CI. Reuduce memory copy when communication between trainer and pserver. (PaddlePaddle#9271) Modified build.sh and remove build_doc.sh fix doc Enhance device context pool (PaddlePaddle#9293) Device blobs are created only in training. Added testing attribute Shrink batch_norm_grad's inputs updates prepare and create op before run wip small fix initial commit ... # Conflicts: # cmake/external/eigen.cmake
No description provided.