Fix V2 API #2288
@@ -324,6 +324,7 @@ class Parameter {
   std::vector<std::shared_ptr<IParameterUpdaterHook>> updaterHooks_;

 public:
+  void setSharedCount(int cnt) { sharedCount_ = cnt; }
I can't find anywhere this newly added method is called. Is it necessary?
This is removed by reyoung in #2271. It's a function used internally.
I see. Are we supposed to run git pull upstream develop && git push here to remove this line?
What I mean is that we still need this function.
It seems that the shared count should be separated from paddle::Parameter. If we support training multiple neural networks (i.e. topologies) in one process, shouldn't GradientMachine maintain the shared count? Each GradientMachine handles one neural network, and a Parameter's shared count is only used inside one neural network.
Moreover, should we remove the backward callback and the pipelining in communication? They give little performance gain but make this logic more complicated.
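It looks like the shared count should be moved out of paddle::Parameter and into GradientMachine. We want to support training multiple neural networks in one process at the same time, and the shared count only relates to a single neural network, so shared_count belongs in GradientMachine.
Going further, should we remove pipelined communication and the callback in backward? This feature brings a small performance gain to Paddle's multi-machine communication, but it adds implementation complexity. Removing it would simplify Paddle's implementation.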
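To make the proposal concrete, here is a minimal sketch of per-network ownership of the shared count. It is written in Python purely for illustration; the classes under discussion are C++, and `Network`, `layers`, and `shared_count` are hypothetical names, not Paddle's API. It also assumes that a parameter's shared count is the number of layers within one network that use that parameter.

```python
class Network:
    """Hypothetical owner of one topology and its per-parameter shared counts."""

    def __init__(self, layers):
        # layers: mapping of layer name -> list of parameter names it uses.
        # The counts live on the network, not on the parameter objects, so two
        # networks in the same process cannot overwrite each other's counts.
        self.shared_count = {}
        for param_names in layers.values():
            for name in param_names:
                self.shared_count[name] = self.shared_count.get(name, 0) + 1


# Two topologies in one process that both use a parameter named "w".
net_a = Network({"fc1": ["w"], "fc2": ["w"], "out": ["b"]})
net_b = Network({"fc": ["w"]})
```

With this layout, `net_a` sees "w" as shared by two layers while `net_b` sees it as used once, which a single counter stored on the parameter could not represent.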
The purpose of this PR is a fix. How about we do the successive edits in future PRs? @reyoung
@@ -3371,7 +3371,7 @@ def Import(config_file, local_args={}):
     return Import


-settings = dict(
+default_settings = dict(
It seems that default_settings is a constant? If so, according to PEP 8, we should name it DEFAULT_SETTINGS.
@@ -34,5 +37,15 @@ def parse_network_config(network_conf, config_arg_str=''):


def parse_optimizer_config(optimizer_conf, config_arg_str=''):
    config = config_parser.parse_config(optimizer_conf, config_arg_str)
    return config.opt_config
    config_parser.settings = copy.deepcopy(config_parser.default_settings)
Given this line, could we just remove L3407?
We cannot do that for backward compatibility.
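For readers following this thread, a minimal sketch of the reset pattern in the hunk above: module-level `settings` is rebuilt from a deep copy of `default_settings` before each parse, so state from an earlier parse does not leak into the next one. Only `settings`, `default_settings`, and `copy.deepcopy` come from the diff; the dictionary keys and the `reset_settings` and `parse` helpers are hypothetical and are not part of config_parser.

```python
import copy

# Hypothetical defaults; the real keys in config_parser are not shown in this PR.
default_settings = dict(batch_size=None, learning_rate=1.0, extras=[])

# The mutable, module-level name that existing code reads and writes.
settings = copy.deepcopy(default_settings)


def reset_settings():
    """Rebuild `settings` from the defaults, discarding earlier mutations."""
    global settings
    # A deep copy is needed because the defaults contain a nested list; a
    # shallow copy would share that list with `default_settings`.
    settings = copy.deepcopy(default_settings)
    return settings


def parse(overrides):
    """Hypothetical parse step that mutates the freshly reset settings."""
    current = reset_settings()
    current.update(overrides)
    current["extras"].append("parsed")
    return current


first = parse({"batch_size": 32})
second = parse({})
```

After the second call, `second["batch_size"]` is back to `None` and `default_settings["extras"]` is still empty, which is the isolation the deep copy is meant to provide.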
Implement the v2 API (mainly parse_network) by cutting the full model configs obtained from config_parser.
This should solve the problems mentioned in #2104, i.e. #2061, #2071, #2065, and #1811.