add custom init grad for backward function #31540

Merged

Changes from all commits (23 commits)
d0915f8
add custom init grad for backward function
MingMingShangTian Mar 11, 2021
0bccce6
add custom init grad for backward function
MingMingShangTian Mar 11, 2021
5dac8e9
handle when the grad_tensor is none
MingMingShangTian Mar 12, 2021
ef4c7b9
handle when the grad_tensor is none
MingMingShangTian Mar 12, 2021
33b0416
fix the args type error on windows platform
MingMingShangTian Mar 15, 2021
837e26b
modify the args order and doc
MingMingShangTian Mar 15, 2021
1901970
format code
MingMingShangTian Mar 15, 2021
55e0cfb
add grad_tensor to xpu
MingMingShangTian Mar 15, 2021
8271dc0
modify the grad_tensor type check
MingMingShangTian Mar 16, 2021
5af3bd0
add paddle.backward api to support multi tensors gradient compute
MingMingShangTian Mar 18, 2021
1467feb
add paddle.backward api to support multi tensors gradient compute
MingMingShangTian Mar 18, 2021
eb267fa
add paddle.autograd module and backward api
MingMingShangTian Mar 19, 2021
b80f449
Merge branch 'develop' into custom_staring_grad
MingMingShangTian Mar 23, 2021
2bb8f3c
change tensor.backward func args
MingMingShangTian Mar 23, 2021
41b375f
modify tensor backward api
MingMingShangTian Mar 23, 2021
6974e5c
remove create_graph inputs args
MingMingShangTian Mar 23, 2021
1e3e975
add doc and example code for backward api
MingMingShangTian Mar 24, 2021
c7de011
when have the same tensor, throw error
MingMingShangTian Mar 24, 2021
2f2824c
modify test Init func args
MingMingShangTian Mar 24, 2021
8415df4
modify the execute.Init func args in test files
MingMingShangTian Mar 24, 2021
be065e4
add paddle.autograd package in setup.py.in
MingMingShangTian Mar 24, 2021
7f8e58c
modify error msg, remove _run_backward method in class Tensor
MingMingShangTian Mar 29, 2021
0374c0b
add test cases for backward api
MingMingShangTian Mar 30, 2021
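
Taken together, these commits let backward start from a caller-supplied gradient instead of the implicit all-ones tensor, and they add a paddle.autograd backward entry point that can run backward from several tensors at once. A minimal usage sketch for the single-tensor case, assuming the final Python signature is Tensor.backward(grad_tensor=None, retain_graph=False) as the commit messages suggest (the exact signature is defined in Python files not fully shown in this diff):

import paddle

x = paddle.to_tensor([1.0, 2.0, 3.0], stop_gradient=False)
y = x * x

# Without grad_tensor the engine seeds backward with a tensor of ones;
# with grad_tensor it copies the supplied tensor into the starting gradient.
y.backward(grad_tensor=paddle.to_tensor([0.1, 0.2, 0.3]))
print(x.grad)  # elementwise 2 * x * grad_tensor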
115 changes: 72 additions & 43 deletions paddle/fluid/imperative/basic_engine.cc
@@ -36,48 +36,73 @@ DECLARE_bool(sort_sum_gradient);
namespace paddle {
namespace imperative {

void BasicEngine::Init(VarBase* var, bool retain_graph) {
void BasicEngine::Init(
const std::vector<std::shared_ptr<VarBase>>& tensors,
const std::vector<std::shared_ptr<VarBase>>& grad_tensors,
bool retain_graph) {
retain_graph_ = retain_graph;
init_node_ = var->GradVarBase()->GradNode();
PADDLE_ENFORCE_EQ(var->GradVarBase()->GraphIsFreed(), false,
platform::errors::Unavailable(
"%s trying to backward through the same graph a second "
"time, but this graph have already been freed. Please "
"specify Tensor.backward(retain_graph=True) when "
"calling backward at the first time.",
var->Name()));

if (!retain_graph) {
VLOG(5) << "Clear the auto-grad graph from grad var " << var->Name()
<< " because of retain_graph=False when calling backward";
var->GradVarBase()->SetGraphIsFreed(true);
var->GradVarBase()->ClearGradNode();
}

if (init_node_ == nullptr || var->OverridedStopGradient()) {
VLOG(3) << "Skip auto grad since there is no grad op for var or loss is "
"stop_gradient=True: "
<< var->Name();
return;
}
PADDLE_ENFORCE_EQ(
tensors.size(), grad_tensors.size(),
platform::errors::Unavailable(
"The size of tensors do not equal the size of grad_tensors,"
"the size of tensors is %s, but the size of grad_tensors is %s.",
tensors.size(), grad_tensors.size()));

for (size_t i = 0; i < tensors.size(); ++i) {
auto var = tensors[i];
auto grad_tensor = grad_tensors[i];

auto init_node = var->GradVarBase()->GradNode();
PADDLE_ENFORCE_EQ(
var->GradVarBase()->GraphIsFreed(), false,
platform::errors::Unavailable(
"%s trying to backward through the same graph a second "
"time, but this graph have already been freed. Please "
"specify Tensor.backward(retain_graph=True) when "
"calling backward at the first time.",
var->Name()));

if (!retain_graph) {
VLOG(5) << "Clear the auto-grad graph from grad var " << var->Name()
<< " because of retain_graph=False when calling backward";
var->GradVarBase()->SetGraphIsFreed(true);
var->GradVarBase()->ClearGradNode();
}

VLOG(3) << "Init first node of backward";
if (init_node == nullptr || var->OverridedStopGradient()) {
VLOG(3) << "Skip auto grad since there is no grad op for var or loss is "
"stop_gradient=True: "
<< var->Name();
continue;
}

PADDLE_ENFORCE_EQ(
var->HasGradVar(), true,
platform::errors::NotFound("Grad variable not exist for variable %s",
var->Name()));

auto& fwd_var = var->Var().Get<framework::LoDTensor>();
auto* grad_var =
var->GradVarBase()->MutableVar()->GetMutable<framework::LoDTensor>();
VLOG(6) << "init loss grad:" << var->GradVarBase()->Name()
<< " as stop_gradient false";
var->GradVarBase()->InnerSetOverridedStopGradient(false);
auto* dev_ctx = platform::DeviceContextPool::Instance().Get(fwd_var.place());
grad_var->Resize(fwd_var.dims());
grad_var->mutable_data(fwd_var.place(), fwd_var.type());
operators::math::set_constant(*dev_ctx, grad_var, 1.0);
VLOG(3) << "Init node of backward";

PADDLE_ENFORCE_EQ(
var->HasGradVar(), true,
platform::errors::NotFound("Tensor %s has no gradient", var->Name()));

auto& fwd_var = var->Var().Get<framework::LoDTensor>();
auto* grad_var =
var->GradVarBase()->MutableVar()->GetMutable<framework::LoDTensor>();
VLOG(6) << "init loss grad:" << var->GradVarBase()->Name()
<< " as stop_gradient false";
var->GradVarBase()->InnerSetOverridedStopGradient(false);
auto* dev_ctx =
platform::DeviceContextPool::Instance().Get(fwd_var.place());
if (grad_tensor == nullptr) {
grad_var->Resize(fwd_var.dims());
grad_var->mutable_data(fwd_var.place(), fwd_var.type());
operators::math::set_constant(*dev_ctx, grad_var, 1.0);
} else {
paddle::framework::TensorCopy(
grad_tensor->Var().Get<framework::LoDTensor>(), fwd_var.place(),
*dev_ctx, grad_var);
}

init_nodes_.push_back(init_node);
}
}

void BasicEngine::CheckBackwardInputs(const OpBase& op) {
@@ -235,8 +260,10 @@ void BasicEngine::PrepareDeps() {
std::queue<GradOpNode*> q;
std::unordered_set<GradOpNode*> visited;

q.push(init_node_.get());
visited.insert(init_node_.get());
for (size_t i = 0; i < init_nodes_.size(); ++i) {
q.push(init_nodes_[i].get());
visited.insert(init_nodes_[i].get());
}

while (!q.empty()) {
auto* cur_node = q.front();
@@ -263,14 +290,16 @@
}

void BasicEngine::Execute() {
if (init_node_ == nullptr) {
if (init_nodes_.empty()) {
return;
}

PrepareDeps();
// Start execute Computation graph
std::queue<std::shared_ptr<GradOpNode>> q;
q.push(std::move(init_node_));
for (size_t i = 0; i < init_nodes_.size(); ++i) {
q.push(std::move(init_nodes_[i]));
}

size_t op_num = 0;

@@ -470,7 +499,7 @@
}

void BasicEngine::Clear() {
init_node_.reset();
init_nodes_.clear();
node_deps_.clear();
accumulators_.clear();
accumulators_with_grad_node_.clear();
6 changes: 4 additions & 2 deletions paddle/fluid/imperative/basic_engine.h
@@ -30,7 +30,9 @@ class OpBase;

class BasicEngine : public Engine {
public:
void Init(VarBase* var, bool retain_graph = false);
void Init(const std::vector<std::shared_ptr<VarBase>>& tensors,
const std::vector<std::shared_ptr<VarBase>>& grad_tensors,
bool retain_graph = false);

void Execute() override;

@@ -46,7 +48,7 @@
void Clear();

private:
std::shared_ptr<GradOpNode> init_node_;
std::vector<std::shared_ptr<GradOpNode>> init_nodes_;
std::unordered_map<GradOpNode*, size_t> node_deps_;
// The input and output of Inplace op are the same. If only `var` is used
// as the key, then the input and output of inplace op must be gradient
8 changes: 6 additions & 2 deletions paddle/fluid/imperative/tests/test_hooks.cc
@@ -93,8 +93,10 @@ TEST(TestHooks, TestGradVarLeafBackwardHook) {
ASSERT_EQ(out->GradVarBase()->GradOpNum(), 1UL);

// 3. backward
std::vector<std::shared_ptr<imperative::VarBase>> tensors{out};
std::vector<std::shared_ptr<imperative::VarBase>> grad_tensors{nullptr};
BasicEngine engine;
engine.Init(out.get());
engine.Init(tensors, grad_tensors);
engine.Execute();

framework::LoDTensor x_grad;
@@ -193,8 +195,10 @@ void GradVarLeafBackwardHookWithGradAccmulatedTest() {
ASSERT_EQ(out->GradVarBase()->GradOpNum(), 1UL);

// 3. backward
std::vector<std::shared_ptr<imperative::VarBase>> tensors{out};
std::vector<std::shared_ptr<imperative::VarBase>> grad_tensors{nullptr};
BasicEngine engine;
engine.Init(out.get());
engine.Init(tensors, grad_tensors);
engine.Execute();

framework::LoDTensor x_grad;
9 changes: 7 additions & 2 deletions paddle/fluid/imperative/tests/test_tracer.cc
@@ -250,7 +250,10 @@ TEST(test_tracer, test_trace_op_with_multi_device_inputs) {
tracer.TraceOp("reduce_sum", reduce_in, reduce_out, reduce_attr_map,
gpu_place, true);
imperative::BasicEngine engine;
engine.Init(reduce_sum_out.get());

std::vector<std::shared_ptr<imperative::VarBase>> tensors{reduce_sum_out};
std::vector<std::shared_ptr<imperative::VarBase>> grad_tensors{nullptr};
engine.Init(tensors, grad_tensors);
engine.Execute();

framework::LoDTensor rlt;
@@ -376,8 +379,10 @@ TEST(test_tracer, test_var_without_grad_var) {
ASSERT_EQ(y_in->GradVarBase()->GradOpNum(), 0UL);
ASSERT_EQ(vout->GradVarBase()->GradOpNum(), 1UL);

std::vector<std::shared_ptr<imperative::VarBase>> tensors{vout};
std::vector<std::shared_ptr<imperative::VarBase>> grad_tensors{nullptr};
imperative::BasicEngine engine;
engine.Init(vout.get());
engine.Init(tensors, grad_tensors);
engine.Execute();

// check the grad
26 changes: 14 additions & 12 deletions paddle/fluid/pybind/imperative.cc
@@ -720,6 +720,7 @@ void BindImperative(py::module *m_ptr) {
Bump the version whenever the Tensor is modified through an inplace operation.
)DOC")
.def("numpy",

[](imperative::VarBase &self) -> py::array {
const auto &tensor =
self.MutableVar()->Get<framework::LoDTensor>();
@@ -918,18 +919,6 @@ void BindImperative(py::module *m_ptr) {
print(x.stop_gradient) # True
print(x.grad) # None
)DOC")
.def("_run_backward",
[](imperative::VarBase &self, const imperative::Tracer &tracer,
bool retain_graph) {
// TODO(jiabin): when we impl more backward execution we can
// select them
auto *engine = tracer.GetEngine();
engine->Init(&self, retain_graph);
VLOG(3) << "Start backward";
engine->Execute();
VLOG(3) << "Finish backward";
},
py::call_guard<py::gil_scoped_release>())
.def("_grad_name", &imperative::VarBase::GradVarName)
.def("_grad_value",
[](imperative::VarBase &self) {
@@ -1412,6 +1401,19 @@
},
py::call_guard<py::gil_scoped_release>());

m.def(
"dygraph_run_backward",
Contributor: This method does not need to be exposed to users; start the method name with an underscore. Keeping _run_backward may still be better.

Contributor Author: Done

Contributor: dygraph_run_backward -> _run_backward. Names without a leading underscore are public APIs, and this API does not need to be exposed to users.

[](const std::vector<std::shared_ptr<imperative::VarBase>> &tensors,
const std::vector<std::shared_ptr<imperative::VarBase>> &grad_tensors,
bool retain_graph, const imperative::Tracer &tracer) {
auto *engine = tracer.GetEngine();
engine->Init(tensors, grad_tensors, retain_graph);
VLOG(3) << "Start backward";
engine->Execute();
VLOG(3) << "Finish backward";
},
py::call_guard<py::gil_scoped_release>());

#if defined(PADDLE_WITH_NCCL) || defined(PADDLE_WITH_RCCL) || \
defined(PADDLE_WITH_XPU_BKCL)
py::class_<imperative::ParallelContext,
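For reference, here is a sketch of how a Python-level wrapper could drive this binding. This is not the PR's backward_mode.py (its body is not shown in this diff); it assumes the dygraph_run_backward name kept above, even though the review comments ask for a leading-underscore name, and it relies on the existing paddle.fluid.framework._dygraph_tracer() helper:

from paddle.fluid import core, framework

def backward(tensors, grad_tensors=None, retain_graph=False):
    # One starting gradient per output tensor; a None entry makes
    # BasicEngine::Init fall back to a tensor of ones (see basic_engine.cc above),
    # while a real tensor is copied into the starting gradient.
    if grad_tensors is None:
        grad_tensors = [None] * len(tensors)
    core.dygraph_run_backward(tensors, grad_tensors, retain_graph,
                              framework._dygraph_tracer())
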
1 change: 1 addition & 0 deletions python/paddle/__init__.py
@@ -44,6 +44,7 @@
import paddle.device
import paddle.regularizer
import paddle.incubate
import paddle.autograd

# TODO: define alias in tensor and framework directory

22 changes: 22 additions & 0 deletions python/paddle/autograd/__init__.py
@@ -0,0 +1,22 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from ..fluid.dygraph.base import grad #DEFINE_ALIAS

from . import backward_mode
from .backward_mode import backward

__all__ = ['grad']
Contributor: backward also needs to be listed in __all__ here.

Contributor Author: The next line, __all__ += backward_mode.__all__, adds backward.


__all__ += backward_mode.__all__
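
An illustrative call of the resulting public API, assuming paddle.autograd.backward(tensors, grad_tensors=None, retain_graph=False) in line with the tensors/grad_tensors pairing enforced by BasicEngine::Init above (the exact Python signature lives in backward_mode.py, which is not part of this excerpt):

import paddle

x = paddle.to_tensor([1.0, 2.0, 3.0], stop_gradient=False)
y = x * x      # dy/dx = 2 * x
z = 3.0 * x    # dz/dx = 3

# Accumulate gradients from both outputs, each weighted by its own
# starting gradient instead of the default tensor of ones.
paddle.autograd.backward(
    [y, z],
    grad_tensors=[paddle.to_tensor([1.0, 1.0, 1.0]),
                  paddle.to_tensor([0.5, 0.5, 0.5])])
print(x.grad)  # per element: 2 * x * 1.0 + 3 * 0.5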