Add margin rank loss operator #4285
Conversation
auto x2_dims = ctx.Input<framework::Tensor>("X2")->dims();
PADDLE_ENFORCE((label_dims == x1_dims) && (x1_dims == x2_dims) &&
                   (label_dims.size() == 2) && (label_dims[1] == 1),
               "All inputs must be vector with the same size");

- "All inputs must be a vector with the same size."
- If the comment is a complete sentence, please add a period at the end of the sentence.

Done
MarginRankLossOpMaker(framework::OpProto *proto,
                      framework::OpAttrChecker *op_checker)
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddInput("X1", "The first variable to be ranked, row vector.");

- All the comments should follow our conventions by using the (type, default value) usage style: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/name_convention.md
- Personally, I don't prefer "the first/the second". I think you can simply write comments like this (just for example): the input 2-D tensor with shape [N x 1], where N is the batch size. In pairwise ranking, X1 is a score for an individual item.

Done
                      framework::OpAttrChecker *op_checker)
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddInput("X1", "The first variable to be ranked, row vector.");
  AddInput("X2", "The second variable to be ranked, row vector.");

Please refine the comments as above.

Done
  AddInput("X2", "The second variable to be ranked, row vector.");
  AddInput("Label",
           "The label indicating X1 ranked higher than X2 "
           "or not, row vector.");

- "a row vector": I think a better way is like this: A 2-D tensor with shape [N x 1]. (N has already been explained above in X1.)
- Please do not forget the article.
- Please add a NOTE: the label can only be +1 or -1. (If I understand right.)

Done
MarginRankLoss operator measures the loss given a pair of input {`X1`, `X2`}
and the `Label` with attribute `margin`, where `Label = 1` indicating X1 is
ranked higher than `X2`, otherwise `Label = -1`. The loss turns out

MarginRankLoss operator measures the loss given a pair of input {`X1`, `X2`} and the `Label` with a margin, where `Label = 1` indicates X1 is ranked higher than `X2`, otherwise `Label = -1`.
The attribute `margin` helps to make predictions more robust. If the negative item's prediction exceeds that of the positive item plus a margin, then it contributes to the final loss; otherwise, it does not.

Done
and the `Label` with attribute `margin`, where `Label = 1` indicating X1 is
ranked higher than `X2`, otherwise `Label = -1`. The loss turns out

loss(X1, X2, Label) = max(0, -Label * (X1 - X2) + margin)

From this equation, I think you should add "The label can only be +1 or -1" into the comments of "Label".

Done
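The formula quoted above is easy to check with a small standalone sketch (illustrative only, not the Paddle kernel; the function name is hypothetical):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// loss(X1, X2, Label) = max(0, -Label * (X1 - X2) + margin),
// applied element-wise; each Label entry is assumed to be +1 or -1.
std::vector<float> MarginRankLoss(const std::vector<float>& x1,
                                  const std::vector<float>& x2,
                                  const std::vector<float>& label,
                                  float margin) {
  std::vector<float> out(x1.size());
  for (std::size_t i = 0; i < x1.size(); ++i) {
    out[i] = std::max(0.0f, -label[i] * (x1[i] - x2[i]) + margin);
  }
  return out;
}
```

With `margin = 0.1`, a correctly ordered pair (`x1 = 0.9`, `x2 = 0.4`, `label = +1`) gives zero loss, while `label = -1` on the same pair gives a loss of about 0.6. This also shows why the label must be +1 or -1: the label only flips the sign of the score difference.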
           "Intermediate tensor to indicate whether each element of "
           "Output(Out) is activated.")
      .AsIntermediate();
  AddOutput("Out", "The output loss of MarginRankLoss operator");

Please fix the doc by following the (type, default value) usage style.

Done
           "or not, row vector.");
  AddAttr<AttrType>("margin", "Margin for MarginRankLossOp, scalar.")
      .SetDefault(0);
  AddOutput("Activated",

Please fix the doc by following the (type, default value) usage style.

Done
PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("Label"),
                        "Input(Label) shouldn't be null");
PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("X1"), "Input(X1) shouldn't be null");
PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("X2"), "Input(X2) shouldn't be null");

Should we also check here that the output Var "Out" is not null?

Done
               (label_dims.size() == 2) && (label_dims[1] == 1),
               "All inputs must be vector with the same size");
ctx.Output<framework::LoDTensor>("Activated")->Resize(label_dims);
ctx.Output<framework::LoDTensor>("Out")->Resize(label_dims);

Should we add the following code here? I am not sure, because for this operator the inputs X1, X2 and the output are always non-sequence. In this case, is the code below still necessary? @qingqing01

ctx.ShareLoD("X1", /*->*/ "Out");

Maybe not necessary here.
               (label_dims.size() == 2) && (label_dims[1] == 1),
               "All inputs must be vector with the same size");
ctx.Output<framework::LoDTensor>("Activated")->Resize(label_dims);
ctx.Output<framework::LoDTensor>("Out")->Resize(label_dims);

Make sure `Activated` and `Out` are not nullptr before resizing them.

Done
MarginRankLossOpMaker(framework::OpProto *proto,
                      framework::OpAttrChecker *op_checker)
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddInput("X1", "The first variable to be ranked, row vector.");

They are row vectors? Not column vectors?

A mistake, corrected.
  AddInput("Label",
           "The label indicating X1 ranked higher than X2 "
           "or not, row vector.");
  AddAttr<AttrType>("margin", "Margin for MarginRankLossOp, scalar.")

In class `MarginRankLossKernel`, we can see that `AttrType` should be consistent with `T`. So maybe using `T` directly is a better practice?

Done
           "The label indicating X1 ranked higher than X2 "
           "or not, row vector.");
  AddAttr<AttrType>("margin", "Margin for MarginRankLossOp, scalar.")
      .SetDefault(0);

`0` should be cast to `AttrType` first.

Done
  return static_cast<T>(0);
} else {
  return val;
}

return val < 0 ? static_cast<T>(0) : val;

Done
} else {
  return static_cast<T>(0);
}

return static_cast<T>(val > 0 ? 1 : 0);

Done
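The two one-liners suggested above amount to a ReLU (for the forward loss) and a step function (recording which elements are "activated", for the backward pass). A minimal sketch with hypothetical names:

```cpp
// Forward helper: clip negative values to zero, as in the suggestion above.
template <typename T>
T ReLU(T val) {
  return val < 0 ? static_cast<T>(0) : val;
}

// Backward helper: 1 where the loss term is positive ("activated"),
// 0 elsewhere, so the gradient only flows through active pairs.
template <typename T>
T Heaviside(T val) {
  return static_cast<T>(val > 0 ? 1 : 0);
}
```

The ternary forms are equivalent to the original if/else blocks, just shorter; behavior at exactly zero is unchanged (ReLU returns 0, the step function returns 0).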
@lcy-seso @Canpio Thanks for all the comments. Updated this operator, please continue to review.
Please do not forget to merge the latest develop branch.
               (label_dims.size() == 2) && (label_dims[1] == 1),
               "All inputs must be vector with the same size.");
auto act_t = ctx.Output<framework::LoDTensor>("Activated");
auto out_t = ctx.Output<framework::LoDTensor>("Out");

The `InferShape` interface has been changed in the latest develop branch, please do not forget to update the code.

Done
         "X2 is the score for another item to be ranked.");
AddInput("Label",
         "(2-D tensor with shape [batch_size x 1]) "
         "The label indicating X1 ranked higher than X2 or not, "

"The label indicating X1 should be ranked higher than X2 or not, "

`should` is not needed, for the label comes from training data.
loss(X1, X2, Label) = max(0, -Label * (X1 - X2) + margin)

The attribute `margin` involved here helps make the predictions more robust.
Only when the difference between `X1` and `X2` is greater than `margin`, it is

How to measure the difference between `x1` and `x2`? They are two instances.

Modified the doc. Here X1 and X2 stand for the scores of the two items.

Updated the doc.
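A couple of numbers make the robustness point concrete (a hypothetical sketch; `PairLoss` is not part of the operator):

```cpp
#include <algorithm>

// With Label = +1 (X1 should score higher), a pair only reaches zero loss
// once the score gap X1 - X2 exceeds the margin.
float PairLoss(float x1, float x2, float label, float margin) {
  return std::max(0.0f, -label * (x1 - x2) + margin);
}
```

For a score gap of 0.2 with `label = +1`: `margin = 0` gives zero loss, while `margin = 0.5` still gives a loss of about 0.3, pushing the model to separate the two scores by more than the margin.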
LGTM
Resolve #4234