[Loss&Test] Implement KLD Loss & Add Unit Test on loss layers @open sesame 10/28 13:48 #2757
Conversation
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2757. Please submit with a 1 commit/1 PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before starting a review process from reviewers. If you are a new member joining this project, please read the manuals in the documentation folder and wiki page. In order to monitor the progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.
@DonghakPark, 💯 All CI checkers are successfully verified. Thanks.
There is a typo in 'implemented'.
mu.pow(2.0f, temp);                    // 1. temp = mu ^ 2
log_std.subtract(temp, before_sum);    // 2. before_sum = log_std - temp
log_std.apply<float>(expf, temp);      // 3. temp = exp(log_std) - 1
temp.subtract_i(1.0f);
before_sum.subtract_i(temp);           // 4. before_sum = before_sum - temp
before_sum.sum({1, 2, 3}, ret, -0.5);  // 5. sum * 0.5
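For readers following these steps, here is a minimal NumPy sketch of what this forwarding pass computes; it is an illustration under the assumption that mu and log_std are the mean and log-variance tensors with a leading batch dimension, not the nntrainer API itself:

import numpy as np

def kld_forward_sketch(mu, log_std):
    # mirrors steps 1-5 of the quoted code:
    # -0.5 * sum(1 + log_std - mu^2 - exp(log_std)) over the non-batch axes
    before_sum = log_std - mu ** 2                  # steps 1-2
    before_sum -= np.exp(log_std) - 1.0             # steps 3-4
    return -0.5 * before_sum.sum(axis=(1, 2, 3))    # step 5 (note the scale is -0.5, not 0.5)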
I don't fully understand the forward implementation of KLD Loss in commit 3.
Are the written formulas and actual implementations the same?
I would like to know the difference between the formula on line 32 of kld_loss_layer.cpp and PyTorch's KLD formula added to the PR.
In PyTorch, it is summarized as follows:
if not log_target:  # default
    loss_pointwise = target * (target.log() - input)
else:
    loss_pointwise = target.exp() * (target - input)
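Both branches can be checked against torch.nn.functional.kl_div directly. A minimal sketch, assuming a recent PyTorch where kl_div supports the log_target argument (tensor names are illustrative only):

import torch
import torch.nn.functional as F

target = torch.softmax(torch.randn(4, 8), dim=1)          # P
input_log = torch.log_softmax(torch.randn(4, 8), dim=1)   # log Q; the input is expected in log space

# log_target=False (default): target * (target.log() - input)
manual = (target * (target.log() - input_log)).sum()
print(torch.allclose(manual, F.kl_div(input_log, target, reduction='sum')))  # True

# log_target=True: target.exp() * (target - input)
log_tgt = target.log()
manual_lt = (log_tgt.exp() * (log_tgt - input_log)).sum()
print(torch.allclose(manual_lt, F.kl_div(input_log, log_tgt, reduction='sum', log_target=True)))  # True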
I should have provided more detailed explanations. Upon reviewing your feedback, it seems necessary to align with other frameworks' usage. Thank you for bringing this to my attention.
Eunju provided additional explanation about the logic in the comment below.
I think it would be great if we made a decision with our team members based on that opinion!
mu.pow(2.0f, temp);                    // 1. temp = mu ^ 2
log_std.subtract(temp, before_sum);    // 2. before_sum = log_std - temp
log_std.apply<float>(expf, temp);      // 3. temp = exp(log_std) - 1
temp.subtract_i(1.0f);
before_sum.subtract_i(temp);           // 4. before_sum = before_sum - temp
before_sum.sum({1, 2, 3}, ret, -0.5);  // 5. sum * 0.5
The comment says "sum * 0.5".
However, in the code, "-0.5" is used.
Is this correct?
Nice catch! -0.5 is correct; in the code I apply -0.5.
- void KLDLossLayer::forwarding(RunLayerContext &context, bool training) {}
+ void KLDLossLayer::forwarding(RunLayerContext &context, bool training) {
+   // -0.5 * sum(1 + log_std - pow(mu, 2) - exp(log_std))
If it is the same as torch's KLD formula, it would be great if you could write down the derivation process of the formula like in the comment for the calcDerivative function.
Okay, I will fix it.
- void KLDLossLayer::forwarding(RunLayerContext &context, bool training) {}
+ void KLDLossLayer::forwarding(RunLayerContext &context, bool training) {
+   // -0.5 * sum(1 + log_std - pow(mu, 2) - exp(log_std))
I agree with Hyungjun. Torch's KLD loss has its own assumptions: (1) it is the reverse KL divergence, and (2) it assumes the value from the model is already in log space.
For general KL divergence,
# KL(P||Q) = \sum { P (log P - log Q) }
(P * (P / Q).log()).sum()
How to use the KLD loss of PyTorch:
# in general, kld loss is used in the form of
torch.nn.functional.kl_div(Q.log(), P, None, None, 'sum')
Like this, we need to clarify the assumptions on how to use this loss function, including the input definition. In this update, this kind of explanation is missing; in order to review this PR correctly, some explanation is required.
Also, I couldn't understand why you used the notation mu and log_std; it is not equivalent to the general KL loss, I think. Do you assume a KLD loss for Gaussians? (e.g., the VAE KL loss; it seems this KLD loss is a specific one that computes the KLD between a normal distribution and the standard normal)
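For reference, the Gaussian special case mentioned here has a closed form. A sketch of the derivation, under the assumption that log_std in the code denotes the log-variance log(sigma^2):

\[
\mathrm{KL}\big(\mathcal{N}(\mu,\sigma^2)\,\|\,\mathcal{N}(0,1)\big)
= \mathbb{E}_{x\sim\mathcal{N}(\mu,\sigma^2)}\big[\log\mathcal{N}(x;\mu,\sigma^2)-\log\mathcal{N}(x;0,1)\big]
= \tfrac{1}{2}\big(\mu^2+\sigma^2-\log\sigma^2-1\big)
= -\tfrac{1}{2}\big(1+\log\sigma^2-\mu^2-e^{\log\sigma^2}\big)
\]

Summed over the latent dimensions, this matches the comment "-0.5 * sum(1 + log_std - pow(mu, 2) - exp(log_std))" in the forwarding code, but not the general KL(P||Q) between two arbitrary distributions.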
Thank you for your detailed review. Based on the formula below,
https://stackoverflow.com/questions/74865368/kl-divergence-loss-equation
I have attached the Torch formula to assist in understanding. As Hyungjun and Eunju mentioned, it seems necessary to discuss which assumptions we need in order to implement this.
Thank you for your detailed explanation. As I mentioned in the comment, however, it seems the forwarding of the KLD loss only covers the special case of the KL divergence between a Gaussian and the standard normal.
The link you shared refers to the code of a page that considers the special case of the KL loss for training a VAE (here). I recommend you consider updating this to make our KLD loss cover the more general form :)
Oh, I see.
I will update the KLD loss to cover more general cases.
Force-pushed from 2f24a4e to 3475994.
cibot: @DonghakPark, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2757-202410170844270.1929669380188-3475994cecf485dfba4d34fa1aeab25e0bb0695e/.
@DonghakPark, 💯 All CI checkers are successfully verified. Thanks.
Thank you for considering my previous comment.
Please check some suggestions below :)
/**
 * 1. Output = label / predicted
 * 2. Output = output * label
 * 3. Output = log(output)
 * 4. Output = sum(output)
 */
label.divide(predicted, temp);
temp.multiply_i(label);
temp.apply<float>(logf, temp);
output.fill(temp.sum({0, 1, 2, 3}));
The code seems like (P * (P / Q)).log(), which is different from P * (P/Q).log().
Please change the code like this:
Suggested change:

/**
 * 1. Output = label / predicted
 * 2. Output = log(Output)
 * 3. Output = Output * label
 * 4. Output = sum(output)
 */
label.divide(predicted, temp);
temp.apply<float>(logf, temp);
temp.multiply_i(label);
output.fill(temp.sum({0, 1, 2, 3}));
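To see why the order of log and multiply matters, here is a minimal sketch, assuming PyTorch; P and Q are illustrative probability tensors, not names from the PR:

import torch
import torch.nn.functional as F

P = torch.softmax(torch.randn(3, 5), dim=1)  # label
Q = torch.softmax(torch.randn(3, 5), dim=1)  # predicted

original_order = (P * (P / Q)).log().sum()    # log applied after the multiply, as in the original code
suggested_order = (P * (P / Q).log()).sum()   # KL(P||Q) = sum P * log(P / Q)

print(torch.allclose(suggested_order, F.kl_div(Q.log(), P, reduction='sum')))  # True
print(torch.allclose(original_order, suggested_order))                         # generally False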
Thank you for the check!
label.divide_i(predicted);
deriv.fill(label);
I think it is just a matter of coding style, but here is a simple suggestion from my side:
Suggested change:

label.divide(predicted, deriv);
Good point, I will apply your suggestion.
Force-pushed from b3e0d9a to 799f131.
cibot: @DonghakPark, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2757-202410231736100.5796000957489-799f131355f83f2286d6282d4c341211a970fac7/.
LGTM!
@DonghakPark, 💯 All CI checkers are successfully verified. Thanks.
I think it's good to go!
Currently, there is no unit test for the KLD loss. Therefore, I am adding a new unit test to increase code coverage.
- use the pre-defined unit test (LayerSemanticsParamType)
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>

In the case of NNTrainer, there are many instances where one function works in response to another function, so we have created integration tests for unit testing.
- There are exceptional cases depending on the activation when it comes to losses.
- For instance, with BN (Batch Normalization), you can compare accuracy after conducting several epochs of testing.
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>

Implement KLD Loss on NNTrainer.
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>

Apply coding style with clang-format (unittest files).
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>

Implement Finalize on KLD Loss.
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>

Update the KLD Loss function to reflect the review.
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>
@DonghakPark, 💯 All CI checkers are successfully verified. Thanks.
@DonghakPark, 💯 All CI checkers are successfully verified. Thanks.
LGTM
Great work!
In this PR, I performed three tasks as follows:
commit 1 - [UnitTest] Add kld-loss unit test
commit 2 - [UnitTest] Add integration_tests on unit test
commit 3 - [Loss] Implement KLD Loss Layer
Self evaluation:
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK [email protected]