
[ LORA ] Update FC Layer to support LoRA's incremental forwarding & batch_size option. #2728

Merged
merged 2 commits into nnstreamer:main on Sep 22, 2024

Conversation

EunjuYang
Contributor

@EunjuYang commented Sep 5, 2024

This pull request (PR) consists of two commits:

  1. Update 'incremental_forwarding' for 'FullyConnectedLayer'.

    • The code now supports multiple batches.
    • The code now works with incremental forwarding when LoRA is enabled (see the sketch after this list).
  2. Fix bugs in 'fc_layer.cpp/h' when handling batch size with LoRA.

    • In the previous version, tensor dimensions used in the LoRA computations did not account for the batch size.
    • The 'setBatch' function has been overridden to correctly update the batch size of these tensors.
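As a rough, self-contained sketch of the idea (assumed names and shapes, not the actual fc_layer.cpp code): incremental forwarding handles one sequence step at a time, and the batch axis is processed with an explicit outer loop, which is also why it is not parallelized across batches.

#include <cstddef>
#include <vector>

// Illustrative sketch only; hidden_step is assumed to already hold input_step * W.
void add_lora_step(std::size_t batch, std::size_t in_dim, std::size_t out_dim,
                   std::size_t rank, float lora_scaling,
                   const std::vector<float> &input_step, // batch x in_dim
                   const std::vector<float> &loraA,      // in_dim x rank
                   const std::vector<float> &loraB,      // rank x out_dim
                   std::vector<float> &hidden_step) {    // batch x out_dim
  std::vector<float> tmp(rank);
  for (std::size_t b = 0; b < batch; ++b) {   // per-sample loop over the batch axis
    const float *x = &input_step[b * in_dim];
    float *h = &hidden_step[b * out_dim];
    for (std::size_t r = 0; r < rank; ++r) {  // tmp = x * loraA
      float acc = 0.0f;
      for (std::size_t i = 0; i < in_dim; ++i)
        acc += x[i] * loraA[i * rank + r];
      tmp[r] = acc;
    }
    for (std::size_t o = 0; o < out_dim; ++o) { // h += lora_scaling * (tmp * loraB)
      float acc = 0.0f;
      for (std::size_t r = 0; r < rank; ++r)
        acc += tmp[r] * loraB[r * out_dim + o];
      h[o] += lora_scaling * acc;
    }
  }
}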

Self evaluation:
Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

- This commit adds code to support LoRA in incremental_forwarding.
- This commit updates incremental_forwarding to support multi-batch
input. However, this is not the ideal approach, since it cannot be
parallelized across the batch axis. I left a note on this issue in a code comment.

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>
@taos-ci
Collaborator

taos-ci commented Sep 5, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2728. Please follow the 1 commit/1 PR (one commit per PR) policy to get comments from reviewers quickly. Your PR must pass all verification processes of cibot before reviewers start the review process. If you are a new member joining this project, please read the manuals in the documentation folder and wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.

Comment on lines +275 to +278
input_step.dot(loraA, hidden_tmp_lora, false, false);
hidden_tmp_lora.dot(loraB, hidden_out_lora, false, false);
hidden_out_lora.multiply_i(lora_scaling);
hidden_step.add_i(hidden_out_lora);
Member

@skykongkong8 commented Sep 5, 2024


Pure question:
Is there any chance that lora_scaling could be defined as a scalar?
Although it is not supported in nntrainer yet, if it is a scalar,
then we could do something like:

hidden_tmp_lora.dot(loraB, hidden_out_lora, false, false, lora_scaling /*alpha*/, 0 /*beta*/);

in the optimized GEMM case, computing everything as a fused op.
(Or, even if it is a vector, I can implement a fused op for optimal performance.)

To do so,
we would need to add parameters like alpha and beta to Tensor::dot() -> AFAIK the current nntrainer Tensor::dot supports beta only...

FYI)
a normal GEMM can be defined as:

$$C = \alpha A B + \beta C$$
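For context, a plain standalone illustration of what the fused form computes (raw-array C++, a sketch rather than nntrainer's actual Tensor::dot): applying alpha inside the GEMM removes the separate scaling pass over the output.

#include <cstddef>
#include <vector>

// GEMM with the usual alpha/beta semantics: C = alpha * A * B + beta * C.
void gemm(std::size_t M, std::size_t K, std::size_t N, float alpha,
          const std::vector<float> &A,          // M x K, row-major
          const std::vector<float> &B,          // K x N, row-major
          float beta, std::vector<float> &C) {  // M x N, row-major
  for (std::size_t m = 0; m < M; ++m)
    for (std::size_t n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < K; ++k)
        acc += A[m * K + k] * B[k * N + n];
      // Scaling is applied once per output element inside the GEMM,
      // so no extra multiply pass over C is needed afterwards.
      C[m * N + n] = alpha * acc + beta * C[m * N + n];
    }
}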

Contributor Author


Thank you for asking!
The lora_scaling is nothing but $$\frac{\alpha}{r}$$, which is a scalar (please refer to https://arxiv.org/abs/2106.09685, Section 4.1). It is used to make it easier to search for the best rank $$r$$.
You're asking because there's a chance we might need a general GEMM, right? That could be one of the cases where the generalized GEMM is used, but I'm not certain it will be actively utilized.

e.g., the purpose of lora_scaling is to keep the hyper-parameter consistent across different LoRA ranks. But I'm not sure it will be used in on-device training (the rank will likely be fixed, or the best rank will be explored on the PC side).
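For reference, the standard LoRA forward pass from the paper, with the scaling written out explicitly (formula only, not code from this PR):

$$h = W_0 x + \frac{\alpha}{r} B A x, \qquad \texttt{lora\_scaling} = \frac{\alpha}{r}$$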

Collaborator

@taos-ci left a comment


@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

Contributor

@djeong20 left a comment


nice work! LGTM

Contributor

@lhs8928 left a comment


Overall LGTM except for a minor comment.

@@ -148,7 +159,7 @@ void FullyConnectedLayer::finalize(InitLayerContext &context) {
true, TensorLifespan::FORWARD_DERIV_LIFESPAN);
Contributor


This is not related to this PR, but isn't loraTmp used in forward and gradient, not forward and derivative?

Contributor Author


Exactly! I'm gonna update this. Thank you for your comment.

- In the previous code, LoRA didn't work when batch_size > 1.
- Tensors used in LoRA-related computation were not updated when the
batch size was updated.
- The `setBatch()` function is now implemented for `FullyConnectedLayer`
(a sketch of the idea follows after this commit message).
- Bug fix in the lifespan of the loraTmp tensor: FORWARD_DERIV_LIFESPAN ->
FORWARD_GRAD_LIFESPAN

Self evaluation:

	Build test: [X]Passed [ ]Failed [ ]Skipped
	Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>
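A minimal illustration of the setBatch fix described in this commit (hypothetical types and member names, not the nntrainer API): every intermediate LoRA buffer whose leading dimension is the batch size has to be re-sized when the batch changes, otherwise the LoRA path keeps stale batch-1 shapes.

#include <cstddef>
#include <vector>

// Hypothetical sketch of the setBatch idea; names and shapes are illustrative only.
struct LoraBuffers {
  std::size_t batch = 1, rank = 4, unit = 8;
  std::vector<float> hidden_tmp_lora; // shape: batch x rank
  std::vector<float> hidden_out_lora; // shape: batch x unit

  // Re-size every batch-dependent tensor whenever the batch size changes.
  void setBatch(std::size_t new_batch) {
    batch = new_batch;
    hidden_tmp_lora.assign(batch * rank, 0.0f);
    hidden_out_lora.assign(batch * unit, 0.0f);
  }
};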
Collaborator

@taos-ci left a comment


@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

@jijoongmoon jijoongmoon merged commit 8104cbe into nnstreamer:main Sep 22, 2024
37 checks passed
6 participants