I apologise in advance if I have misunderstood something.
My issue comes from the observation that when I raise the LoRA rank (4 -> 512), the observable effective learning rate (the difference between sampled images over the course of training) drops drastically.
So I went to the source code and found https://github.com/kohya-ss/sd-scripts/blob/c93cbbc373daff7827395b6ca5bde91733890722/networks/lora.py#L52
self.scale = alpha / self.lora_dim
In my understanding, the right way to implement equalised learning rate (https://arxiv.org/abs/1812.04948) would be the following:
self.scale = alpha / (in_dim**0.5) / (self.lora_dim**0.5)
i.e. an (in_dim**0.5) divisor for the down_sample layer and a (self.lora_dim**0.5) divisor for the up_sample layer.
Thank you.
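For illustration, here is a minimal sketch of a LoRA linear module using the proposed scaling. The class name, the init choices, and the zero-initialised up projection are my own assumptions for a self-contained example, not the actual sd-scripts implementation:

```python
import math
import torch.nn as nn

class LoRALinearSketch(nn.Module):
    """Hypothetical LoRA layer with equalised-learning-rate style scaling."""

    def __init__(self, in_dim, out_dim, lora_dim=4, alpha=1.0):
        super().__init__()
        self.lora_down = nn.Linear(in_dim, lora_dim, bias=False)
        self.lora_up = nn.Linear(lora_dim, out_dim, bias=False)
        # current sd-scripts behaviour, for comparison:
        #   self.scale = alpha / lora_dim
        # proposed: divide by sqrt(fan_in) of each matmul, i.e.
        # sqrt(in_dim) for lora_down and sqrt(lora_dim) for lora_up
        self.scale = alpha / math.sqrt(in_dim) / math.sqrt(lora_dim)
        # with the scaling applied at runtime, the down weight can be
        # initialised at unit variance; up starts at zero as usual for LoRA
        nn.init.normal_(self.lora_down.weight, std=1.0)
        nn.init.zeros_(self.lora_up.weight)

    def forward(self, x):
        return self.lora_up(self.lora_down(x)) * self.scale
```

Under this sketch, the magnitude of the LoRA update (and of its gradients) stays roughly independent of the rank, which is the behaviour I would expect when sweeping lora_dim from 4 to 512.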