I just came across a strange problem. I slightly modified some parts of adam.lua as follows:
```lua
-- Initialization
state.t = state.t or 0
-- Exponential moving average of gradient values
state.m = state.m or x.new(x:size()):zero()
-- Exponential moving average of squared gradient values
state.v = state.v or x.new(x:size()):zero()
-- A tmp tensor to hold the sqrt(v) + epsilon
state.denom = state.denom or x.new(x:size()):zero()
-- (3) learning rate decay (annealing)
local clr = lr / (1 + state.t*lrd)
state.t = state.t + 1
local biasCorrection1 = 1 - beta1^state.t
local biasCorrection2 = 1 - beta2^state.t
-- (1) evaluate f(x) and df/dx
local fx, dfdx = opfunc(x)
-- (2) weight decay
if wd ~= 0 then
  dfdx:add(wd, x)
end
```
I changed the order of (1), (2), and (3), and moved

```lua
local biasCorrection1 = 1 - beta1^state.t
local biasCorrection2 = 1 - beta2^state.t
```

to just after `state.t = state.t + 1`. With these changes, the training losses are no longer reproducible across runs, even though I used the same seed. If I add a `print()` between `state.t = state.t + 1` and `local biasCorrection1 = 1 - beta1^state.t`, then I get identical training losses across multiple runs. The original adam.lua produces the same results across multiple runs.
Does anyone have any idea what might be happening?
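For what it's worth, the reordering itself should be mathematically neutral: the bias corrections depend only on `state.t` after the increment, not on when `state.m`/`state.v` are updated. A minimal scalar sketch in Python (function name, defaults, and the `reordered` flag are illustrative, not from adam.lua) shows both orderings give bit-identical results, so the divergence must come from somewhere other than the arithmetic:

```python
import math

def adam_step(x, grad, m, v, t, lr=1e-3, lrd=0.0,
              beta1=0.9, beta2=0.999, eps=1e-8, reordered=False):
    # Scalar sketch of one adam.lua update step.
    clr = lr / (1 + t * lrd)      # (3) learning-rate decay
    t += 1
    if reordered:                 # modified order: corrections right after t += 1
        bc1 = 1 - beta1 ** t
        bc2 = 1 - beta2 ** t
    m = beta1 * m + (1 - beta1) * grad         # EMA of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # EMA of squared gradients
    if not reordered:             # original order: corrections after the EMAs
        bc1 = 1 - beta1 ** t
        bc2 = 1 - beta2 ** t
    step = clr * math.sqrt(bc2) / bc1
    x -= step * m / (math.sqrt(v) + eps)
    return x, m, v, t

# Both orderings compute the same floating-point operations in the
# same order within each expression, so the results match exactly.
a = adam_step(1.0, 0.5, 0.0, 0.0, 0, reordered=False)
b = adam_step(1.0, 0.5, 0.0, 0.0, 0, reordered=True)
assert a == b
```

Since the arithmetic is order-independent here, and a `print()` restores determinism, this smells like something environment-level (e.g. JIT or uninitialized-state effects) rather than a bug in the reordered math.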