the loss of the finetune #16

Closed
dailenson opened this issue Apr 1, 2019 · 5 comments

@dailenson

Thanks for your implementation. When I fine-tune the model with your train.py, starting from the pre-trained model on my own data, the loss doesn't converge. Do you have any suggestions?

@zicair

zicair commented Apr 3, 2019

> Thanks for your implementation. When I fine-tune the model with your train.py, starting from the pre-trained model on my own data, the loss doesn't converge. Do you have any suggestions?

I had the same problem: the loss only converges to about 1.0-1.5 and then stops going down. Is that the case for you?

@Ugness
Owner

Ugness commented Apr 3, 2019

[image: result from my experiment]
This is my experiment's result.
Since I reset my computer, I need to set up the environment again before I can retrain. Please wait for that.
Because the plotted value is the loss of each batch rather than the average over the whole epoch, it may look like it is diverging or oscillating, but training is still progressing. So I suggest you check:

  1. The actual output images.
  2. The MAE / F-measure scores, via measure_test.py.
  3. A loss-vs-epoch plot, or the smoothing option in TensorBoard; it may show a declining trend (see the sketch after this list).
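Not from the repo, just a rough illustration of point 3: a minimal sketch that smooths a per-batch loss log with a moving average so the overall trend becomes visible. The file name loss_log.txt and the window size are placeholders; adapt them to however you export the values from TensorBoard.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical log file: one per-batch loss value per line.
losses = np.loadtxt("loss_log.txt")

window = 100  # smoothing window, tune to your logging frequency
kernel = np.ones(window) / window
smoothed = np.convolve(losses, kernel, mode="valid")

plt.plot(losses, alpha=0.3, label="per-batch loss")
plt.plot(np.arange(window - 1, len(losses)), smoothed, label=f"moving average ({window})")
plt.xlabel("batch")
plt.ylabel("loss")
plt.legend()
plt.show()
```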

@zicair

zicair commented Apr 3, 2019

I have another question: the loss weights in your code are 0.5, 0.5, ..., 0.8, ..., 1, but here is the author's code:
[image: loss weights in the author's code]
I would like to know why your weights are not the same as the ones set by the author. Looking forward to your reply, thank you very much!

@Ugness
Owner

Ugness commented Apr 4, 2019

@zicair
In the PiCANet paper, Section 5.3:

The whole network is trained end-to-end using stochastic gradient descent (SGD) with momentum. Since deep supervision is adopted in each decoding module, we empirically weight the losses in D7, D5, D4, · · · , D1 by 0.5, 0.5, 0.5, 0.8, 0.8, and 1, respectively, without further tuning.

I followed the coefficients given in the paper.
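For readers following along, here is a minimal sketch of how such deep-supervision weighting can be applied. It is an illustration under assumptions, not the repo's actual train.py: it assumes the network returns one saliency prediction per decoding module (coarsest first) and that the predictions are already probabilities.

```python
import torch
import torch.nn.functional as F

# Coefficients quoted from the paper, applied to the per-decoder losses.
LOSS_WEIGHTS = [0.5, 0.5, 0.5, 0.8, 0.8, 1.0]

def deep_supervision_loss(preds, target):
    """preds: hypothetical list of per-decoder saliency maps (N, 1, h, w),
    ordered coarsest to finest; target: ground-truth mask (N, 1, H, W)."""
    total = torch.zeros((), device=target.device)
    for weight, pred in zip(LOSS_WEIGHTS, preds):
        # Match the ground truth to this decoder's output resolution.
        resized = F.interpolate(target, size=pred.shape[-2:], mode="bilinear",
                                align_corners=False)
        total = total + weight * F.binary_cross_entropy(pred, resized)
    return total
```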

@RaoHaobo

RaoHaobo commented Apr 9, 2019

@Ugness, I changed my learning rate; this is my best training loss:
[image: training loss curve]
But when I test on PASCAL-S, the max F-score is 0.82-0.83, which I think may be poor.
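As a side note for anyone reproducing that number: measure_test.py in this repo is the reference, but a rough, self-contained sketch of the max F-measure is below. It uses the usual saliency-benchmark setting beta^2 = 0.3 and sweeps binarization thresholds; the per-image computation and the input names are simplifying assumptions (benchmarks typically average precision and recall over the whole dataset before computing F).

```python
import numpy as np

def max_f_measure(pred, gt, beta2=0.3, num_thresholds=255):
    """pred: saliency map in [0, 1]; gt: binary ground-truth mask.
    Both are 2-D numpy arrays (placeholder inputs for illustration)."""
    gt = gt > 0.5
    best = 0.0
    for t in np.linspace(0.0, 1.0, num_thresholds):
        binarized = pred >= t
        tp = np.logical_and(binarized, gt).sum()
        precision = tp / (binarized.sum() + 1e-8)
        recall = tp / (gt.sum() + 1e-8)
        f = (1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8)
        best = max(best, f)
    return best
```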
