I retrained the zero-shot model with train_zero_shot_youtube.sh using the given settings, obtained the inference results with eval_zero_shot_youtube.sh, and then prepared the submission with prepare_results_submission.py for the official YouTubeVOS challenge website.
However, the test results on YouTubeVOS do not match the public scores. Are there any other settings or tricks used during training or testing? I noticed that data augmentation is used in training but not in testing, and I followed the public settings exactly. The models were trained for 50 epochs on a single TitanX GPU (batch_size=4, clips=5). The following are the retrained results:
retrain-RVOS-T: 33.87, 18.37, 38.62, 22.23
retrain-RVOS-S: 38.52, 18.72, 41.70, 22.59
retrain-RVOS-ST: 41.56, 21.46, 45.00, 24.52
Besides, I also tested the public zero-shot YouTube model on YouTubeVOS and got the following scores:
pub-RVOS-ST: 43.39, 21.10, 45.30, 24.32
It seems the inferior retrained results are not due to the test settings, but I do not know why. Can you help me?
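For concreteness, the whole pipeline boils down to the three steps sketched below. This is only a minimal Python sketch: the scripts/ prefix and the argument-free invocations are my assumptions about the checkout layout, not the exact commands I ran.

```python
# Sketch of the three-stage pipeline described above. The script names come
# from the repo; the scripts/ prefix and the argument-free invocations are
# assumptions about the local checkout -- adapt paths and flags as needed.
import subprocess

def run(cmd):
    print(">>", " ".join(cmd))
    subprocess.run(cmd, check=True)  # fail fast if any stage errors out

run(["bash", "scripts/train_zero_shot_youtube.sh"])  # 1. retrain the zero-shot model
run(["bash", "scripts/eval_zero_shot_youtube.sh"])   # 2. run inference on YouTube-VOS
run(["python", "prepare_results_submission.py"])     # 3. package results for submission
```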
There could be at least two reasons for the different results when retraining:
A new training run will see the images and the instances in a different order, and the data augmentations applied will also be different. As a result, the retrained model can be slightly better or worse than the one we trained and released (see the seeding sketch below).
For the zero-shot case, we trained the model for 40 epochs. Even if the validation loss (computed on the train-val subset) is lower when training for 50 epochs, that does not mean the results on the validation set will be better than those obtained by the released model.
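Regarding the first reason: if you want two retraining runs to be more directly comparable, you can pin the random seeds. Below is only a minimal sketch assuming a standard PyTorch training script (it would go at the top of the training entry point); some cuDNN nondeterminism can remain even with these settings.

```python
# Sketch: pin all RNGs so two training runs see the same data order and the
# same augmentation draws. Assumes a standard PyTorch script; this would go
# at the top of the training entry point.
import random
import numpy as np
import torch

def set_seed(seed: int = 123) -> None:
    random.seed(seed)                  # Python RNG (e.g. augmentation choices)
    np.random.seed(seed)               # NumPy RNG
    torch.manual_seed(seed)            # torch CPU RNG
    torch.cuda.manual_seed_all(seed)   # torch RNGs on all GPUs
    # Trade some speed for deterministic cuDNN convolution kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(123)
```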
@carlesventura Thanks for your kind answer. Then how should the final model be chosen after one training run (e.g., 40 or more epochs)? Take the checkpoint with the best validation loss (obtained with the train-val subset)? Maybe that one is overfitting. Or test several (or all) checkpoints on the test set? That seems inadvisable. Thank you.
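For example, the first strategy would look something like the sketch below; the loss values and the checkpoint naming scheme are hypothetical placeholders for whatever the training loop actually records.

```python
# Sketch: select the checkpoint with the lowest train-val loss rather than
# simply taking the last epoch. The epoch -> loss values and the checkpoint
# file naming are hypothetical placeholders.
import torch

val_losses = {40: 0.412, 45: 0.405, 50: 0.409}  # epoch -> train-val loss (made up)

best_epoch = min(val_losses, key=val_losses.get)
state = torch.load(f"checkpoints/epoch_{best_epoch}.pth", map_location="cpu")
print(f"selected epoch {best_epoch} (train-val loss {val_losses[best_epoch]:.3f})")
```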