
Question about train and test #20

Open · lizhiyuanUSTC opened this issue Dec 8, 2020 · 8 comments
lizhiyuanUSTC commented Dec 8, 2020

Thanks for your great work!

I have some questions about the training and test settings. During training, no matter which class attention vector is combined with the query feature, the ground truth is the same. However, at test time the class score is tied to the attention vector that produced it, and the background score comes from the first attention vector. The loss function never builds a connection between the predicted class and the class of the attention vector.

Isn't this strange?
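To make the questioned setup concrete, here is a toy sketch of the training pairing being described (made-up names and shapes such as `roi_feat` and `class_vectors`, not the repo's actual code; elementwise product stands in for the aggregation):

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, feat_dim = 3, 8

roi_feat = rng.standard_normal(feat_dim)                      # one query RoI feature
class_vectors = rng.standard_normal((num_classes, feat_dim))  # per-class attention vectors
gt_label = 1  # ground-truth class of this RoI

# Training: the RoI feature is aggregated with EVERY class attention vector,
# and each combination is paired with the SAME ground-truth label -- the loss
# never ties a combination's class vector to the target class.
training_pairs = [(roi_feat * class_vectors[c], gt_label) for c in range(num_classes)]
```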


YoungXIAO13 commented Dec 8, 2020

Hi @lizhiyuanUSTC

Thanks for your question!

For the training procedure as well as the loss function, I simply follow the implementation proposed in Meta R-CNN (see this line).

It's true that the ground truth remains the same regardless of the class vector during training, while the output with the highest classification score is selected in testing.

My understanding is that the box regression branch should be class-agnostic (as discussed in TFA), meaning the regression output depends mainly on the RoI features, with weak (or even zero) dependence on the class feature.
As for the classification branch, it tries to classify which class the object in the RoI belongs to. In the fully supervised setting, this branch could work by looking only at the RoI feature and learning a specific weight for each class. With the class feature vector, there are four possible cases:

  1. the class vector matches the object class and the output class is the object class: the loss requires a high classification score
  2. the class vector matches the object class but the output class does not: the loss requires a low classification score
  3. the class vector does not match the object class but the output class does: the loss requires a high classification score
  4. neither the class vector nor the output class matches the object class: the loss requires a low classification score

By pairing every class vector with the same ground truth during training, we force the classification branch to be robust in all four cases without relying on any specific configuration. This should work because the network could simply ignore the class feature vector and condition the output only on the RoI features when there are many samples, while adding the class vector feature clearly helps in few-shot cases (as demonstrated in FSRW and Meta R-CNN).
At test time, since a class-specific prediction is required, for each combination of RoI and class we simply combine the RoI feature with the corresponding class vector feature.
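The test-time scheme described in this thread could be sketched as follows (a toy illustration with random scores, not the repo's code; the convention that column 0 of each output is background, and that the background score comes from the first class vector's combination, follows the question above):

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes = 3  # foreground classes; column 0 of each output is background

# Hypothetical per-combination outputs: row c holds the classification scores
# produced when the RoI feature is aggregated with class c's attention vector.
scores = rng.random((num_classes, num_classes + 1))

# Class-specific prediction: the score for class c is read from the output of
# the combination that used class c's own attention vector.
cls_scores = np.array([scores[c, c + 1] for c in range(num_classes)])
# The background score is taken from the first combination, as noted above.
bg_score = scores[0, 0]

final_scores = np.concatenate([[bg_score], cls_scores])
pred = int(final_scores.argmax())  # 0 = background, otherwise a class index
```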

lizhiyuanUSTC (Author) commented

Thanks for your reply.

Can I say that the attention vector is a special kind of feature augmentation for few-shot object detection? I noticed that you did not freeze parameters during meta-testing, yet training all parameters in TFA leads to fast overfitting on the novel training samples.
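For reference, the TFA-style recipe alluded to here freezes everything except the last-layer predictors during novel-class fine-tuning. A minimal PyTorch sketch, with toy module names (`backbone`, `cls_head`) that are illustrative only and do not match this repo's modules:

```python
import torch.nn as nn

# Toy detector: a "backbone" plus a classification "head"
# (hypothetical module names for illustration).
model = nn.Sequential()
model.add_module("backbone", nn.Linear(8, 8))
model.add_module("cls_head", nn.Linear(8, 4))

# TFA-style fine-tuning: freeze all parameters except the head, which
# limits overfitting when only a few novel-class samples are available.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("cls_head")

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```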

You say the output with the highest classification score is selected at test time, but I do not see the related code in this repo. Please correct me if I am wrong.

YoungXIAO13 (Owner) commented

If "feature augmentation" means adding additional information to the RoI features, then "yes".

I didn't try freezing the backbone during the second stage, but I agree that it is worth a try.

Sorry for the misleading phrase, I've corrected it in the answer above.

lizhiyuanUSTC (Author) commented

Thanks for your reply.

I tried predicting the class score and box delta with different attention vectors, and the results are very close.

I noticed that the released model pretrained on the base classes can classify all categories, so I tried meta-testing with the learning rate set to 0 (only to obtain the attention vectors). Strangely, the mAP on the COCO novel classes reaches 10.x across different seeds, which means the proposed method can achieve good results without fine-tuning.

YoungXIAO13 (Owner) commented

That's an interesting finding!

However, I don't quite understand: "I noticed that the released pretrain model on base classes can classify all categories."
The box classification and regression branches only cover 60 classes after base training, so they should not work on the 20 novel classes, no?

lizhiyuanUSTC (Author) commented

I downloaded the base-class pretrained weights from sh, and the shape of RCNN_cls_score.weight is [81, 4096].

YoungXIAO13 (Owner) commented

Ah, I forgot that num_class is always set to 81 for the COCO dataset.

lizhiyuanUSTC (Author) commented

Can you test on the novel classes using the released pretrained model? I cannot understand why this happens.

I cannot replicate your results in my own experiments unless I use your released model. Is there any trick to achieving comparable results?
