
When I use four RTX 2080 Ti GPUs to train Mask R-CNN as a baseline, the F-measure is only about 65%. Is that normal? #7

Open
kapness opened this issue Jul 18, 2019 · 8 comments

Comments

@kapness

kapness commented Jul 18, 2019

How can I get good performance on four 11 GB GPUs?

@JingChaoLiu
Collaborator

In our training, the original Mask R-CNN indeed only achieves an F-measure of 66%. The 10% improvement in our baseline may come from the following (no ablation study, no guarantee, just from memory):

  1. Data Augmentation +6%

  2. OHEM +2%

  3. Train -> Test extended to Train+Validation -> Test +1%

  4. Use the Ignore Annotation +1%

Note: the first three tricks are elaborated in our paper. Recently, I noticed that the implementation of Use the Ignore Annotation is not part of the official implementation but comes from the open-source repository matterport/Mask_RCNN, which our private framework followed.

The main idea of Use the Ignore Annotation is that when a predicted box overlaps the groundtruth box at a high ratio, the predicted box is labeled as ignore, in other words, neither positive nor negative. For details, see build_rpn_targets for the RPN and detection_targets_graph for the bbox branch. The only difference from cocoapi is that the evaluation criterion, intersection / (gt_ignore_area + pred_area - intersection) < 0.001, is replaced with intersection / pred_area < 0.5.
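The modified criterion above can be sketched in plain Python. This is only an illustration, not the code from our framework or from matterport/Mask_RCNN: the function names, the (x1, y1, x2, y2) box layout, and the reading that a predicted box stays in training when intersection / pred_area < 0.5 for every ignore-annotated box (and is otherwise labeled as ignore) are all assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def box_area(b: Box) -> float:
    """Area of a box; degenerate boxes get area 0."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (0 if they do not overlap)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def keep_mask(pred_boxes: Sequence[Box],
              ignore_boxes: Sequence[Box],
              thresh: float = 0.5) -> List[bool]:
    """Hypothetical helper: True where a predicted box is NOT ignored.

    A box is kept when intersection / pred_area < thresh for every
    ignore-annotated groundtruth box; otherwise it would be labeled
    as ignore (neither positive nor negative).
    """
    keep = []
    for p in pred_boxes:
        area = box_area(p)
        if area <= 0.0:
            max_ratio = 0.0  # nothing to cover in a degenerate box
        else:
            max_ratio = max(
                (intersection_area(p, g) / area for g in ignore_boxes),
                default=0.0,
            )
        keep.append(max_ratio < thresh)
    return keep
```

Normalizing by pred_area instead of the union means a small predicted box lying entirely inside a large ignore region gets a ratio of 1.0, whereas its IoU with that region could be very small.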

@kapness
Author

kapness commented Jul 22, 2019 via email

@JingChaoLiu
Collaborator

JingChaoLiu commented Jul 22, 2019

Only 512 samples are used to compute the box_cls and box_reg losses; this is not done in the RPN.
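As a rough sketch of restricting a loss to a fixed number of samples, the snippet below keeps the 512 ROIs with the highest per-ROI loss and averages over them. That the 512 samples are chosen by highest loss (OHEM-style) is an assumption here, and select_hard_samples and mean_loss_over are hypothetical names, not functions from our framework.

```python
from typing import List, Sequence


def select_hard_samples(per_roi_loss: Sequence[float],
                        num_samples: int = 512) -> List[int]:
    """Return indices of the num_samples ROIs with the largest loss."""
    order = sorted(range(len(per_roi_loss)),
                   key=lambda i: per_roi_loss[i],
                   reverse=True)
    return order[:num_samples]


def mean_loss_over(per_roi_loss: Sequence[float],
                   indices: Sequence[int]) -> float:
    """Average the loss over the selected ROIs only (0.0 if none)."""
    if not indices:
        return 0.0
    return sum(per_roi_loss[i] for i in indices) / len(indices)
```

In a real training loop the per-ROI values would be tensors and only the selected entries would contribute to the gradient.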

@zuokai

zuokai commented Jul 23, 2019

@JingChaoLiu Hi, how many data augmentation methods do you use?

@kapness
Author

kapness commented Aug 8, 2019 via email

@kapness
Author

kapness commented Aug 8, 2019 via email

@JingChaoLiu
Collaborator

When no GT remains after cropping (though this rarely happens), just skip any steps involving positive ROIs (bbox regression and mask generation), set the corresponding losses to 0 (just for logging), and do not backpropagate them. I guess here is a good place to ignore these zero losses.
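A minimal sketch of that handling, using plain floats in place of framework tensors. The loss names (loss_rpn, loss_cls, loss_box_reg, loss_mask) and the combine_losses helper are hypothetical and only illustrate the idea: losses that depend on positive ROIs are logged as 0 and left out of the sum that would be backpropagated.

```python
from typing import Dict, List, Sequence, Tuple


def combine_losses(
    losses: Dict[str, float],
    num_positive_rois: int,
    positive_only: Sequence[str] = ("loss_box_reg", "loss_mask"),
) -> Tuple[float, Dict[str, float], List[str]]:
    """Return (total to backpropagate, values to log, names included in total)."""
    logged = dict(losses)
    included = []
    for name in losses:
        if num_positive_rois == 0 and name in positive_only:
            logged[name] = 0.0  # reported as zero, for logging only
            continue            # excluded from the backward pass
        included.append(name)
    total = sum(losses[name] for name in included)
    return total, logged, included
```

With tensors, the same structure applies: call backward only on the returned total, so the skipped branches never enter the graph for that iteration.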

@JingChaoLiu
Collaborator

By the way, all the images in ICDAR 2015 share the same shape of 1280x720, so, as mentioned in the paper, it is recommended to crop images while preserving the aspect ratio.
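One way to crop while preserving the aspect ratio is to scale both sides of the crop window by the same random factor. The sketch below is an assumption about how this could be done, not the augmentation code from the paper; random_crop_keep_aspect and the min_scale value of 0.5 are illustrative choices.

```python
import random
from typing import Optional, Tuple


def random_crop_keep_aspect(
    img_w: int,
    img_h: int,
    min_scale: float = 0.5,
    rng: Optional[random.Random] = None,
) -> Tuple[int, int, int, int]:
    """Pick a random crop window (x0, y0, x1, y1) with the image's aspect ratio."""
    rng = rng or random.Random()
    scale = rng.uniform(min_scale, 1.0)      # same factor for both sides
    crop_w = max(1, round(img_w * scale))
    crop_h = max(1, round(img_h * scale))
    x0 = rng.randint(0, img_w - crop_w)      # randint is inclusive on both ends
    y0 = rng.randint(0, img_h - crop_h)
    return x0, y0, x0 + crop_w, y0 + crop_h
```

For a 1280x720 image, every window produced this way stays close to 16:9 (up to rounding to whole pixels), so resizing the crop back to the training resolution does not distort the text.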

@JingChaoLiu JingChaoLiu mentioned this issue Aug 14, 2019