When I use four RTX 2080 Ti GPUs to train Mask R-CNN as a baseline, the F-measure is only about 65%. Is that normal? #7
Comments
In our training, the original Mask R-CNN indeed only achieves an F-measure of 66%. The 10% improvement in our baseline may come from: (no ablation study, no guarantee, just based on memories)
Data Augmentation +6%
OHEM +2%
Train -> Test extends to Train+Validation -> Test +1%
Use the Ignore Annotation +1%
Note: the first three tricks have been elaborated in our paper. Recently, I noticed that the implementation of Use the Ignore Annotation was not part of the official implementation, but came from an open-source repository, matterport/Mask_RCNN, which our private framework followed.
The main idea of Use the Ignore Annotation is: when a predicted box overlaps an ignored ground-truth box at a high ratio, the predicted box is labeled as ignore, in other words, neither positive nor negative. The details can be found in build_rpn_targets of the RPN and detection_targets_graph of the Bbox branch. The only change from the cocoapi behavior is that the criterion intersection / (gt_ignore_area + pred_area - intersection) < 0.001 is replaced by intersection / pred_area < 0.5.
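For concreteness, here is a minimal NumPy sketch of that idea, assuming a proposal is marked as ignore once intersection / pred_area >= 0.5 against any ignored GT box; the function and variable names are illustrative, not the actual build_rpn_targets / detection_targets_graph code:

```python
import numpy as np

def mark_ignored_proposals(proposals, ignored_gt_boxes, labels, thresh=0.5):
    """Set label to -1 (neither positive nor negative) for proposals whose
    intersection with any ignored GT box covers >= `thresh` of the proposal
    area, i.e. intersection / pred_area >= 0.5 as in the comment above.

    proposals:        (N, 4) boxes as [x1, y1, x2, y2]
    ignored_gt_boxes: (M, 4) boxes as [x1, y1, x2, y2]
    labels:           (N,) array with 1 = positive, 0 = negative
    """
    if len(ignored_gt_boxes) == 0:
        return labels
    px1, py1, px2, py2 = np.split(proposals, 4, axis=1)          # (N, 1) each
    gx1, gy1, gx2, gy2 = np.split(ignored_gt_boxes, 4, axis=1)   # (M, 1) each

    # Pairwise intersection between every proposal and every ignored box
    iw = np.maximum(0, np.minimum(px2, gx2.T) - np.maximum(px1, gx1.T))  # (N, M)
    ih = np.maximum(0, np.minimum(py2, gy2.T) - np.maximum(py1, gy1.T))  # (N, M)
    inter = iw * ih

    pred_area = (px2 - px1) * (py2 - py1)                        # (N, 1)
    cover = inter / np.maximum(pred_area, 1e-6)                  # intersection / pred_area

    labels = labels.copy()
    labels[(cover >= thresh).any(axis=1)] = -1                   # ignore: excluded from loss
    return labels
```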
Thanks very much for your reply. Now I have a new question: in the OHEM process, the paper says you select 512 difficult samples to update the network. Does that mean you only provide 512 samples to the ROI heads, or that you only compute the RPN loss over 512 samples?
Only compute the loss over 512 samples for box_cls and box_reg, not in the RPN.
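As an illustration of how that could look in a PyTorch-style ROI head, here is a hedged sketch (the function name and the per-sample loss layout are assumptions, not PMTD's actual code): compute per-ROI classification and regression losses without reduction, keep the 512 hardest ROIs, and average only over those, leaving the RPN loss untouched.

```python
import torch
import torch.nn.functional as F

def ohem_roi_loss(cls_logits, box_regression, labels, regression_targets, num_hard=512):
    """OHEM over ROI-head samples: keep only the `num_hard` ROIs with the
    largest combined box_cls + box_reg loss; the RPN loss is computed elsewhere.

    cls_logits:         (N, num_classes)
    box_regression:     (N, 4) regression deltas for the target class
    labels:             (N,) class indices, 0 = background
    regression_targets: (N, 4)
    """
    cls_loss = F.cross_entropy(cls_logits, labels, reduction="none")        # (N,)
    reg_loss = F.smooth_l1_loss(box_regression, regression_targets,
                                reduction="none").sum(dim=1)                # (N,)
    reg_loss = reg_loss * (labels > 0).float()   # only positives contribute to box_reg

    per_sample = cls_loss + reg_loss
    num_hard = min(num_hard, per_sample.numel())
    _, hard_idx = per_sample.topk(num_hard)      # indices of the hardest ROIs

    return cls_loss[hard_idx].mean(), reg_loss[hard_idx].mean()
```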
@JingChaoLiu Hi, how many data augmentation methods do you use?
Hi, now I have a small problem. In the random crop process, do you make sure that every cropped region contains at least one clear GT box? I ask because I find that Mask R-CNN cannot compute the loss on an image with no GT box.
Thanks for your kindness again!
Or do you just do a random crop and set the mask loss and box regression loss to 0? On the ICDAR 2015 dataset, if I only do random crops, there are too many cropped areas with no GT box, and the loss becomes bad.
When no GT remains after cropping (though this rarely happens), just skip any steps involving positive ROIs (bbox regression and mask generation), set the corresponding losses to 0 (just for logging), and do not backward them. I guess here is a good position for ignoring these zero losses.
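A minimal sketch of that handling, assuming a PyTorch-style loss dictionary (the helper names and signatures are hypothetical, not PMTD's code): the zero losses are detached constants, so they appear in the logs but contribute no gradient.

```python
import torch

def detection_losses(classification_loss, positive_rois, box_reg_fn, mask_fn):
    """Skip the branches that need positive ROIs when the crop has no GT box.
    The zero tensors carry no graph history, so backwarding the summed loss
    only propagates through the classification branch."""
    if positive_rois is None or positive_rois.numel() == 0:
        zero = classification_loss.new_zeros(())   # same device/dtype, logged only
        return {"loss_cls": classification_loss,
                "loss_box_reg": zero,
                "loss_mask": zero}
    return {"loss_cls": classification_loss,
            "loss_box_reg": box_reg_fn(positive_rois),
            "loss_mask": mask_fn(positive_rois)}
```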
By the way, all the images in ICDAR 2015 share the same shape of 1280x720, so, as mentioned in the paper, it is recommended to crop the image while preserving the aspect ratio.
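For reference, a minimal sketch of an aspect-ratio-preserving random crop under those assumptions (the scale range and function name are illustrative, not the paper's exact settings):

```python
import random

def random_crop_keep_aspect(img_w=1280, img_h=720, min_scale=0.5, max_scale=1.0):
    """Sample a crop window with the same aspect ratio as the full image.
    Returns (x1, y1, x2, y2) in pixel coordinates."""
    scale = random.uniform(min_scale, max_scale)
    crop_w, crop_h = int(img_w * scale), int(img_h * scale)   # same aspect ratio as the image
    x1 = random.randint(0, img_w - crop_w)
    y1 = random.randint(0, img_h - crop_h)
    return x1, y1, x1 + crop_w, y1 + crop_h
```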
How can I get good performance on four 11 GB GPUs?