reproducing sAP evaluation #24

Closed · yanconglin opened this issue Feb 22, 2020 · 10 comments
Labels: bug (Something isn't working), question (Further information is requested)
Comments

yanconglin commented Feb 22, 2020

Hi, Yichao, really nice work! Thanks for sharing the code.
I would like to know how you convert the AFM results (and those of the other methods) into the npz format used in the mAPJ/sAP evaluation. Can you share this part? Currently I have trouble reproducing the reported AFM results in your paper. The numbers I got are 4.7 / 6.5 / 7.6 (sAP on ShanghaiTech), which is roughly 20 points lower (at sAP10) than the ones in the paper (Table 2). I have verified by visualization that the AFM lines and GT lines are matched before the lcnn.metric.msTPFP call (see the figure below), so I was wondering why the AFM result I got is so much worse than the one in the paper. A sketch of my conversion is at the end of this comment. Looking forward to your help!
[Figure 1: AFM detections overlaid with ground-truth lines]

Thanks in advance!

yanconglin commented Feb 23, 2020

Update: I think I found the reason after a day of digging. After checking the AFM code, I found that the last column of the output mat (x1, y1, x2, y2, width) is the raw width, not the width/length ratio that the paper uses to rank lines. After replacing the last column with width/length, I got new results: mAPJ 23.3 (the same as yours) and sAP10 23.9 (slightly worse than the 24.4 in the paper). Any idea why there is still a gap? The re-ranking itself is a one-liner; see the sketch below.


zhou13 commented Feb 24, 2020

If the error is relatively small (23.9 vs. 24.4), I think it might be due to some subpixel error, e.g., pixel coordinates starting from 0 vs. from 1 vs. from 0.5. @HaozhiQi Any idea on this?

zhou13 added the question label Feb 24, 2020
yanconglin commented Feb 24, 2020

Another update: the numbers I got on York are currently better than yours (sAP10: 9.1 vs. 3.5; mAPJ: 12.3 vs. 6.9). This is a bit confusing, and I am not sure where the problem is. Would it be possible to release all numbers (sAP5/10/15 on both datasets for all methods)? Thanks a lot!

zhou13 commented Feb 24, 2020

It seems that your number on York might be reasonable. I will check and get back to you later.

zhou13 updated the bug and question labels Mar 9, 2020
zhou13 commented Mar 9, 2020

Sorry for the late reply. The code I am currently using to evaluate York is `evaluate_afm(im_list, gt_list)` together with `line_score(path, threshold=5)`. I checked my code and the plots, and it looks right to me. The ground-truth data I am using is at https://drive.google.com/file/d/1ijOXv0Xw1IaNDtp1uBJt5Xb3mMj99Iw2/view?usp=sharing. Let me know if you find anything, or if you need anything more from me.

yanconglin commented
Cool. I will. Thanks for the reply!

zhou13 commented May 16, 2020

Feel free to reopen this issue if you have any update.

zhou13 closed this as completed May 16, 2020
yanconglin commented
Hi, Yichao,

Sorry to bother you again.
Recently I spent some time figuring out why the Wireframe approach performs so badly on sAP (3%-5%), and found a problem in the sAP evaluation. If you print the output of the Wireframe method, you will find many duplicate lines with exactly the same endpoints, only in a different order. This drags the sAP values down. Once those duplicate lines are removed, I obtain higher sAP values (6.8 | 9.0 | 10.4 on ShanghaiTech), which seem more reasonable, and I made similar observations on the York dataset. Not sure if you see the same thing? The deduplication I used is sketched below. Please let me know your findings. Thank you!

zhou13 commented Jun 11, 2020

Hi @yanconglin,

Thank you so much for your finding! I have been wondering about the same problem for some time. I will investigate and update our results.

Best,
Yichao Zhou.

zhou13 reopened this Jun 11, 2020
zhou13 added the bug label Jun 11, 2020
zhou13 commented May 4, 2021

@yanconglin We have fixed the problem and updated the results in the arXiv paper. Thanks for your investigation!

zhou13 closed this as completed May 4, 2021