
Use of logits instead of softmax activations for OS scoring #7

Closed
agaldran opened this issue Jan 6, 2022 · 4 comments

Comments

@agaldran

agaldran commented Jan 6, 2022

Hi again,

I read in the paper that "[...] we propose the use of the maximum logit rather than softmax probability for the open-set scoring rule. Logits are the raw outputs of the final linear layer in a deep classifier, while the softmax operation involves a normalization such that the outputs can be interpreted as a probability vector summing to one. As the softmax operation normalizes out much of the feature magnitude information present in the logits, we find logits lead to better open-set detection results". Then Figure 6c shows how AUROC on the test set(s) evolves over training, using both max-logit and max-softmax scoring, suggesting that the maximum logit is the better choice.
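
Just to make sure we mean the same thing, here is a minimal sketch of how I understand the two scoring rules (illustrative code, not taken from your repo):

```python
import torch
import torch.nn.functional as F

def open_set_score(logits: torch.Tensor, use_softmax: bool = False) -> torch.Tensor:
    """Higher score = more confidently a known class. logits: (N, num_known_classes)."""
    if use_softmax:
        # maximum softmax probability (MSP) scoring
        return F.softmax(logits, dim=1).max(dim=1).values
    # maximum logit scoring (MLS): keeps the feature-magnitude information
    return logits.max(dim=1).values
```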

However, the ARPL code for the Softmax loss (found here), which you are inheriting and using for testing, is a bit odd: it refers to the post-softmax activations as logits, see here.

Since during testing you take these (false) logits from calling the criterion (here), and a few lines below you optionally (re-)apply softmax to them when running with `use_softmax_in_eval`, I am wondering whether what you call "logits" in the experiments from the paper is actually softmax(logits), and what you call softmax activations is in fact softmax(softmax(logits))?
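
In other words, my worry is that the evaluation path effectively does something like the following (illustrative names only, not the literal repo code):

```python
import torch
import torch.nn.functional as F

raw_logits = torch.randn(8, 10)  # stand-in for the output of the final linear layer

# What the ARPL Softmax criterion seems to return under the name "logits":
logits_from_criterion = F.softmax(raw_logits, dim=1)      # already softmax(logits)

# What the test loop then does when use_softmax_in_eval is set:
use_softmax_in_eval = True
if use_softmax_in_eval:
    scores = F.softmax(logits_from_criterion, dim=1)      # softmax(softmax(logits))
else:
    scores = logits_from_criterion                        # softmax(logits), not raw logits
```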

Thanks!

Adrian

@sgvaze
Owner

sgvaze commented Jan 7, 2022

Hi, thanks for pointing out this bug!

Fig 6 was generated using a separate script which controlled properly for the softmax operation. However, we can't currently recall whether the main results (Tab 1) were evaluated with openset_test or scraped from the training logs. We are looking into this now!

@agaldran
Author

agaldran commented Jan 8, 2022

Hi,

Right, thanks a lot for being so responsive and taking care of this. In the meantime, can I ask another question about evaluation? Namely, about the data/open_set_datasets.get_datasets() function. I see that the default for the balance_open_set_eval parameter in that function is False, but during training, when you define the datasets, you call this function with a value of True. On the other hand, in the methods/tests/openset_test.py script you seem to call it with a value of False.

Since you get your results by reading them off the logs after training, I just wanted to double-check whether you are indeed rebalancing the test set in some sense, since I don't recall reading about this in the paper. And how exactly would you balance the test set?
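
For what it's worth, my guess is that "balancing" means subsampling so that the known (closed-set) and unknown (open-set) test splits end up the same size, roughly like this (purely a guess on my part, not your code; the function name is made up):

```python
import random

def subsample_to_balance(known_test, unknown_test, seed=0):
    """Subsample the larger split so both splits contribute equally to evaluation."""
    rng = random.Random(seed)
    n = min(len(known_test), len(unknown_test))
    known_sub = rng.sample(list(known_test), n)
    unknown_sub = rng.sample(list(unknown_test), n)
    return known_sub, unknown_sub
```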

Also, do you know if the other papers for which you report results in your paper (apart from ARPL, which you re-train and re-test) use this kind of rebalancing?

Thanks!!

Adrian

@sgvaze
Owner

sgvaze commented Jan 17, 2022

Hi! Thanks for your patience.

We re-trained the models with the implementation from this repo and got the following results. The numbers are slightly improved and we will update the arXiv paper when we release the next version. All models are evaluated using openset_test, with balance_open_set_eval=False.

Regarding previous papers, they often don't specify, but I imagine they do not perform the rebalancing (it does not change the numbers too much, as AUROC is relatively robust to class imbalance).

| Dataset | MNIST | SVHN | CIFAR10 | CIFAR + 10 | CIFAR + 50 | TinyImageNet |
| --- | --- | --- | --- | --- | --- | --- |
| Softmax+ | 98.6 | 96.0 | 90.1 | 95.6 | 94.0 | 82.7 |
| Logit+ | 99.3 | 97.1 | 93.6 | 97.9 | 96.5 | 83.0 |
| (ARPL + CS)+ | 99.2 | 96.8 | 93.9 | 98.1 | 96.7 | 82.5 |
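
As a quick sanity check on the imbalance point, here is a small synthetic example (random scores, not our models' outputs) showing that AUROC barely moves when the unknown split is heavily subsampled:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
known_scores = rng.normal(2.0, 1.0, size=5000)    # higher score = predicted "known"
unknown_scores = rng.normal(0.0, 1.0, size=5000)

def auroc(known, unknown):
    labels = np.concatenate([np.ones_like(known), np.zeros_like(unknown)])
    scores = np.concatenate([known, unknown])
    return roc_auc_score(labels, scores)

print(auroc(known_scores, unknown_scores))        # balanced splits
print(auroc(known_scores, unknown_scores[:500]))  # 10:1 imbalance -- AUROC is very similar
```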

@agaldran
Author

Hey Sagar!

Congratulations on improving your results; your work in this paper is really impressive, and I appreciate it a lot.

I am now trying to reproduce your results on the fine-grained datasets, so I will probably write to you again if I run into difficulties. I hope that's fine with you!

Thanks,

Adrian
