Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binning.create_histogram_table side effects #259

Open
kosack opened this issue Sep 13, 2023 · 4 comments
Open

binning.create_histogram_table side effects #259

kosack opened this issue Sep 13, 2023 · 4 comments

Comments

@kosack
Copy link

kosack commented Sep 13, 2023

binning.create_histogram_table "magically" uses a "weight" column if it finds it. This is undocumented and not at all obvious. It would be better to add another parameterweight_key="weight" and mention in the docs that the histogram will be weighted by the column specified byweight_key.

The docstring also doesn't mention that it outputs more than just the histogram, but both the normal and weighted version. And even more unexpected, if it finds a particle_type column, it makes grouped histograms by particle type that also end up in the output. It would be better (and more clear) to add this also as a parameter group_key="particle_type" to make it more obvious

@kosack
Copy link
Author

kosack commented Sep 13, 2023

Just to note this caused a large bug in my test script. I had use the column name "weights" instead of "weight", but create_histogram_table just silently created a "n_weighted" column with no weights in that case. It took me a while to find out why the weighted and un-weigted events were identical.

So additionally, if there is a weight_key option, it should fail if that column doesn't exist, or at the very least, not generate the n_weighted column

@maxnoe
Copy link
Member

maxnoe commented Sep 13, 2023

These are good suggestions, and I'll implement them.

This is undocumented and not at all obvious.

It is in the introduction:

https://pyirf.readthedocs.io/en/latest/introduction.html#dl2-event-lists

@maxnoe maxnoe closed this as completed Sep 13, 2023
@maxnoe maxnoe reopened this Sep 13, 2023
@kosack
Copy link
Author

kosack commented Sep 14, 2023

The intro does help (I had missed it!), but I still think hidden assumptions should be more explicit. I had the same issue with some of the other API functions. Having keywords for the column name helps a lot to understand what is happening underneath. The defaults can match the introduction, but allowing a user to use an API function without having to remap columns is also nice, especially as it allows for easier comparisons between reconstructors for example.

@maxnoe
Copy link
Member

maxnoe commented Sep 14, 2023

yes, agreed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants