Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pk1d diff noise estimators #997

Merged
merged 5 commits into from
Jun 1, 2023
Merged

Pk1d diff noise estimators #997

merged 5 commits into from
Jun 1, 2023

Conversation

corentinravoux
Copy link
Contributor

Short pull request implementing the latest tests I realized on the diff estimator for EDR. We forgot to report them from the divergent Pk1d_DESI_changes branch.
I tested different methods and I put the one which is now used for the EDR paper as default.

Copy link
Collaborator

@iprafols iprafols left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me. Most of the changes are just formatting changes. If I understood it correctly, the only algorithmic changes are those of exp_dif_desi. Maybe someone more expert on Pk1d can review it to make sure it's ok

Copy link
Contributor

@Waelthus Waelthus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The addition of the different methods used is great!
It would've been nice to start on an up-to-date version of the file when implementing this, now we'd either need to go back to old variable names or need to fix the variable names in the PR to match conventions.
There's some suggestions regarding re-ordering, potentially removing some of the for-loops to shorten the code overhead, and some questions regarding the method used further up.

mask_targetid,
method_alpha="desi_array",
use_only_even=False,
):
"""Compute the difference between exposures.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the docstring to the new arguments. If file is actually an hdu-list, pls rename back to hdul

flux_total_even += flexp * ivexp
ivar_total_even += ivexp
teff_even += teff_lya_exp
argsort = np.flip(np.argsort(np.mean(file["IV"][mask_targetid], axis=1)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this line even used? I think you're redefining the var later before first usage.

ivar_mean = np.mean(file["IV"][mask_targetid][:, :], axis=1)
argmin_ivar = np.argmin(ivar_mean)
argsort = np.arange(ivar_mean.size)
argsort[-1], argsort[argmin_ivar] = argsort[argmin_ivar], argsort[-1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only moves the minimal ivar to the end (and makes the variable name confusing as it's not actually sorting the array, pls use a different name), is this a useful way to deal with things?
I.e. why not just do an actual argsort as in line 112 and use that one? Is it because the even indices are then always smaller than the odd and might there be a better way to deal with this than keeping the order of observations except for moving the worst to the end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this was some tests probably, removed


n_exp = len(flux)
if n_exp < 2:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be done at the very beginning of the routine to avoid computations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the first time we use the number of exp

t_even = 0
t_odd = 0
t_exp = np.sum(teff_lya)
for iexp in range(2 * (n_exp // 2)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

names in the previous version where closer to the picca convention of speaking variable names, e.g. index_exp instead of iexp, num_exp for nexp,...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename in the last commit

fltotodd[w_odd] /= ivtotodd[w_odd]
w_even = ivtoteven > 0
fltoteven[w_even] /= ivtoteven[w_even]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use the previous version for most things before this point as it is unchanged (except for the ivtot summation loop) and has the better variable naming convention

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w_odd and w_even needed later

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the comment was more about notation...

fltoteven += flexp * ivexp
ivtoteven += ivexp
t_even += teff_lya_exp
for iexp in range(n_exp):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be replaced by iv_total=ivar[:n_exp].sum(axis=0). I guess similarly the loop above could be replaced by things like this:

even_inds=slice(0,2 * (n_exp // 2),2)
odd_inds=slice(1,2 * (n_exp // 2),2)
fltotodd=(flux[odd_inds]*ivar[odd_inds]).sum(axis=0)
fltoteven=(flux[odd_inds]*ivar[even_inds]).sum(axis=0)

etc. Which would also remove all the array creation above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

if use_only_even:
if n_exp % 2 == 1:
print("Odd number of exposures discarded")
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to discard odd numbers of exposures completely here? Or just discard the worst exposure? Atm you're doing the former...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can think about those details at a later time...

elif method_alpha == "desi_time":
alpha = 2 * np.sqrt((t_odd * t_even) / (t_exp * (t_odd + t_even)))

diff = 0.5 * (fltoteven - fltotodd) * alpha
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the section regarding the different methods, which of these matches the implementation in the previous version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None, the previous version was just a test

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and that test is not worth keeping even for bw-compatibility reasons, you say?

with_correction=False,
fiberid=None,
log_lambda=None):
def spectral_resolution(wdisp, with_correction=False, fiberid=None, log_lambda=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would generally be nice to split functional changes from aestetic changes, I think both versions are ok here, but the old one is a little more legible...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

@Waelthus
Copy link
Contributor

Waelthus commented Jun 1, 2023

@corentinravoux: Any news on this?

@corentinravoux
Copy link
Contributor Author

I forgot this branch, I have make appropriate changes

Copy link
Contributor

@Waelthus Waelthus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ok from my side, @corentinravoux: please look at the comments regarding use_only_even and the definition of the slices.
I think it's fine to do it as is, but personally would never set the use_only_even as that discards spectra, not some last exposure, which is done automatically without the flag. If we'd actually want to use all exposures for the odd case we'd need to add functionality in a later PR.

Merging this now!

if use_only_even:
if n_exp % 2 == 1:
print("Odd number of exposures discarded")
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can think about those details at a later time...

time_exp = np.sum(teff_lya)

even_inds = slice(0, 2 * (num_exp // 2), 2)
odd_inds = slice(1, 2 * (num_exp // 2), 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that as written this will automatically discard the last exposure if an odd total number of exposures is there, so potentially we wouldn't need the use_only_even above at all (except that that currently does remove objects not exposures)...
If we actually wanted to include the case of odd-exposures fully, we'd need to use even_inds = slice(0, 2*(num_exp//2) + 1, 2)

fltotodd[w_odd] /= ivtotodd[w_odd]
w_even = ivtoteven > 0
fltoteven[w_even] /= ivtoteven[w_even]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the comment was more about notation...

@Waelthus Waelthus merged commit 0fb0717 into master Jun 1, 2023
@Waelthus Waelthus deleted the pk1d_diff_noise_estimators branch June 2, 2023 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants