Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New data generators #34

Closed
wants to merge 71 commits into from
Closed

New data generators #34

wants to merge 71 commits into from

Conversation

andrevitorelli
Copy link
Member

I am trying to generate a more realistic set of galaxy stamps to test autometacal. These methods will be available inside the datasets subpackage and eventually integrated into tensorflow-datasets api within autometacal.

I have created a notebook at notebooks/Datasets.ipynb to explore different ways to do so.

Currently, I have implementations of:

  1. A simple exponential profile convolved with a moffat PSF
  2. Creating parametric galaxies from the COSMOS dataset
  3. Cutting out galaxy stamps directly from the COSMOS dataset

Both 2. and 3. have problems for sure, which is why I'm seeking some guidance from @aguinot also. I'm trying to improve 2. getting code from galaxy2galaxy now.

PS: I have also tried to do bulge+disk models (with some reasonable assumptions) by modifying 1., and tested them as I was doing before - the results were comparable to 1.

gal_image = obs.drawImage(nx=defaults['stamp_size'],
ny=defaults['stamp_size'],
scale=defaults['scale'])
noise = galsim.GaussianNoise()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be nice to give an instance of galsim.BaseDeviate() here, with you would initialize with a seed that you take as input. That would help the reproducibility.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I do plan to do this next.

}

disk_frac = 1 - bulge_frac
smooth_disk_frac = 1 - knot_disk_frac
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some ref/explanation would be nice here as well.

disk_hlr = .7,
bulge_frac = 0.4,
knot_disk_frac = 0.7,
n_knots = 100,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a very nice addition. Do you have a ref or a justification for those numbers ?

knotted_disk = galsim.RandomKnots(n_knots,
half_light_radius=disk_hlr,
flux=knot_disk_frac,)
#rng=rng)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would definitely not remove the rng initialization. It is something that you really want to do.

scale=defaults['scale'])
gal_image = obj.drawImage(nx=defaults['stamp_size'],
ny=defaults['stamp_size'],
scale=defaults['scale'])
noise = galsim.GaussianNoise()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, seed.


q1 = Q11 - Q22
q2 = 2*Q12
T = Q11 + Q22 + 2*tf.math.sqrt(tf.math.abs(Q11*Q22-Q12*Q12))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big fan of this definition.. Having the sqrt is risky. I am pretty sure that ngmix doesn't use this definition either.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had changed to the ngmix definition (T = Q11 + Q22), but since GalFlow uses this one, I switched it back.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hummmmm..... it's not that GalFlow uses this one, it's just that in GalFlow we apply a shear g, which is different than e. I've proposed reverting back to T = Q11 + Q22 in #37

q1 = Q11 - Q22
q2 = 2*Q12
T = Q11 + Q22 + 2*tf.math.sqrt(tf.math.abs(Q11*Q22-Q12*Q12))
result = tf.stack([q1/T, q2/T], axis=-1)[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the size (T) in the output could be very nice!
Maybe we should use another definition of the size but T is fine for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's true, but we can make another PR for that, this is out of the scope of this one.

gals, _ = autometacal.datasets.galaxies.make_data(Ngals=Ngals, snr=100,
gal_g1=np.random.uniform(-.7,.7,Ngals),
gal_g2=np.random.uniform(-.7,.7,Ngals),
scale=scale)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, you should really have a seed set here (for np.random but also inside make_data for the noise in GalSim. By the way, since you are using gaussian noise, you could do it with NumPy so you only have to initialize NumPy). It is very important that each time we run this test we get the exact same result. Also, it would be nice to know the SNR used here for those tests.

weights = autometacal.tf_ngmix.create_gmix([0.,0.,0.,0.,T,1.],'gauss')
result_tf_ngmix = autometacal.tf_ngmix.get_moments(weights,pixels)

assert_allclose(results_ngmix,result_tf_ngmix,rtol=1e-6,atol=1e-6)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your definition of ellipticity (different from ngmix), I am very surprise that this test pass. Maybe if you use a gaussian profile both definition (chi/epsilon) are equivalent... This would need to be investigated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is missing an update from the tf_ngmix branch, I changed it by first applying e1e2_to_g1g2() from ngmix before comparison.

def test_tf_ngmix():
"""
This test generates a simple galaxy and measure moments with ngmix, vs.
tf_ngmix.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but those tests are important in my opinion, and they would required a bit more explanation. Like what is a "simple galaxy"? (SNR as well)

@aguinot
Copy link
Collaborator

aguinot commented Nov 5, 2021

In addition to the comment I let on the code, here are some comments on the notebooks:

  • Dataset.ipynb:
    • Why you do not use the function in galaxy.py?
    • Also we should discuss about the tests we want to make because having a fix SNR and a flux that vary is not very realistic. Basically, if the SNR is fixed varying the flux will not change anything in the results I think.
    • In Real Images - I I feel like some galaxies looks "weird", maybe it is juste me.. The noise looks strange as well.
    • In Real Images II I feel like there is some patterns in the noise?
  • metacal_comparison.ipynb:
    • In Test GalFlow Deconv/Reconv are the residual actually small? what is the maximum pixel value in your original image? Because if the flux is 1, like it is in your default, this is actually not that "small" a few percent. But maybe it is expected?
    • In GalFlow vs GalSim same comment ^^^^^
    • I don't understand those plots: reconv w/ same psf noshear
    • In Simple Ellipticity Measurements I don't like what I see here. The value you measure are "far" from the expected one. Maybe this is because of the step size, could you re-try with step=0.01?
    • When you print the calibrated e1/e2 it could be nice to have an idea of the variance to see if the measured residuals are significant or not. If not, then we should increase the number of galaxies (or increase the SNR). To give you an idea in the unit test used to check metacal in the ngmix repo they use 1000 galaxies with SNR=80000.
    • I am still confuse by the discrepancy between R11/R22. I would be curious to see what ngmix gives you on the same sample.
    • In Averaging the response comparison over many galaxies you should fix the plot, we cannot see what is going on. Also, I am very surprise by the value you have (-350), this should not happen given your dataset.
    • The end of your code failed

Some general comments:

  • It is hard to check all the changes, there is a lot of files
  • Some of the files are more tan a month old
  • Some clean up would be nice :)
  • Globally I think it is pretty nice and we are getting closer, it is very nice to see!

'scale' : 0.2,
'stamp_size' : 51,
'psf_fwhm' : 0.9,
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be set as a "global" variable since it is used in all of your functions

@EiffL
Copy link
Member

EiffL commented Nov 6, 2021

I've updated this branch to main.

@andrevitorelli
Copy link
Member Author

This PR should be dropped as #40 covers it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants