Refactor TF Dataset #46

Open
5 tasks done
andrevitorelli opened this issue Mar 22, 2022 · 7 comments

Comments

@andrevitorelli
Member

andrevitorelli commented Mar 22, 2022

After talking with @EiffL and @b-remy, we decided to converge on a more usable format for our tfds.

We will have a simple galaxy generator dataset and one based on parametric COSMOS galaxies drawn as CFIS-like stamp images.

This issue is for discussing the design choices and tracking this development.

Issues/todo:

  • sample COSMOS realistic images (gal_type='parametric')
  • rescale the flux for CFIS-like images
  • should we apply a threshold to remove galaxies that are too faint?
  • do we want to add random knots for realistic images?
  • After the CFIS magnitude cut we may want to augment our dataset. What kind of augmentation: ellipticities only, or also size, SNR, etc.?
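
To make the flux rescaling and faintness threshold items above concrete, here is a minimal sketch in plain NumPy. It assumes the usual magnitude/zero-point relation; the function names, the zero point, the exposure time and the magnitude limit are all hypothetical placeholders, not values or code from this project.

```python
import numpy as np


def mag_to_counts(mag, zero_point, exposure_time=1.0):
    """Expected total counts for a magnitude.

    Assumes `zero_point` is the magnitude that yields 1 count per second.
    """
    mag = np.asarray(mag, dtype=float)
    return exposure_time * 10.0 ** (-0.4 * (mag - zero_point))


def rescale_stamp(stamp, target_counts):
    """Rescale a noiseless stamp so that its pixel sum equals `target_counts`."""
    total = stamp.sum()
    if total <= 0:
        raise ValueError("stamp must have a positive total flux")
    return stamp * (target_counts / total)


def bright_enough(mags, mag_limit):
    """Boolean mask selecting galaxies no fainter than `mag_limit`."""
    return np.asarray(mags, dtype=float) <= mag_limit


# Placeholder numbers, for illustration only.
ZERO_POINT = 30.0   # hypothetical zero point
MAG_LIMIT = 25.0    # hypothetical (deliberately loose) faintness limit

mags = np.array([24.0, 26.0])
mask = bright_enough(mags, MAG_LIMIT)
stamp = rescale_stamp(np.ones((4, 4)), mag_to_counts(mags[0], ZERO_POINT))
```

If fainter galaxies need to stay in the sample so that selection cuts can be applied later, the limit would simply be set fainter than the survey cut.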

What do you think about these open questions, @EiffL ?

andrevitorelli added the enhancement (New feature or request) label on Mar 22, 2022
@EiffL
Member

EiffL commented Mar 23, 2022

Yep :-) but I'm a bit confused, I thought we had already done all of that. I remember discussions like here: #39

@EiffL
Member

EiffL commented Mar 23, 2022

Otherwise I want to say no to knots for now, and we will need to keep fairly faint galaxies in order to apply the selection cuts.

@EiffL
Member

EiffL commented Mar 23, 2022

And the only form of augmentation we can make is generating the noise on the fly instead of hardcoding it in the dataset.

@andrevitorelli
Member Author

> Yep :-) but I'm a bit confused, I thought we had already done all of that. I remember discussions like here: #39

We did, but the current state of the module is still not satisfactory.

> And the only form of augmentation we can make is generating the noise on the fly instead of hardcoding it in the dataset.

This is kind of a problem. The COSMOS dataset is small, isn't it? We need more than 80k galaxy models.

@EiffL
Member

EiffL commented Mar 23, 2022

No, it's not really a problem: you can apply random rotations to the galaxies and generate independent noise realisations. At least for training a NN, that's more than enough.
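
As a rough illustration of the rotation-plus-noise augmentation described above, the sketch below uses plain NumPy. It is only a sketch under stated assumptions: rotations are restricted to multiples of 90 degrees so that no interpolation is needed, the noise is Gaussian with a fixed standard deviation, and the function name `augment` and its arguments are hypothetical. Because ellipticity is a spin-2 quantity, a rotation by k × 90° multiplies (e1, e2) by (-1)^k, so the labels are updated together with the image.

```python
import numpy as np


def augment(stamp, e1, e2, noise_sigma, rng):
    """Return a randomly rotated, freshly noised copy of a noiseless stamp.

    Assumptions (not from the original thread): square stamps, rotations by
    multiples of 90 degrees only, and uncorrelated Gaussian pixel noise.
    """
    k = int(rng.integers(0, 4))          # number of quarter turns
    rotated = np.rot90(stamp, k)
    # Spin-2 transformation: e -> e * exp(2i * k * pi/2) = (-1)^k * e
    sign = -1.0 if k % 2 else 1.0
    noise = rng.normal(0.0, noise_sigma, size=rotated.shape)
    return rotated + noise, sign * e1, sign * e2


rng = np.random.default_rng(0)
clean = np.arange(16.0).reshape(4, 4)

# Each call draws a new rotation and an independent noise realisation.
noisy_a, e1_a, e2_a = augment(clean, 0.1, -0.2, noise_sigma=1.0, rng=rng)
noisy_b, e1_b, e2_b = augment(clean, 0.1, -0.2, noise_sigma=1.0, rng=rng)
```

Since nothing is stored, the same function could be applied per example at load time, so every epoch sees a different rotation and noise draw of the same underlying galaxy.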

@andrevitorelli
Member Author

Ok, that's kind of the data aug I was talking about.

@EiffL
Member

EiffL commented Mar 23, 2022

Yep, but this can most likely be done on the fly.
