feat: Simphenotype and Index Repeat Support #209
It looks great! Thanks for taking all of this on, @mlamkin7.
I think this will be super useful for me and for others in the lab! I can't wait to use it for happler.
Most of my comments are about refactoring things a bit so that it's easier to add new effects to simphenotype in the future, but we don't necessarily need to do all of that now.
Also, do we have a test to check whether things will still work if you specify a mix of repeat and haplotype IDs via simphenotype's --ids parameter?
We've updated the simphenotype and index subcommands to support a new line type in the hap file: "R", which stands for repeats.
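As a rough illustration of what a new line type means for anything that reads the file, the sketch below groups the lines of a hap file by their first field. It is not the haptools parser: the function name `group_hap_lines`, the sample lines, and the field layout shown for "H" and "R" lines (type, chromosome, start, end, ID) are assumptions made for this example.

```python
from collections import defaultdict


def group_hap_lines(lines):
    """Group tab-separated hap lines by their line-type character (hypothetical helper)."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and '#' metadata/comment lines
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # fields[0] is the line type, e.g. "H", "V", or the new "R"
        groups[fields[0]].append(fields[1:])
    return dict(groups)


# Made-up sample content; NOT taken from tests/data/basic.hap.gz
sample = [
    "#\tversion\t0.2.0",
    "H\t1\t100\t200\thap1",
    "R\t1\t300\t350\trepeat1",
    "V\thap1\t100\t101\tsnp1\tA",
]
groups = group_hap_lines(sample)
# Assumes the ID is the last field of an "R" line
repeat_ids = [fields[-1] for fields in groups.get("R", [])]
```

Keeping the line type as the grouping key means a reader written this way would pick up "R" lines without any change to how "H" and "V" lines are handled.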
Usage in a sorted hap file (tests/data/basic.hap.gz):
Along with these changes, simphenotype's PhenoSimulator class has been updated, particularly its run() function. Instead of taking a list of haplotypes, run() now takes the full Haplotypes object along with the IDs of the haplotypes and repeats from which to extract betas and genotypes.
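The shape of that change can be sketched with toy stand-ins. Everything here is hypothetical: `ToyHaplotypes`, the standalone `run` function, and its argument names are not the real haptools classes or signatures, and the sketch leaves out noise and heritability entirely. It only shows a function that receives the whole collection plus a mixed list of haplotype and repeat IDs and looks up the beta for each one.

```python
class ToyHaplotypes:
    """Hypothetical stand-in for a Haplotypes object: maps ID -> (line type, beta)."""

    def __init__(self, data):
        self.data = data


def run(haps, genotypes, ids):
    """Sum beta * genotype over the requested haplotype and repeat IDs.

    haps: ToyHaplotypes holding every effect read from the file
    genotypes: dict mapping ID -> list of per-sample genotype values
    ids: IDs to include; haplotype and repeat IDs may be mixed
    """
    n_samples = len(next(iter(genotypes.values())))
    pheno = [0.0] * n_samples
    for effect_id in ids:
        # The line type is carried along but not needed for the sum itself
        _kind, beta = haps.data[effect_id]
        for i, gt in enumerate(genotypes[effect_id]):
            pheno[i] += beta * gt
    return pheno


haps = ToyHaplotypes({"hap1": ("H", 0.5), "repeat1": ("R", 0.25)})
genotypes = {"hap1": [0, 1, 2], "repeat1": [4, 4, 8]}
pheno = run(haps, genotypes, ["hap1", "repeat1"])
```

Passing the full object and a list of IDs, rather than a pre-filtered list of haplotypes, lets a single call cover haplotypes and repeats together.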
To use repeats in simphenotype, use the additional --repeats option. Example:
Note that in the example, SNPs must still be present, so we cannot simulate based on repeats alone.