Failing to run find-sites #121

Closed
johanneskoester opened this issue Aug 27, 2023 · 10 comments

Comments

@johanneskoester
Contributor

johanneskoester commented Aug 27, 2023

I get the following error:

$ somalier find-sites
...
[somalier] af not found, using 0
[somalier] af not found, using 0
[somalier] af not found, using 0
[somalier] af not found, using 0
[somalier] af not found, using 0
[somalier] af not found, using 0
fatal.nim(49)            sysFatal
Error: unhandled exception: index out of bounds, the container is empty [IndexDefect]

The test file is too large to upload here, but I am happy to send it via gdrive if you need it.

@johanneskoester
Contributor Author

It contains the AF tag, but not for all records. The error happens after many records have been processed.
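For illustration, the per-record fallback behind the `af not found, using 0` message could look roughly like the Python sketch below. This is not somalier's code (somalier is written in Nim); `parse_info` and `allele_frequency` are hypothetical helpers, and the sketch assumes the standard semicolon-separated `KEY=value` layout of the VCF INFO column.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO column into a dict; flag-style entries map to True."""
    fields = {}
    if info in ("", "."):
        return fields
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


def allele_frequency(info: str, af_field: str = "AF") -> float:
    """Return the first value of af_field, or 0.0 when it is absent or unusable."""
    value = parse_info(info).get(af_field)
    if value is None or value is True:
        return 0.0  # assumption: same fallback as "af not found, using 0"
    first = value.split(",")[0]  # multi-allelic records list one value per ALT
    try:
        return float(first)
    except ValueError:
        return 0.0  # covers the missing-value marker "."
```

Records without the tag simply get a frequency of 0 here, which is why a partially annotated file produces the message many times without failing on that alone.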

@brentp
Owner

brentp commented Aug 28, 2023

Hi @johanneskoester, would you run with the attached debug binary and let me know the error?
somalier_debug.gz
find-sites is less widely used so you're likely hitting something I haven't considered.

Do note that there is a --min-AN argument which defaults to 115000. You may need to lower that if you have a smaller cohort.

@johanneskoester
Contributor Author

/home/brentp/src/somalier/src/somalier.nim(276) somalier
/home/brentp/src/somalier/src/somalier.nim(263) main
/home/brentp/src/somalier/src/somalierpkg/findsites.nim(162) findsites_main
/nim-1.6.6/lib/system/fatal.nim(53) sysFatal
Error: unhandled exception: index out of bounds, the container is empty [IndexDefect]

@johanneskoester
Contributor Author

johanneskoester commented Aug 28, 2023

The --min-AN option has no influence, but looking at the help, maybe my input VCF does not satisfy the requirements. It has the AF field but, for example, no samples. It is the "known variation VCF" from ensembl.org (https://ftp.ensembl.org/pub/release-110/variation/vcf/homo_sapiens/; I merged the individual chromosome files), which I modified with bcftools annotate to rename the MAF field to AF (bcftools annotate -c INFO/AF:=INFO/MAF).

@brentp
Owner

brentp commented Aug 28, 2023

OK, I see the problem; it's a classic :( . I am accessing variant.ALT[0] and you have a variant without an alternate allele. I will add a check for this.
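As a rough illustration of this kind of guard (a Python sketch, not the actual Nim fix in somalier), the idea is to test whether the ALT list is empty before indexing its first element. The helper names `alt_alleles`, `first_alt` and `candidate_records` are hypothetical, and the sketch assumes plain tab-separated VCF text lines in which a record without an alternate allele carries `.` in the ALT column.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def alt_alleles(alt_column: str) -> List[str]:
    """Return the ALT alleles of a VCF record; '.' means there are none."""
    if alt_column in ("", "."):
        return []
    return alt_column.split(",")


def first_alt(alt_column: str) -> Optional[str]:
    """Return the first ALT allele, or None instead of raising on an empty list."""
    alts = alt_alleles(alt_column)
    if not alts:
        return None
    return alts[0]


def candidate_records(lines: Iterable[str]) -> Iterator[Tuple[str, int, str, str]]:
    """Yield (chrom, pos, ref, alt) for records that have at least one ALT allele."""
    for line in lines:
        if line.startswith("#"):
            continue  # header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # malformed or truncated record
        alt = first_alt(cols[4])
        if alt is None:
            continue  # no alternate allele: skip rather than index an empty list
        yield cols[0], int(cols[1]), cols[3], alt
```

Skipping such records is one possible policy; whether they should be skipped or handled differently is a design choice for the tool.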

@brentp
Owner

brentp commented Aug 28, 2023

Here is a debug build with a fix for that if you'd like to try it.
somalier_debug.gz

@brentp
Owner

brentp commented Aug 28, 2023

I will also run it on chr1 from your link and make sure that it works.

@brentp
Owner

brentp commented Aug 28, 2023

I ran:

/somalier_debug find-sites --AF-field MAF homo_sapiens-chr1.vcf.gz --min-AN 0

and saw:

[somalier] af not found, using 0 # many times!!!
121649 candidate variants
sorted and filtered to 14385 autosomal variants. now dropping INFOs and writing
[somalier] wrote 14385 variants to:sites.vcf.gz

So I think that change should resolve your issue. I'll make a new release and try to reduce the number of times we see that message.
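One generic way to cut down a repeated message like this is to cap how often it is printed and report the total once at the end. The Python sketch below shows that pattern only; it is an assumption about a possible approach, not what somalier does, and `LimitedWarner` and its default limit are hypothetical.

```python
import sys


class LimitedWarner:
    """Print a warning at most `limit` times, then one suppression notice."""

    def __init__(self, limit=5, stream=None):
        self.limit = limit
        self.count = 0
        self.stream = stream  # defaults to sys.stderr at call time

    def warn(self, message):
        self.count += 1
        out = self.stream if self.stream is not None else sys.stderr
        if self.count <= self.limit:
            print(message, file=out)
        elif self.count == self.limit + 1:
            print("(further identical messages suppressed)", file=out)

    def summary(self, label):
        """Return a one-line total suitable for printing after the run."""
        return f"{label}: {self.count} occurrence(s)"
```

A caller would create one `LimitedWarner` per message type, call `warn` wherever the message is currently printed, and print `summary` once after all records are processed.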

@brentp closed this as completed in 8431b6f on Aug 28, 2023
@brentp
Owner

brentp commented Aug 28, 2023

This is out in v0.2.18: https://github.com/brentp/somalier/releases/tag/v0.2.18

Thanks for reporting, and let me know if you have any more issues.

@johanneskoester
Contributor Author

Thanks a lot!!! Super quick!
