Clair3 ignores the BED file supplied via --bed_fn #12

Closed
MeHelmy opened this issue May 19, 2021 · 13 comments · Fixed by #14

@MeHelmy
MeHelmy commented May 19, 2021

Calling variants using a BED file:

run_clair3.sh --bam_fn ${HOME}/clair3/HG002.ONT.bam --ref_fn /${HOME}/hs37d5_mainchr.fa --threads 10 --platform ont --model_path ${HOME}/Clair3/modules/ont --output ${HOME}/chr22 --bed_fn ${HOME}/chr22.bed

cat chr22.bed:
22 1 10000000

The issue is that Clair3 begins calling variants for chr1!

I took a look at chr22/run_clair3.log:

Clair3/scripts/clair3.sh --bam_fn ${HOME}/HG002.ONT.bam --ref_fn /users/mmahmoud/home/public_workplace/scripts/snakefiles/test/hs37d5_mainchr.fa --threads 10 --model_path /users/mmahmoud/home/projects/princess/Clair3/modules/ont --platform ont --output ${HOME}/chr22 --bed_fn= --vcf_fn=EMPTY --ctg_name=EMPTY --sample_name=EMPTY --chunk_num=0 --chunk_size=5000000 --samtools=samtools --python=python3 --pypy=pypy3 --parallel=parallel --whatshap=whatshap --qual=0 --var_pct_full=0.3 --ref_pct_full=0.3 --snp_min_af=0.0 --indel_min_af=0.0 --pileup_only=False --gvcf=False --fast_mode=False --print_ref_calls=False --haploid_precise=False --haploid_sensitive=False --include_all_ctgs=False --no_phasing_for_fa=False

As you can see, --bed_fn= is empty.
Additionally, we get this warning: scripts/clair3.sh: line 58: [: =: unary operator expected

Best,
Medhat
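
The "unary operator expected" warning above is what bash prints when an unquoted variable expands to nothing inside [ ... ], which fits --bed_fn= being empty. A minimal sketch of the effect follows. The function names and the EMPTY comparison are illustrative assumptions, not the actual contents of line 58 of scripts/clair3.sh.

```shell
# Hypothetical checks; the names and the EMPTY sentinel are for illustration only.

# Unquoted: an empty $1 disappears, so the test becomes `[ = EMPTY ]` (malformed).
check_unquoted() { [ $1 = "EMPTY" ]; }

# Quoted: an empty $1 stays as an empty string, so the test is valid and false.
check_quoted() { [ "$1" = "EMPTY" ]; }

check_unquoted "" || echo "unquoted: exit status $?"
check_quoted ""   || echo "quoted: exit status $?"
```

The unquoted form reports a syntax error, while the quoted form simply evaluates to false, so quoting the variable removes the warning without changing the intended comparison.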

@fritzsedlazeck

Thanks, guys. We also tried setting the chromosome to 22 directly, but that was ignored as well. Let us know what to do.
Thanks,
Fritz

@aquaskyline
Member

In your command, please use --bed_fn=${HOME}/chr22.bed instead of --bed_fn ${HOME}/chr22.bed. --bed_fn is an optional parameter, and because we use getopt in the bash script, an = is required for all optional parameters, while both a space and an = are allowed for required parameters.

We are updating the help info in run_clair3.sh to make it clear, and also adding some checkpoints.
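
The behaviour described above can be reproduced with getopt alone. This is a minimal sketch that assumes the enhanced getopt from util-linux and assumes that --bed_fn is declared with an optional argument (the :: suffix). The parse_args helper and the path are placeholders, not taken from run_clair3.sh.

```shell
# Requires the enhanced getopt from util-linux. The '::' suffix marks the
# argument of --bed_fn as optional; that declaration is an assumption here.
parse_args() {
    getopt -o '' --long bed_fn:: -- "$@"
}

# Space-separated: --bed_fn receives an empty value and the path is left
# over as a positional argument after the '--' separator.
parse_args --bed_fn /path/to/chr22.bed

# With '=': the path is attached to --bed_fn as its value.
parse_args --bed_fn=/path/to/chr22.bed
```

With the space-separated form, the script sees an empty --bed_fn, which matches the --bed_fn= entry in the log and explains why calling started from chr1.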

@fritzsedlazeck

Thanks. We will give it a try.
Fritz

@aquaskyline aquaskyline added the enhancement New feature or request label May 19, 2021
@MeHelmy
Author

MeHelmy commented May 19, 2021

Thank you,

I used the suggestion and ran into a different error:

[INFO] Check envrionment variables
[INFO] --include_all_ctgs not enabled, use chr{1..22,X,Y} and {1..22,X,Y} by default
[INFO] Call variant in contigs: 22
[INFO] Chunk number for each contig: 11
[INFO] Create folder clair3/chr22/tmp/split_beds
[INFO] 1/7 Calling variants using pileup model
parallel: Error: Command line too long (907 >= 0) at input 0: 22 1 11

real    0m0.916s
user    0m0.209s
sys     0m0.053s
[INFO] Merge chunked contigs vcf files
[INFO] 2/7 Filter Hete SNP varaints for Whatshap phasing and haplotag
[INFO] Select phasing quality cut off 16
parallel: Error: Command line too long (288 >= 0) at input 0: 22

real    0m0.199s
user    0m0.154s
sys     0m0.041s
[INFO] 3/7 Whatshap phase vcf file
parallel: Error: Command line too long (462 >= 0) at input 0: 22

real    0m0.191s
user    0m0.164s
sys     0m0.031s
parallel: Error: Command line too long (108 >= 0) at input 0: 22

Thanks,
Medhat

@aquaskyline
Member

Looking into the problem.

@aquaskyline
Member

Hi Medhat,

Your parallel seems to be working incorrectly. The errors below show that parallel is treating the maximum allowed command-line length as 0 (the ">= 0" part). I tested parallel on my side: parallel --max-line-length-allowed reports 131049, and any command longer than that gives an error like parallel: Error: Command line too long (223235 >= 131049). If you are using conda, could you check the parallel version or reinstall it? We are using parallel=20191122.

parallel: Error: Command line too long (907 >= 0) at input 0: 22 1 11
parallel: Error: Command line too long (288 >= 0) at input 0: 22
parallel: Error: Command line too long (462 >= 0) at input 0: 22
parallel: Error: Command line too long (108 >= 0) at input 0: 22

Thanks,
Laurent

@MeHelmy
Author

MeHelmy commented May 20, 2021

Thank you,
I'm using GNU parallel 20191122; it comes with the conda installation for Clair3.

conda install -c conda-forge parallel=20191122 zstd=1.4.4 -y

Thanks,
Medhat

@aquaskyline
Member

What's the output of running parallel --max-line-length-allowed in your env?

@MeHelmy
Author

MeHelmy commented May 20, 2021

131049

@aquaskyline
Member

Could you try passing the absolute path of your parallel to Clair3 using --parallel=/PATH/to/parallel?
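
One way to find the absolute path to pass, and to see whether more than one parallel is visible on PATH, is sketched below. find_on_path is a hypothetical helper written for this example. The only parallel option used is --max-line-length-allowed, which is quoted earlier in this thread.

```shell
# List every regular, executable file named "$1" on PATH, in lookup order.
find_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -n "$dir" ] && [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

# Print each candidate next to the line-length limit it reports.
find_on_path parallel | while IFS= read -r p; do
    printf '%s\t%s\n' "$p" "$("$p" --max-line-length-allowed 2>/dev/null </dev/null)"
done
```

If several paths are listed, the first one is what a bare "parallel" resolves to in that shell, which may differ from the one inside the conda environment.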

@MeHelmy
Author

MeHelmy commented May 20, 2021

run_clair3.sh --bam_fn ${HOME}/clair3/HG002.ONT.bam --ref_fn ${HOME}/hs37d5_mainchr.fa --threads 10 --platform ont --model_path ${HOME}/Clair3/modules/ont --output ${HOME}/clair3/chr22 --bed_fn=${HOME}/clair3/chr22.bed --parallel=/users/.../envs/clair3/bin/parallel

The bed file is:
22 1 10000000

I got this error:

[INFO] Check envrionment variables
[INFO] --include_all_ctgs not enabled, use chr{1..22,X,Y} and {1..22,X,Y} by default
[INFO] Call variant in contigs: 22
[INFO] Chunk number for each contig: 11
[INFO] Create folder /users/.../clair3/chr22/log
[INFO] Create folder /users/.../clair3/chr22/tmp
[INFO] Create folder /users/.../clair3/chr22/tmp/split_beds
[INFO] Create folder /users/.../clair3/chr22/tmp/pileup_output
[INFO] Create folder /users/.../clair3/chr22/tmp/merge_output
[INFO] Create folder /users/.../clair3/chr22/tmp/phase_output
[INFO] Create folder /users/.../clair3/chr22/tmp/gvcf_tmp_output
[INFO] Create folder /users/.../clair3/chr22/tmp/full_alignment_output
[INFO] Create folder /users/.../clair3/chr22/tmp/phase_output/phase_vcf
[INFO] Create folder /users/.../clair3/chr22/tmp/phase_output/phase_bam
[INFO] Create folder /users/.../clair3/chr22/tmp/full_alignment_output/candidate_bed
[INFO] 1/7 Calling variants using pileup model
[INFO] Delay 0 seconds before starting variant calling ...
[bed_read] Parse error reading "/users/.../chr22/tmp/split_beds/22" at line 1 : end (10000033) must not be less than start (18446744073709551584)
samtools mpileup: Could not read file "/users/.../chr22/tmp/split_beds/22"
[INFO] Delay 0 seconds before starting variant calling ...

real	0m15.696s
user	0m17.529s
sys	0m5.944s
[INFO] Merge chunked contigs vcf files
cat: /users/.../clair3/chr22/tmp/pileup_output/pileup_*.vcf: No such file or directory
[bgzip] No such file or directory: /users/.../clair3/chr22/pileup.vcf

Thanks,
Medhat

@aquaskyline
Member

Confirmed to be a bug in Clair3; we will fix it in the next release. As a quick workaround, you could change 22 1 10000000 to 22 34 10000000 in the BED file to proceed with the further steps.
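
The numbers in the [bed_read] error above are consistent with the interval being padded by 33 bases on each side without a lower bound: 10000000 + 33 = 10000033, and 1 - 33 = -32, which reads as 18446744073709551584 (2^64 - 32) when interpreted as an unsigned 64-bit integer. That would also explain why a start of 34 works. A minimal sketch of padding with a clamp at 0 follows. The 33-base flank is inferred from the log, and pad_bed is a hypothetical helper, not the actual fix in Clair3.

```shell
# Pad each BED interval read from stdin by "$1" bases on both sides,
# clamping the start at 0 so it can never become negative.
# Assumption: the flank size (33) is inferred from the error message above.
pad_bed() {
    flank=$1
    while read -r chrom start end _rest; do
        [ -n "$chrom" ] || continue            # skip blank lines
        start=$((start - flank))
        [ "$start" -ge 0 ] || start=0          # clamp instead of underflowing
        printf '%s\t%s\t%s\n' "$chrom" "$start" "$((end + flank))"
    done
}

printf '22 1 10000000\n' | pad_bed 33
```

With the clamp in place, the original BED line no longer needs to be edited by hand, since a start of 1 is padded down to 0 rather than to a negative value.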

@aquaskyline
Member

"Out of range" problem fixed in v0.1-r2. Additional boundary checks are also added in the update.

@aquaskyline aquaskyline linked a pull request May 23, 2021 that will close this issue