preserve decimal point in float INFO fields #980

Open
pontikos opened this issue Mar 12, 2019 · 5 comments

Comments

@pontikos

INFO fields of type Float should keep a decimal point even if the fractional part is all zeroes,
i.e. 70.0 instead of 70.
Writing the value as an integer breaks GATK.

@jkbonfield
Contributor

This has come up before, although I'm struggling to find the issue. Maybe it was over in htsjdk land.

Anyway, this is a parsing bug in GATK, not in bcftools output. Floating point numbers are a superset of integers. "70" is still a valid floating point number and C "atof" and "strtod" functions quite happily accept whole numbers.

While I guess we could change all floating point numbers to include .0 if they are whole numbers, it needlessly wastes space and isn't the correct solution.
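
To illustrate the point that a whole number is still a valid floating point literal, here is a minimal sketch. It uses Python's built-in `float()` purely as a stand-in for a strtod-style parser; it is not what GATK, htsjdk or htslib call.

```python
# Whole-number text parses to the same float as the ".0" spelling.
for text in ("70", "70.0", "7e1"):
    print(text, "->", float(text))

# All three spellings denote the same value.
same_value = float("70") == float("70.0") == float("7e1")
print(same_value)
```

A parser that rejects "70" for a Float field is therefore stricter than the usual number-parsing routines.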

@pontikos
Author

Ok I've posted on GATK github:

broadinstitute/gatk#5789

I agree that it seems silly that GATK falls over when a decimal point is missing for a float.

I hope htsjdk (assuming that's what GATK are using) and htslib can agree on this.

@pd3
Member

pd3 commented Mar 13, 2019

Yes, this is a silly bug in GATK and we will not address this in bcftools / htslib. As a workaround, you can "fix" the numbers to GATK's liking using this script https://github.com/samtools/bcftools/blob/develop/misc/fix-broken-GATK-Double-vs-Integer
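
The linked script is the route to use in practice. Purely to illustrate the idea behind such a workaround, here is a rough sketch in Python. It is not the script above and makes no claim about how that script works; the function names and the example header and record are made up. It assumes uncompressed, tab-separated VCF text and only touches INFO keys that the header declares as Type=Float.

```python
import re

# Matches header lines such as ##INFO=<ID=MQ,Number=1,Type=Float,...>
INFO_FLOAT = re.compile(r"^##INFO=<ID=([^,]+),.*\bType=Float\b")
# A bare (optionally negative) run of digits, i.e. a float printed without ".0"
WHOLE = re.compile(r"^-?\d+$")


def float_info_ids(header_lines):
    """Collect the IDs of INFO fields declared as Type=Float."""
    ids = set()
    for line in header_lines:
        m = INFO_FLOAT.match(line)
        if m:
            ids.add(m.group(1))
    return ids


def add_decimal_point(value):
    """Turn '70' into '70.0'; leave '70.5', '1e-05', '.' and 'nan' untouched."""
    return value + ".0" if WHOLE.match(value) else value


def fix_record(line, float_ids):
    """Rewrite the INFO column (8th column) of one VCF record."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:
        return line.rstrip("\n")
    fixed = []
    for entry in cols[7].split(";"):
        key, sep, val = entry.partition("=")
        if sep and key in float_ids:
            # Fields may hold comma-separated lists of values
            val = ",".join(add_decimal_point(v) for v in val.split(","))
        fixed.append(key + sep + val)
    cols[7] = ";".join(fixed)
    return "\t".join(cols)


# Hypothetical header and record, for illustration only
header = ['##INFO=<ID=MQ,Number=1,Type=Float,Description="hypothetical">',
          '##INFO=<ID=DP,Number=1,Type=Integer,Description="hypothetical">']
record = "1\t100\t.\tA\tG\t50\tPASS\tMQ=70;DP=12"
print(fix_record(record, float_info_ids(header)))
```

Here MQ=70 becomes MQ=70.0 while the Integer field DP=12 is left alone.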

@pontikos
Author

pontikos commented Mar 13, 2019 via email

@jkbonfield
Contributor

jkbonfield commented Mar 13, 2019

It's probably kputd in kstring.c. It uses %g to print floats that are very large or very small, and otherwise emulates the printf %g format itself. The z[-1] = 0 line MAY be responsible, and the trailing zero removal will probably need some editing too, but you'll need to experiment. Note though that this just follows the normal printing mechanism. E.g. try printf on the command line:

jkb$ printf "%g\n" 0.170
0.17
jkb$ printf "%g\n" 1.70
1.7
jkb$ printf "%g\n" 17.0
17

"17", not "17.0"!
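
The same behaviour can be reproduced, and patched over, outside the shell. The sketch below uses Python's `%` operator, which follows the C printf conventions for `%g`. It is not how kputd is implemented; `fmt_g` and `fmt_g_keep_point` are hypothetical helper names showing one way a ".0" could be re-attached.

```python
def fmt_g(x):
    """Format like printf("%g"): 6 significant digits, trailing zeros dropped."""
    return "%g" % x


def fmt_g_keep_point(x):
    """Same as fmt_g, but append '.0' when the result looks like an integer."""
    s = fmt_g(x)
    # Exponent forms (1e+20) and inf/nan are left alone;
    # only bare digit strings such as "17" or "-3" are changed.
    if s.lstrip("-").isdigit():
        s += ".0"
    return s


for value in (0.170, 1.70, 17.0, 70.0, 1e20):
    print(fmt_g(value), fmt_g_keep_point(value))
```

The extra two bytes per whole-number value are the space cost mentioned earlier in the thread.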
