Use SAM headers from index directory's <prefix>.hdr if it exists #348
base: master
What if it just uses a .dict file by default when present? While .hdr is more flexible, .dict files are already in use.
The contents (a bunch of SAM headers) would be much the same, just the filename would be different. I considered using the existing
(You can always use
Are there types of lines that should be disallowed? In (1), it looks like the intention is that the file should live statically alongside the other bwa files. So RG, and perhaps even PG, should not be static. The reason is obvious for RG, but PG being in the file would be misleading if we use a newer version of bwa with older, compatible index folders. Given that .dict files are found a lot in the wild and can be generated by multiple tools, how about using them if the .hdr file does not exist? I find having yet another auxiliary file to maintain, when .dict already exists, confusing and cumbersome. If folks want a custom header, they can just specify it, similar to your .dict example, with the -H option. And is this file generated by bwa and then hand-modified to add additional lines, or even to add AH fields to SQ lines? If so, should it really be codified as a bwa-generated static file? I don’t think so.
This file is a dict file, just with a slightly different purpose, so the different name adds some flexibility. But if @lh3 prefers, I would add the code to strip a suffix and add For the time being, maybe let's stop quibbling about the filename so that we can hear the BWA maintainer's thoughts about the proposed feature 😄
@jmarshall open to whatever works here
Introduce a <prefix>.hdr file alongside the other index files, useful for setting up detailed `@SQ` headers etc to be used by default when mapping.

Output SAM headers from: -H options; .hdr file stored with the index files; the basic hard-coded `@HD`/`@SQ` default headers.

An `@HD` header is always printed, with -H overriding one from <prefix>.hdr (if any), which overrides the default one. A set of `@SQ` headers is always printed, with -H options overriding <prefix>.hdr overriding the default, similarly. Other headers specified in either -H or <prefix>.hdr are all output.

If -H includes `@SQ` headers, we consider that the user has carefully set up all headers to be output and ignore <prefix>.hdr entirely.

Add descriptions of <prefix>.alt (brief) and <prefix>.hdr (complete) to the manual page.

This <prefix>.hdr file is implemented for mem, bwase, and bwape. It is not implemented for bwasw, which is deprecated.
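The precedence rules in this commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the C code from the PR: the function names, the output ordering (`@HD`, then `@SQ`, then all other lines), and the order in which the "other" lines from the two sources are emitted are assumptions made for the example.

```python
def record_type(line):
    """Return the header record type, e.g. '@SQ', from a tab-separated SAM header line."""
    return line.split("\t", 1)[0]


def merge_headers(cli_lines, hdr_file_lines, default_lines):
    """Combine header lines from -H, <prefix>.hdr, and the hard-coded defaults.

    Hypothetical helper: all three arguments are lists of header lines
    without trailing newlines.
    """
    cli_lines = list(cli_lines)
    hdr_file_lines = list(hdr_file_lines)

    # If -H supplies any @SQ lines, <prefix>.hdr is ignored entirely.
    if any(record_type(line) == "@SQ" for line in cli_lines):
        hdr_file_lines = []

    def first_source_with(kind):
        # Highest-priority source that has at least one line of this kind wins.
        for source in (cli_lines, hdr_file_lines, default_lines):
            chosen = [line for line in source if record_type(line) == kind]
            if chosen:
                return chosen
        return []

    hd = first_source_with("@HD")[:1]   # exactly one @HD line is printed
    sq = first_source_with("@SQ")       # one complete set of @SQ lines
    # All other header types from both sources are kept (ordering is an assumption).
    others = [line for line in hdr_file_lines + cli_lines
              if record_type(line) not in ("@HD", "@SQ")]
    return hd + sq + others
```

For example, passing an `@RG` line as the only `-H` line keeps the `@HD` and `@SQ` lines from `<prefix>.hdr` and appends the `@RG` line, whereas passing any `@SQ` line via `-H` drops everything that came from `<prefix>.hdr`.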
Now outputs SAM headers from: -H options; .hdr file stored with the index files if present, or otherwise a .dict file if that is present; the basic hard-coded `@HD`/`@SQ` default headers.
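A minimal sketch of the fallback described in this commit message, written in Python for illustration rather than taken from the PR's C code. The function names are hypothetical, and the exact `.dict` filename that is tried is an assumption: the sketch simply appends `.dict` to the index prefix, which may not match what the PR does.

```python
from pathlib import Path


def find_header_file(prefix):
    """Return the first existing header file for an index prefix, or None.

    <prefix>.hdr takes priority; a .dict file is only used when no .hdr exists.
    Appending ".dict" to the prefix is an assumption made for this sketch.
    """
    for suffix in (".hdr", ".dict"):
        candidate = Path(str(prefix) + suffix)
        if candidate.is_file():
            return candidate
    return None  # caller falls back to the hard-coded @HD/@SQ defaults


def read_header_lines(path):
    """Read SAM header lines (those starting with '@') from a file, dropping newlines."""
    with open(path) as fh:
        return [line.rstrip("\n") for line in fh if line.startswith("@")]
```

When neither file exists, `find_header_file` returns `None`, which corresponds to the last item in the list above: only the hard-coded default headers are available.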
@nh13: I've updated it to accept either:— hence it reads headers from The string munching required to get from |
Introduce a `<prefix>.hdr` file that the sysadmin or user should place alongside the other index files, useful for setting up detailed `@SQ` headers etc to be used by default when mapping.

The idea is to make it easy to have rich `@SQ` headers (and other headers if desired) in bwa mapped output. Rather than needing to use `-H` every time, you can set up a `<prefix>.hdr` file at the same time as you generate the bwa index. In particular, this can be used to add descriptive fields such as MD5, species, assembly, and alternative names (which enables querying for e.g. `chr1` or `1` interchangeably; see samtools/hts-specs#100 and samtools/htslib#931).

This PR adds to the `mem`/`samse`/`sampe` commands so that they output SAM headers from: `-H` options; `.hdr` file stored with the index files; `@HD`/`@SQ` default headers.

An `@HD` header is always printed, with an `@HD` line from `-H` overriding one from `<prefix>.hdr` (if any), which overrides the default one added since #336. A set of `@SQ` headers is always printed, with `-H` options overriding `<prefix>.hdr` overriding the default, similarly.

Other headers specified in either `-H` or `<prefix>.hdr` are all printed.

If `-H` includes `@SQ` headers, we consider that the user has carefully set up all the headers to be output and ignore `<prefix>.hdr` entirely.

Also adds a brief description of `<prefix>.alt` and a full description of `<prefix>.hdr` to the manual page.
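As an illustration of the kind of rich `@SQ` line the description has in mind, here is a small Python sketch that builds one line with MD5, assembly, species, and alternative-name fields. The helper name and its parameters are hypothetical; the tags used (`M5`, `AS`, `SP`, `AN`) are the standard SAM `@SQ` tags, and the checksum is computed over the uppercased sequence with whitespace removed, as the SAM specification describes for `M5`. How the resulting lines get written to `<prefix>.hdr` is left to the user.

```python
import hashlib


def sq_line(name, sequence, species=None, assembly=None, alt_names=()):
    """Build one tab-separated @SQ header line for a reference sequence.

    Hypothetical helper for illustration; `sequence` may contain line breaks.
    """
    seq = "".join(sequence.split()).upper()  # drop whitespace, uppercase
    fields = [
        "@SQ",
        f"SN:{name}",
        f"LN:{len(seq)}",
        "M5:" + hashlib.md5(seq.encode("ascii")).hexdigest(),
    ]
    if assembly:
        fields.append(f"AS:{assembly}")
    if species:
        fields.append(f"SP:{species}")
    if alt_names:
        # AN is a comma-separated list of alternative names, e.g. "1" for "chr1".
        fields.append("AN:" + ",".join(alt_names))
    return "\t".join(fields)
```

Calling `sq_line("chr1", seq, species="Homo sapiens", assembly="GRCh38", alt_names=["1"])` for each reference sequence and writing the results, one per line, to `<prefix>.hdr` would give downstream tools the alternative name `1` for `chr1`.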