Request for comments: should PEDIGREE VCF header lines be required to have an ID field? #96
Comments
I think we may be able to accomplish the change in a way that is backward compatible (or at least more so). Specifically, we could change the pedigree tag to look like this:
a) To indicate a clonal relationship (e.g. between tumor and germline sample)
[or, to specify two samples derived from the same germline]
b) To specify a family relationship for diploids, like humans:
c) And in the arbitrary case, would become:
We should be able to describe any relationship and should probably make clear that, to fully specify the pedigree, the VCF can define relationships between IDs that are not present in the VCF (but are ancestors of others that are).
So, in short, the basic idea is to replace Name_0 (typically, Child or Derived) in the original definition with ID. Since each Child or Derived sample ID should be unique, this should be seamless.
Goncalo
@abecasis I like this, +1 from me
Plus minor change of wording in ALT * description, "overlapping" rather than "upstream" deletion.
As there were no other comments, I made the change and will close the issue now.
Sorry to comment on an old closed issue. From my reading of the specification, the pedigree meta lines record "relationships between genomes", and as there is "a distinction between sample and genome" supported by the sample meta line, I understood the IDs used in pedigree meta lines to refer to genomes:
I.e., this change still refers to genomes in the text, but the examples appear to use sample IDs instead. Also, if "the VCF can define relationships between IDs that are not present in the VCF", that should be mentioned in the specification.
When the input is VCF 4.2, this allows the `Child` or `Derived` field to act as the record ID in the value collection. See samtools/hts-specs#96 for the reasoning behind this definition. Closes #201 and closes #202.
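For readers who want to see the fallback concretely, here is a minimal Python sketch of the idea. It is not the code from the referencing commit: `parse_pedigree` and `record_id` are hypothetical helpers, the sample names are made up, and the parser assumes simple unquoted values with no embedded commas.

```python
def parse_pedigree(line: str) -> dict:
    """Parse a ##PEDIGREE meta-line into a dict of key/value pairs.

    Assumption: values are unquoted and contain no commas.
    """
    prefix = "##PEDIGREE=<"
    if not (line.startswith(prefix) and line.endswith(">")):
        raise ValueError(f"not a PEDIGREE line: {line!r}")
    body = line[len(prefix):-1]
    fields = {}
    for item in body.split(","):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def record_id(fields: dict) -> str:
    """Pick the value that identifies this pedigree record.

    Prefer an explicit ID; otherwise fall back to the VCF 4.2 style
    Child or Derived key, which plays the same role.
    """
    for key in ("ID", "Child", "Derived"):
        if key in fields:
            return fields[key]
    raise KeyError("no ID, Child, or Derived field")


# Hypothetical header lines in the old and new styles.
old_family = parse_pedigree("##PEDIGREE=<Child=C1,Father=F1,Mother=M1>")
old_clonal = parse_pedigree("##PEDIGREE=<Derived=T1,Original=G1>")
new_style = parse_pedigree("##PEDIGREE=<ID=S1,Original=G1>")
```

With these inputs, `record_id` returns `C1`, `T1`, and `S1` respectively, so both old and new style lines can be keyed the same way.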
The proposed VCF 4.3 spec (#88) mandates that "All structured lines that have their value enclosed within “<>” require an ID which must be unique within their type." This new requirement necessitates a change to ##PEDIGREE header lines, which now require an ID. E.g.,
Since this breaks backwards compatibility for the sake of consistency within the spec, are there any objections to making this change?
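To make the quoted requirement concrete, the following Python sketch checks a list of header lines for structured lines that lack an ID or reuse one within the same type. It is only an illustration under stated assumptions: `check_ids` is a hypothetical helper, the header lines are invented, and the regular expressions do not handle quoted values that themselves contain `,ID=`.

```python
import re
from collections import defaultdict

# Structured meta-lines look like ##TYPE=<key=value,...>.
STRUCTURED = re.compile(r"^##([^=]+)=<(.*)>$")
# Simplification: find an ID key at the start of the body or after a comma.
ID_FIELD = re.compile(r"(?:^|,)ID=([^,]+)")


def check_ids(header_lines):
    """Return a list of problems: missing or duplicated IDs per line type."""
    seen = defaultdict(set)
    problems = []
    for line in header_lines:
        m = STRUCTURED.match(line)
        if m is None:
            continue  # unstructured line such as ##fileformat=VCFv4.3
        line_type, body = m.groups()
        id_match = ID_FIELD.search(body)
        if id_match is None:
            problems.append(f"{line_type}: missing ID in {line}")
            continue
        ident = id_match.group(1)
        if ident in seen[line_type]:
            problems.append(f"{line_type}: duplicate ID {ident}")
        seen[line_type].add(ident)
    return problems


# Hypothetical header: one old style PEDIGREE line and one repeated ID.
header = [
    "##fileformat=VCFv4.3",
    "##PEDIGREE=<ID=C1,Father=F1,Mother=M1>",
    "##PEDIGREE=<Child=C2,Father=F1,Mother=M1>",
    "##PEDIGREE=<ID=C1,Original=G1>",
    '##INFO=<ID=C1,Number=1,Type=Integer,Description="x">',
]
problems = check_ids(header)
```

Here `problems` contains two entries: the `Child=C2` line is flagged as missing an ID, and the second `ID=C1` PEDIGREE line is flagged as a duplicate. The INFO line is not flagged, because uniqueness is only required within a type.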