Skip to content
This repository has been archived by the owner on Oct 28, 2022. It is now read-only.

Added quality and filter fields to GAVariant. #220

Closed
wants to merge 2 commits into from

Conversation

johntbates
Copy link

#219 pointed out we were missing two fields that were present in the VCF specification.

@richarddurbin
Copy link
Contributor

Yes, very good to add these.
I wonder whether filter should be renamed to filtersFailed, and an empty filtersFailed list should imply PASS.
Also, I think that the description for quality should be tightened up. I suggest to add something like:

Standardly, the value is -10 log_10 p, where p is the probability of the Variant being a real genetic variant given the data, under a probablistic model for the data given the variant and a population genetic model for genetic variation.

@haussler
Copy link

+1

On Mon, Jan 12, 2015 at 10:53 AM, Richard Durbin notifications@github.com
wrote:

Yes, very good to add these.
I wonder whether filter should be renamed to filtersFailed, and an empty
filtersFailed list should imply PASS.
Also, I think that the description for quality should be tightened up. I
suggest to add something like:

Standardly, the value is -10 log_10 p, where p is the probability of the
Variant being a real genetic variant given the data, under a probablistic
model for the data given the variant and a population genetic model for
genetic variation.


Reply to this email directly or view it on GitHub
#220 (comment).

@johntbates
Copy link
Author

I generally agree, but according to the VCF 4.2 specification, a single value of "PASS" indicates that all filters have passed - see https://samtools.github.io/hts-specs/VCFv4.2.pdf sec 1.4.1 Fixed fields, heading 7.

I'll go ahead and update the quality field comment.

@selewis
Copy link

selewis commented Jan 12, 2015

+1

On Mon, Jan 12, 2015 at 10:55 AM, haussler notifications@github.com wrote:

+1

On Mon, Jan 12, 2015 at 10:53 AM, Richard Durbin notifications@github.com

wrote:

Yes, very good to add these.
I wonder whether filter should be renamed to filtersFailed, and an empty
filtersFailed list should imply PASS.
Also, I think that the description for quality should be tightened up. I
suggest to add something like:

Standardly, the value is -10 log_10 p, where p is the probability of the
Variant being a real genetic variant given the data, under a
probablistic
model for the data given the variant and a population genetic model for
genetic variation.


Reply to this email directly or view it on GitHub
#220 (comment).


Reply to this email directly or view it on GitHub
#220 (comment).

@calbach
Copy link
Contributor

calbach commented Jan 12, 2015

+1

@richarddurbin's proposal for filtersFailed is much much cleaner than the existing semantics (sentinel value of single element array containing magic "PASS" constant). But perhaps it makes sense to keep parity with VCF for now, since I suspect all current 0.5.1 implementations use VCF-semantics, and work this change into a future version.

Perhaps there could be a follow-up PR which switches the name and semantics and the discussion can be separated?

@johntbates
Copy link
Author

@calbach I agree that the @richarddurbin's proposal is cleaner than the VCF-parity version, but as far as I know, all existing implementations report the sentinel value, if they report filter values at all.

@adamnovak
Copy link
Contributor

I would be in favor of dropping the PASS. The translation back and forth to
VCF is very clear, and we don't need to support grepping over all the
records for PASSing ones, or have any potential problems with empty fields.

On Mon, Jan 12, 2015 at 2:34 PM, John Bates notifications@github.com
wrote:

@calbach https://github.com/calbach I agree that the @richarddurbin
https://github.com/richarddurbin's proposal is cleaner than the
VCF-parity version, but as far as I know, all existing implementations
report the sentinel value, if they report filter values at all.


Reply to this email directly or view it on GitHub
#220 (comment).

@jeromekelleher
Copy link
Contributor

I think @richarddurbin's filtersFailed is much better, and trivial to derive from VCF. Existing implementations are for version ~0.5, which we've made lots of breaking changes to, so we shouldn't be too worried about maintaining compatibility with these.

@adamnovak
Copy link
Contributor

I think there's consensus around the filtersFailed implementation. @johntbates do you want to change this over to use that syntax?

@skeenan
Copy link
Member

skeenan commented Apr 10, 2015

@johntbates do want to change over to filtersFailed as per #220 (comment) ?

@fnothaft
Copy link
Contributor

As an enhancement, it might be better to have both filtersFailed and filtersPassed arrays. I like the filtersFailed semantics, but if you have an empty filtersFailed array, you can't tell whether you passed all filters, or didn't apply any filters.

@kozbo
Copy link
Contributor

kozbo commented Aug 3, 2016

Please see #656 to close this

@david4096
Copy link
Member

Closing in favor of #656

@david4096 david4096 closed this Jan 12, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.