[GET-dev] Re: Getting to the bottom of which variant one is looking at ?

Leon Peshkin peshkin at gmail.com
Thu Apr 12 16:28:05 EDT 2012

Hi Madeleine,
   thanks for getting back to me on this. My main issue remains: I do not
know exactly what it means when I see that a given genome has an "Allele
frequency of 18%" for rs4646, and likewise for "WNT11 R666K".
In the dbSNP case it is not clear at all which of the possible variants is
found in the genome, or in what fraction.
In the case of the GET-e encoding, it is clear that the variant in question
is "K" instead of the "R" in the reference genome, but can I conclude that
18% of reads support "K" and the remaining 82% support "R", or am I
completely lost?
So at least in the dbSNP case, it is not clear which of the possible
variants the individual is carrying, or whether a given publication about
rs4646 implies that the phenotype is more likely or, respectively, less
likely.


Finally, is there a way to include all the info you describe below (which
strand, which nucleotide, etc.) somewhere on the gene page as "auxiliary"
information? It would be very helpful.

On Tue, Apr 10, 2012 at 5:35 PM, Madeleine Ball <mpball at gmail.com> wrote:

> The dbSNP annotations are less specific than GET-Evidence: as I understand
> it, they're not even specifying a *genotype* -- just a *location*.
> GET-Evidence does alias some dbSNP IDs to amino acid changes -- when it
> does this, it's assuming that the dbSNP ID refers to the nonreference amino
> acid change predicted.
> For example:
> http://evidence.personalgenomes.org/rs429358
> redirects to:
> http://evidence.personalgenomes.org/APOE-C130R
> I'm not sure what your question about zygosity is asking.
> If you're looking at a SNP which is on a gene which runs in "reverse" on
> the genome, you'll often find inconsistency in literature based on the
> perspective taken when reporting the variant. For example, a variant is A
> or G on the reference genome, but from the perspective of the gene
> transcript, it's T or C (reverse complemented). You can see how this would
> be almost impossibly confusing if the variant happens to be C/G or A/T...
> With amino acid changes it is a lot less ambiguous. Sometimes you'll find
> a change reported the other way in a paper (say, "APOE R130C" instead of
> "APOE C130R") -- in this case, the paper's attitude is that the "R" is
> "reference" and the "C" is variant. This can happen if they aren't
> interested in what the reference genome says about the issue (the paper has
> its own opinion about what is "wildtype" -- because sometimes the
> reference genome is actually the rarer variant).
> The last letter in an ID like "APOE-C130R" is the actual variant allele,
> the first letter is just a reference point describing the original
> "reference" allele. (Think of it as error checking: the "C" part of the
> variant ID isn't necessary, it's just an error check.) We've occasionally made
> GET-E entries for "reference" variants using the nomenclature "APOE-C130C"
> ... but I should note that we haven't been using these in automatic
> interpretation; the automatic system currently only examines differences
> from the reference genome.
> Hope this helps.
> -- Madeleine
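To make the naming convention described above concrete, here is a minimal Python sketch. It is not GET-Evidence's actual code: the names AAVariant and parse_aa_variant and the regular expression are my own assumptions for illustration, and it only handles simple gene symbols without hyphens. It splits an ID such as "APOE-C130R" into gene, reference residue, position and variant residue, and flags "reference" entries such as "APOE-C130C" where the two residues match.

```python
import re
from typing import NamedTuple


class AAVariant(NamedTuple):
    """One amino acid change, e.g. APOE-C130R (hypothetical helper type)."""
    gene: str
    ref_aa: str    # first letter: residue in the reference genome (error check)
    position: int  # residue number in the protein
    var_aa: str    # last letter: the residue the variant actually carries

    @property
    def is_reference(self) -> bool:
        # IDs like "APOE-C130C" describe the reference allele itself.
        return self.ref_aa == self.var_aa


# Assumed ID shape: GENE, then a hyphen or spaces, then <ref><position><variant>.
_ID_PATTERN = re.compile(r"^([A-Za-z0-9]+)[- ]+([A-Z])(\d+)([A-Z])$")


def parse_aa_variant(variant_id: str) -> AAVariant:
    """Split an ID such as 'APOE-C130R' or 'WNT11 R666K' into its parts."""
    match = _ID_PATTERN.match(variant_id.strip())
    if match is None:
        raise ValueError(f"not a simple amino acid change ID: {variant_id!r}")
    gene, ref_aa, position, var_aa = match.groups()
    return AAVariant(gene, ref_aa, int(position), var_aa)
```

For example, parse_aa_variant("APOE-C130R").var_aa is "R", the allele actually being described, while parse_aa_variant("APOE-C130C").is_reference is True.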
> On Tue, Apr 10, 2012 at 4:13 PM, Leon Peshkin <peshkin at gmail.com> wrote:
>> Hi everyone,
>>    after doing a whole bunch of curation at GET-e, I realized I am
>> confused at a very fundamental level - which
>> variant exactly is present in a given genome - I would appreciate some
>> feedback and help. Let's say there is
>> a variant
>>     Notch3   P34R  also known as rs1234
>> which corresponds to nucleotide C changing to G.
>>   the definition of rs...... often (but not always) has all possible
>> variants (C;C) (C;G) (G;G) like this one
>> rs4646    http://www.snpedia.com/index.php/Rs4646
>>   and your genome has allele frequency of 31%.
>> What exactly does it mean? Where does the zygosity annotation come from,
>> and why is it missing for many variants?
>> The literature sometimes refers to rs1234, sometimes specifies the
>> nucleotide, sometimes the amino acid, and sometimes there is confusion or
>> inconsistency among papers about which strand the variant is on (for
>> non-coding variants).
>>   Is there a way to facilitate the annotation by specifying the variant
>> in more than one way in GET-e?
>> -Leon
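On the strand confusion discussed in this thread, a small Python sketch may help show why C/G and A/T SNPs are the hard cases. This is only an illustration under my own assumptions (the function names flip_strand and is_strand_ambiguous are made up, and it only considers single-nucleotide alleles): reading a SNP from the opposite strand complements each allele, so A/G becomes T/C and the strand can be recognised, whereas C/G and A/T map onto themselves.

```python
# Watson-Crick complement of each base.
_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip_strand(alleles):
    """Return the same set of single-base alleles as read from the other strand."""
    return frozenset(_COMPLEMENT[base.upper()] for base in alleles)


def is_strand_ambiguous(allele_a, allele_b):
    """True if the allele pair looks identical on both strands (C/G or A/T)."""
    pair = frozenset((allele_a.upper(), allele_b.upper()))
    return flip_strand(pair) == pair
```

With this, is_strand_ambiguous("A", "G") is False, because a paper reporting T/C is visibly using the other strand, while is_strand_ambiguous("C", "G") and is_strand_ambiguous("A", "T") are both True, so the alleles alone cannot tell you which strand a paper meant.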