[GET-dev] Autoscores + Counsyl variants
krobasky at gmail.com
Fri May 21 10:22:06 EDT 2010
Another suggestion - add an autoscore point to anything that creates a
stop codon in a coding region?
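For concreteness, here is a rough sketch of the two scoring tweaks suggested in this thread. It is only an illustration: `blosum_points` and `stop_codon_point` are hypothetical names, not functions from the actual autoscore code, and the thresholds are taken literally from the earlier message quoted below.

```python
def blosum_points(score: int) -> int:
    """Graded BLOSUM contribution: 2 points for > 3, 1 point for >= -3, else 0.

    Thresholds are copied as written in the thread; they are a proposal,
    not the scoring the autoscore code is known to use.
    """
    if score > 3:
        return 2
    if score >= -3:
        return 1
    return 0


def stop_codon_point(creates_stop: bool, in_coding_region: bool) -> int:
    """One extra point when a variant creates a stop codon in a coding region."""
    return 1 if (creates_stop and in_coding_region) else 0


if __name__ == "__main__":
    for s in (-4, -3, 0, 3, 4):
        print(s, blosum_points(s))
```

Keeping each contribution in its own small function would make it easy to adjust one threshold without touching the others.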
On Fri, May 21, 2010 at 9:53 AM, Kimberly Robasky <krobasky at gmail.com> wrote:
> I think if your scoring values were more graded, your autoscore data
> would be less "lumpy".
> For example, I think it's too conservative to require your BLOSUM
> scores to be >3, so I would suggest breaking it down to 1
> point for >= -3 and 2 points for > 3, and maybe even boosting your
> other scores to compensate. Here's why:
> Look at mutations for Serine; mutating away from that S kills any
> phosphorylation motif that might be there. This will almost certainly
> cause some kind of phenotype, but reading down the serine column,
> you won't find any >3's at all. However, you find 16 that are >= -3.
> That's more than any other residue. Tyrosine, the other important
> phosphorylation residue, comes in second place with 15 scores
> that are >-3.
> I think it's particularly worth emphasizing BLOSUM scores, given what
> we learned from Shamil about how PolyPhen works, and that BLOSUM is
> the only indicator we have for conservation (even if it's only a rough
> On Fri, May 21, 2010 at 9:02 AM, Abraham Rosenbaum <rosenbaum4 at gmail.com> wrote:
>> To help in troubleshooting:
>> CFTR Ser1255Stop should have 6 stars (stop codon, OMIM, GeneReviews).
>> Some of the ACADM genes have 0 stars; this gene has a Genetests entry,
>> is available for testing and is present in OMIM.
>> According to the latest download we have >80,000 nsSNPs in our
>> database (we should emphasize this point) but the variant_flat does
>> not produce a list of splice variants or all synonymous entries. I
>> think that it would be a good idea to get this data so that we can
>> further de-emphasize our reliance on exons.
>> On Fri, May 21, 2010 at 8:53 AM, Madeleine Price Ball <meprice at gmail.com> wrote:
>>> I've uploaded a new copy of the Counsyl variants list; there were some
>>> ^M's in there, invisible to us when making the Google spreadsheet.
>>> I guess LAMB3-R635X should have at least 5 points?
>>> 2 for nonsense mutation
>>> 2 for being in OMIM
>>> 1 for being a GeneTests testable gene
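The tally above can be written out as a small sum. This is a sketch only: the point values are the ones listed in the message, while the dictionary keys and the `tally` helper are hypothetical labels chosen for illustration.

```python
# Point values as listed in the message; the key names are hypothetical.
POINTS = {
    "nonsense": 2,            # nonsense mutation
    "in_omim": 2,             # variant is in OMIM
    "genetests_testable": 1,  # gene is testable per GeneTests
}


def tally(evidence):
    """Sum the points for each kind of evidence present for a variant."""
    return sum(POINTS[name] for name in evidence)


lamb3_r635x = {"nonsense", "in_omim", "genetests_testable"}
print(tally(lamb3_r635x))  # 5
```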
>>> I don't know if we should worry about how "lumpy" the database data
>>> is. Since much of the database is imported from OMIM we expect it to
>>> have a lot of 2's -- the profile for an individual will look
>>> different. Here's the list for PGP1:
>>> It's strange for the Counsyl list to have any 0's; Sasha pointed out
>>> yesterday that almost by definition these are genes that "have testing
>>> available". Maybe the names aren't matching, or maybe Counsyl tests
>>> them but they don't have per-gene testing available as listed on
>>> Should we worry about how "lumpy" the Counsyl list looks? Sasha--any
>>> luck on getting an HCM list from Heidi?
>>> - Madeleine
>>> On Fri, May 21, 2010 at 12:41 AM, Tom Clegg
>>> <tom at scalablecomputingexperts.com> wrote:
>>>> Autoscores for all of the Counsyl variants are attached.
>>>> There were a few lines that look like they were corrupted by some
>>>> translation process (I ignored them):
>>>> Distribution of autoscores for counsyl variants: (select
>>>> autoscore,count(variant_id) from counsyl_autoscore group by autoscore)
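>>>> +-----------+-------------------+
>>>> | autoscore | count(variant_id) |
>>>> +-----------+-------------------+
>>>> |         0 |                33 |
>>>> |         1 |                 4 |
>>>> |         2 |               119 |
>>>> |         3 |                 4 |
>>>> |         4 |               129 |
>>>> +-----------+-------------------+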
>>>> Distribution of autoscores for all variants: (cut -f40 latest-flat.tsv |
>>>> tail -n +2 | sort -n | uniq -c)
>>>> 62304 0
>>>> 3993 1
>>>> 10473 2
>>>> 1512 3
>>>> 3378 4
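The same count can be sketched in Python for anyone who prefers it to the shell pipeline. Assumptions: the file is tab-separated with one header line, the autoscore is in column 40 (as in the `cut -f40` command above), and rows with too few fields are counted under an empty value. The function name is hypothetical.

```python
from collections import Counter


def autoscore_distribution(lines, column=40):
    """Count the values in a 1-based tab-separated column, skipping the header."""
    rows = iter(lines)
    next(rows, None)  # skip the header line, like `tail -n +2`
    counts = Counter()
    for line in rows:
        # Unlike `cut`, this also drops a trailing carriage return.
        fields = line.rstrip("\r\n").split("\t")
        value = fields[column - 1] if len(fields) >= column else ""
        counts[value] += 1
    return counts


# Tiny in-memory example with the score in column 2.
sample = ["id\tautoscore", "a\t0", "b\t2", "c\t0"]
for score, n in sorted(autoscore_distribution(sample, column=2).items()):
    print(n, score)
```

For the real file, passing an open file handle for latest-flat.tsv as `lines` should work, since the function only iterates over lines.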
>>>> Presumably *some* variants should be getting scores >4 -- I'll have to look
>>>> at this tomorrow (examples welcome).
>>>> The "in genetests?" contribution to the above autoscores is based on whether
>>>> the gene is *listed* in genetests, not whether its record indicates "test
>>>> available"... contrary to what I told Madeleine today. I've fixed that just
>>>> now, and the scores are being recalculated. (64 of the 836 genes in
>>>> genetests are "no test available")
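The distinction between "listed" and "test available" can be sketched as follows. Everything here is hypothetical: the `genetests` mapping, the placeholder gene names, and the function name are made up for illustration and do not reflect the real table layout. The 1-point value is the one mentioned elsewhere in this thread.

```python
def genetests_point(gene, genetests):
    """Return 1 only if the gene is listed AND its record says a test is available.

    `genetests` maps gene symbol -> True when the record indicates
    "test available", False when it indicates "no test available".
    """
    return 1 if genetests.get(gene, False) else 0


# Placeholder records, not real GeneTests data.
records = {"GENE_A": True, "GENE_B": False}
print(genetests_point("GENE_A", records))  # listed, test available
print(genetests_point("GENE_B", records))  # listed, no test available
print(genetests_point("GENE_C", records))  # not listed
```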
>>>> GET-dev mailing list
>>>> GET-dev at lists.freelogy.org