[GET-dev] Autoscores + Counsyl variants
rosenbaum4 at gmail.com
Fri May 21 09:02:45 EDT 2010
To help in troubleshooting:
CFTR Ser1255Stop should have 6 stars (stop codon, OMIM, GeneReviews).
Some of the ACADM variants have 0 stars, even though this gene has a
GeneTests entry, is available for testing, and is present in OMIM.
According to the latest download we have >80,000 nsSNPs in our
database (we should emphasize this point) but the variant_flat does
not produce a list of splice variants or all synonymous entries. I
think that it would be a good idea to get this data so that we can
further de-emphasize our reliance on exons.
On Fri, May 21, 2010 at 8:53 AM, Madeleine Price Ball <meprice at gmail.com> wrote:
> I've uploaded a new copy of the Counsyl variants list; there were some
> ^M's (carriage returns) in there, invisible to us when making the
> Google spreadsheet.
> I guess LAMB3-R635X should have at least 5 points?
> 2 for nonsense mutation
> 2 for being in OMIM
> 1 for being a GeneTests testable gene
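
To make the arithmetic above concrete, here is a minimal sketch of an additive score using only the three weights listed in this message. The dictionary keys and the `autoscore` function name are hypothetical, not the names used in the real scoring code, and other criteria mentioned in the thread (such as GeneReviews) are left out because their weights are not stated here.

```python
# Weights copied from the list above; the key names are hypothetical.
WEIGHTS = {
    "nonsense": 2,            # 2 for nonsense mutation
    "in_omim": 2,             # 2 for being in OMIM
    "genetests_testable": 1,  # 1 for being a GeneTests testable gene
}


def autoscore(evidence):
    """Sum the weights of every criterion flagged true for a variant."""
    return sum(weight for key, weight in WEIGHTS.items() if evidence.get(key))


# LAMB3-R635X, assuming all three criteria apply as suggested above.
lamb3_r635x = {"nonsense": True, "in_omim": True, "genetests_testable": True}
print(autoscore(lamb3_r635x))  # 5
```

Under this scheme a variant scoring 0 would have to fail all three checks, which is one way to spot name mismatches.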
> I don't know if we should worry about how "lumpy" the database data
> is. Since much of the database is imported from OMIM we expect it to
> have a lot of 2's -- the profile for an individual will look
> different. Here's the list for PGP1:
> It's strange for the Counsyl list to have any 0's; Sasha pointed out
> yesterday that almost by definition these are genes that "have testing
> available". Maybe the names aren't matching, or maybe Counsyl tests
> them but they don't have per-gene testing available as listed on
> Should we worry about how "lumpy" the Counsyl list looks? Sasha--any
> luck on getting an HCM list from Heidi?
> - Madeleine
> On Fri, May 21, 2010 at 12:41 AM, Tom Clegg
> <tom at scalablecomputingexperts.com> wrote:
>> Autoscores for all of the Counsyl variants are attached.
>> There were a few lines that look like they were corrupted by some
>> translation process (I ignored them):
>> Distribution of autoscores for counsyl variants:
>> (select autoscore, count(variant_id) from counsyl_autoscore group by autoscore)
>> +-----------+-------------------+
>> | autoscore | count(variant_id) |
>> +-----------+-------------------+
>> |         0 |                33 |
>> |         1 |                 4 |
>> |         2 |               119 |
>> |         3 |                 4 |
>> |         4 |               129 |
>> +-----------+-------------------+
>> Distribution of autoscores for all variants:
>> (cut -f40 latest-flat.tsv | tail -n +2 | sort -n | uniq -c)
>>   62304 0
>>    3993 1
>>   10473 2
>>    1512 3
>>    3378 4
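
For anyone who wants the same tally without the shell pipeline, the following is a rough Python sketch. It assumes, as the `cut -f40` command implies, that the autoscore sits in the 40th tab-separated column and that the first row is a header; the function name and the tiny sample data are made up for illustration.

```python
import csv
import io
from collections import Counter


def score_distribution(tsv_file, field=40):
    """Count the values in the 1-based `field` column, skipping the header row."""
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row, like `tail -n +2`
    counts = Counter()
    for row in reader:
        # Rows with fewer columns than `field` are skipped in this sketch.
        if len(row) >= field:
            counts[row[field - 1]] += 1
    return counts


# Made-up two-column sample; the real file would be opened with open(path, newline="").
sample = "variant\tautoscore\nA\t0\nB\t2\nC\t2\n"
dist = score_distribution(io.StringIO(sample), field=2)
for score in sorted(dist, key=int):
    print(dist[score], score)
```

The values are kept as strings and sorted numerically only for display, mirroring `sort -n | uniq -c`.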
>> Presumably *some* variants should be getting scores >4 -- I'll have to look
>> at this tomorrow (examples welcome).
>> The "in genetests?" contribution to the above autoscores is based on whether
>> the gene is *listed* in genetests, not whether its record indicates "test
>> available"... contrary to what I told Madeleine today. I've fixed that just
>> now, and the scores are being recalculated. (64 of the 836 genes in
>> genetests are "no test available")
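
The difference between the two rules can be sketched as follows. The gene symbols and the shape of the `genetests` mapping are hypothetical; only the 1-point weight and the listed versus "test available" distinction come from this thread.

```python
# Hypothetical GeneTests data: gene symbol -> does the record say "test available"?
genetests = {
    "GENE_A": True,   # listed, test available
    "GENE_B": False,  # listed, but "no test available"
}


def points_if_listed(gene, records):
    """Old rule: 1 point whenever the gene appears in GeneTests at all."""
    return 1 if gene in records else 0


def points_if_test_available(gene, records):
    """Fixed rule: 1 point only when the record indicates a test is available."""
    return 1 if records.get(gene, False) else 0


for gene in ("GENE_A", "GENE_B", "GENE_C"):
    print(gene, points_if_listed(gene, genetests), points_if_test_available(gene, genetests))
```

With the counts quoted above, the fixed rule would leave 836 - 64 = 772 genes eligible for the point.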
>> GET-dev mailing list
>> GET-dev at lists.freelogy.org