[GET-dev] autoscore take II

Kimberly Robasky krobasky at gmail.com
Fri May 21 10:52:09 EDT 2010


Just suggestions...:

I've suggested refining the computational autoscoring points, and I
personally think +2 just because it's in OMIM (which we assert in our
paper has many false positives) is too high, so I'd back that down to
+1.  What do you guys think of the following proposed changes to
the autoscoring algorithm?

Autoscoring is split into three sections: computational,
variant-specific, and gene-specific, each 2 points for a max of 6
points.

Computational (max of 2 points):
If a non-synonymous substitution:
+0.5 for NBLOSUM >= -3,
+1 for NBLOSUM >= 3,
another +.5 (+1.5 total) if NBLOSUM = 10 (a nonsense mutation)
and another +.5 (+2.0 total) if it causes a stop codon

For all variants:
+1 if within 1 base of a splice site
(Trait-o-matic has splicing data, presumably it could spot these?)
Indels:
+1 for indel in coding region
another +1 (+2 total) if it causes a frameshift (length not a multiple
of 3 in coding region)
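
To make the computational rules above concrete, here is a rough Python
sketch.  It is only one reading of the proposal, not existing
Trait-o-matic code: the Variant class and all of its field names are
made up for illustration, NBLOSUM is taken as a precomputed input, and
each "another +.5" is treated as an increment on top of the previous
tier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    # Hypothetical record; none of these field names come from Trait-o-matic.
    nblosum: Optional[int] = None          # set only for non-synonymous substitutions
    causes_stop: bool = False              # substitution introduces a stop codon
    splice_distance: Optional[int] = None  # bases to the nearest splice site, if known
    coding_indel_length: Optional[int] = None  # indel length in coding region, else None


def computational_score(v: Variant) -> float:
    """Computational section of the autoscore, capped at 2 points."""
    score = 0.0

    # Non-synonymous substitution tiers (assumed cumulative, +0.5 each).
    if v.nblosum is not None:
        if v.nblosum >= -3:
            score += 0.5
        if v.nblosum >= 3:
            score += 0.5   # +1 total
        if v.nblosum == 10:
            score += 0.5   # +1.5 total
        if v.causes_stop:
            score += 0.5   # +2.0 total when all tiers apply

    # Any variant within 1 base of a splice site.
    if v.splice_distance is not None and v.splice_distance <= 1:
        score += 1.0

    # Indels in a coding region, with an extra point for a frameshift.
    if v.coding_indel_length is not None:
        score += 1.0
        if abs(v.coding_indel_length) % 3 != 0:
            score += 1.0

    return min(score, 2.0)
```

For example, computational_score(Variant(coding_indel_length=2)) gives
the full 2 points under this reading, while an in-frame indel of
length 3 gets 1.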

Variant-specific lists (max of 2 points):
+1 if in OMIM
+1 if in PharmGKB
+1 if in HuGENet
another +1 (+2 total) if HuGENet's OR >= 1.5

Gene-specific lists (max of 2 points):
+1 if in GeneTests
another +1 (+2 total) if in a GeneReviews gene
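
The two list-based sections and the 6-point total could be sketched the
same way.  Again this is just an illustration under assumptions: the
function and parameter names are hypothetical, list membership is
passed in as plain booleans rather than looked up, the HuGENet OR bonus
only applies to variants that are in HuGENet, and GeneTests and
GeneReviews are scored as independent +1 increments.

```python
from typing import Optional


def variant_list_score(in_omim: bool, in_pharmgkb: bool, in_hugenet: bool,
                       hugenet_or: Optional[float] = None) -> int:
    """Variant-specific section, capped at 2 points."""
    score = 0
    if in_omim:
        score += 1
    if in_pharmgkb:
        score += 1
    if in_hugenet:
        score += 1
        # Extra point for a reported odds ratio of at least 1.5.
        if hugenet_or is not None and hugenet_or >= 1.5:
            score += 1
    return min(score, 2)


def gene_list_score(in_genetests: bool, in_genereviews: bool) -> int:
    """Gene-specific section, capped at 2 points."""
    score = 0
    if in_genetests:
        score += 1
    if in_genereviews:
        score += 1
    return min(score, 2)


def total_autoscore(computational: float, variant_lists: float,
                    gene_lists: float) -> float:
    """Sum the three sections, clamping each to 2 for a maximum of 6."""
    sections = (computational, variant_lists, gene_lists)
    return sum(min(max(s, 0.0), 2.0) for s in sections)
```

With this, a variant that is in OMIM and PharmGKB and sits in a
GeneTests gene would score 2 + 1 = 3 from the list sections, before any
computational points are added.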


--Kim
PS -- My hope is to take our best guess at "most likely relevant"
variants and look for any phosphorylation motifs that break: I can let
you know if I find anything of additional value.  Currently, the
unfiltered OMIM data are just too noisy to be of use, as was also
intimated by Shamil.



