[GET-dev] Penetrance / Attributable Risk

Madeleine Price Ball meprice at gmail.com
Wed Jul 7 22:33:15 EDT 2010

I think we need to create a category that reflects to what extent a
particular genetic variant increases the chance of disease/phenotype. I'm
sending this email out to both get-dev and get-editors since it's both
relevant to editors (a change involves establishing new scoring criteria)
and developers (a change in the website & database to include the new
category).

First, let me review how it's currently handled. Penetrance is accounted for
in two ways in our system.

The first is the "severity" section in clinical impact. We have a hand-wavy
set of criteria that goes like this:
0 points for benign
1 point for very low expectation of having symptoms for this genotype, very
low penetrance (e.g., susceptibility to Crohn's with a 4-fold relative risk,
causing an overall risk of ~.7%)
2 points for mild effect on quality of life or unlikely to be symptomatic
3 points for moderate effect on quality of life (e.g., Familial
Mediterranean Fever)
4 points for severe effect: causes serious disability or reduces life
expectancy (e.g., Sickle-cell, Stargardt's disease)
5 points for very severe effect, lethal by early adulthood (e.g., Lethal
junctional epidermolysis bullosa, Adrenoleukodystrophy)

This sucks because we're conflating "likelihood of getting the disease" with
"how bad is the disease if you get it". But we had to put it in there
because of the example regarding Crohn's disease -- we needed something to
knock down the importance of variants with high ORs for very rare diseases.

The second is the odds ratio (OR) criteria in the "case/control evidence"
category. Here's a review of the current case/control scoring criteria:
0 points if no higher ranking is allowed
1 point if OR > 1 and significance <= 0.1
2 points if OR >= 1.5 and significance <= 0.05
3 points if OR >= 2 and significance <= 0.025
4 points if OR >= 3 and significance <= 0.01
5 points if OR >= 5 and significance <= 0.0001
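
To make the current rule concrete, here is a minimal sketch of the case/control criteria listed above. The function and variable names are made up for illustration and are not taken from the actual site code; it assumes a variant gets the highest point value for which both the OR test and the significance test pass.

```python
# Thresholds copied from the list above: (points, OR test, max p-value).
# Names are illustrative only, not from the real website/database code.
CURRENT_CRITERIA = [
    (5, lambda o: o >= 5.0, 0.0001),
    (4, lambda o: o >= 3.0, 0.01),
    (3, lambda o: o >= 2.0, 0.025),
    (2, lambda o: o >= 1.5, 0.05),
    (1, lambda o: o > 1.0, 0.1),
]


def current_case_control_score(odds_ratio, p_value):
    """Return the highest point value whose OR and significance tests both pass."""
    for points, or_ok, max_p in CURRENT_CRITERIA:
        if or_ok(odds_ratio) and p_value <= max_p:
            return points
    return 0  # no higher ranking is allowed
```

Note that a large odds ratio with a weaker p-value falls through to a lower tier: an OR of 6.0 with p = 0.03 would only pass the 2-point test under this reading.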

This is a pretty weird thing for us to be doing -- odds ratio represents how
much increased risk a variant gives for a disease, not how likely it is to
be "real". It isn't really a measure of how well supported the variant is;
it's a measure of how strong its clinical impact is. GWAS studies typically
produce data like this: variants with a weak impact but strong significance.
(Unfortunately the GWAS studies are also often limited to a single ethnic
group, which is a different issue that I think we should consider problematic
for the "evidence" end of things.)

We could split OR into a different category, but I still think it would be
the wrong measure. It fails to distinguish between rare and common disease.
For an example, let me walk through two hypothetical cases of variants which
have 5% allele frequencies: one variant is associated with Crohn's disease
with an odds ratio of 6.0; the second is associated with heart disease
with an odds ratio of 1.9.

Assuming both of these are established to have very low p-values (i.e.,
high significance), the Crohn's disease variant would currently rate 5 points and
the heart disease variant would rate 2 points. But Crohn's disease is really
rare (let's say 1 in 2000 = 0.05%), while heart disease is quite common
(let's say 8%).

In this case the odds ratio of 6.0 for Crohn's disease patients having this
allele translates to a relative risk of being a carrier of about 4.8 (based
on the allele frequency of 5%). So someone with Crohn's disease would be 4.8
times as likely to be carrying this variant (in other words, they have a 24%
allele frequency instead of 5%). This also means they are 4.8 times as
likely to get the disease if they have this variant. But since the disease
is so rare (1 in 2000), that translates to a risk of 0.24% instead of the
average of 0.05%. The amount of increased risk (called "attributable risk")
caused by this variant is 0.19%.

For the heart disease variant, the odds ratio of 1.9 means that individuals
with heart disease are 1.82 times as likely to be carrying this variant
(relative risk = 1.82). This means an individual is 1.71 times as likely to
have heart disease if they have this allele. Because the disease is fairly
common this translates to a risk of 13.7% instead of 8%. The amount of
attributable risk caused by this variant is 5.7%.
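
The two walk-throughs above can be sketched in code. This is only a rough reconstruction of the pen-and-paper steps, under assumptions that are mine rather than stated in the email: the 5% allele frequency is treated as the frequency among people without the disease, carrying the allele is treated as a simple yes/no exposure, and "attributable risk" means the carrier's risk minus the average population risk. All names are hypothetical.

```python
def attributable_risk(odds_ratio, control_freq, prevalence):
    """Translate an odds ratio into attributable risk (all values as fractions).

    Assumptions (illustrative, not from the original email):
    - control_freq is the variant frequency among people without the disease
    - having the variant is a yes/no exposure
    """
    # Step 1: odds ratio -> frequency of the variant among cases.
    control_odds = control_freq / (1.0 - control_freq)
    case_odds = odds_ratio * control_odds
    case_freq = case_odds / (1.0 + case_odds)

    # Step 2: frequency of the variant in the whole population.
    population_freq = prevalence * case_freq + (1.0 - prevalence) * control_freq

    # Step 3: Bayes' rule gives the disease risk for someone with the variant.
    carrier_risk = case_freq * prevalence / population_freq

    return {
        "case_freq": case_freq,
        "carrier_ratio": case_freq / control_freq,  # cases vs. controls
        "risk_ratio": carrier_risk / prevalence,    # carrier vs. average risk
        "carrier_risk": carrier_risk,
        "attributable_risk": carrier_risk - prevalence,
    }


crohns = attributable_risk(odds_ratio=6.0, control_freq=0.05, prevalence=0.0005)
heart = attributable_risk(odds_ratio=1.9, control_freq=0.05, prevalence=0.08)

for name, result in (("Crohn's", crohns), ("heart disease", heart)):
    print(f"{name}: risk {result['carrier_risk']:.2%}, "
          f"attributable risk {result['attributable_risk']:.2%}")
```

Under these assumptions the sketch lands on roughly the same figures as the hand calculation: about 0.24% risk and 0.19% attributable risk for the Crohn's variant, and about 13.7% risk and 5.7% attributable risk for the heart disease variant.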

5.7% chance of heart disease vs. a 0.19% chance of Crohn's disease -- is it
appropriate to be treating the former as merely "2 points" and the latter as
"5 points"? I think the metric we use should respond to attributable risk
(the percentages 5.7% and 0.19%), not to the odds ratio. A moderately high
odds ratio for an extremely rare disease shouldn't trigger a high score.

I propose the case/control evidence score be changed to eliminate odds ratio
and that we create a new "Penetrance" category based on attributable risk.
The "case/control evidence" would look like this:
0 points if no higher ranking is allowed
1 point if significance <= 0.1
2 points if significance <= 0.05
3 points if significance <= 0.025
4 points if significance <= 0.01
5 points if significance <= 0.0001

I'm happier to have OR out of this section because now the ranking is purely
based on statistical significance, whether we think the variant is "real" at
all.

And the "penetrance" would look like this:
0 points if < 0.1% attributable risk
1 point if >= 0.1% attributable risk
2 points if >= 1% attributable risk
3 points if >= 5% attributable risk
4 points if >= 25% attributable risk
5 points if >= 90% penetrance (complete or highly penetrant)
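
As a rough sketch, the proposed penetrance thresholds could be coded like this. The function name is hypothetical, and for simplicity it treats the top tier (>= 90% penetrance) as just another threshold on the same percentage, which is an assumption on my part.

```python
def penetrance_score(attributable_risk_pct):
    """Map attributable risk (in percent) to the proposed 0-5 point scale.

    Simplification: the ">= 90% penetrance" tier is handled as a threshold
    on the same percentage value as the other tiers.
    """
    thresholds = [(90.0, 5), (25.0, 4), (5.0, 3), (1.0, 2), (0.1, 1)]
    for minimum, points in thresholds:
        if attributable_risk_pct >= minimum:
            return points
    return 0  # below 0.1% attributable risk


# The two hypothetical variants from the worked examples above.
crohns_points = penetrance_score(0.19)
heart_points = penetrance_score(5.7)
```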

According to this scoring system, our hypothetical Crohn's disease variant
would get 1 point and our hypothetical heart disease variant would get 3
points. (Their old scores were 5 points and 2 points, respectively.) I think
this much more accurately reflects how important we think the variants are.
This "penetrance" category, since it's not measuring evidence but instead
measuring an aspect of clinical impact, would go under the "clinical impact"
section.

After we break things up, we'll have to re-examine how our overall clinical
severity rating gets calculated. I think we'll also have to create tools for
translating odds ratios, etc. into attributable risk. I did it all with pen
and paper this time, but I think building them shouldn't be too difficult.

  -- Madeleine