[GET-dev] BioNotate for annotation of double associations, some bugs in GET-E, suggestions to amend BioNotate instructions

Leon Peshkin peshkin at gmail.com
Thu Apr 12 16:56:27 EDT 2012

A few observations from annotating many papers via BioNotate:

1. Double statements.
    Some papers report on the association of a single variant with two
phenotypes, like
"We studied associations of rs4646 with autism and bipolar disorders. Our
findings suggest that
it is positively associated with bipolar, and not associated with autism".
   In this case it is impossible to summarize the findings with our current
multiple-choice summary. What needs to happen is
  for a curator to be able to mark up only "autism" as a phenotype and pick
the respective summary,
  then go back to the same paper, mark up "bipolar" as the phenotype and
choose a different summary.
  The curator's instructions should change accordingly.
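
One way to picture the "go back to the same paper" workflow is to key each
annotation by paper, variant and phenotype instead of by paper alone. The
sketch below is only an illustration: the Annotation record, the
add_annotation helper and the placeholder PMID are hypothetical names, not
part of BioNotate or GET-E.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record: one summary choice for one variant/phenotype pair."""
    pmid: str
    variant: str
    phenotype: str
    summary: str


def add_annotation(store, ann):
    """Key by (pmid, variant, phenotype) so one paper can hold several summaries."""
    store[(ann.pmid, ann.variant, ann.phenotype)] = ann
    return store


# The same (placeholder) paper is annotated twice, once per phenotype.
store = {}
add_annotation(store, Annotation("PMID-PLACEHOLDER", "rs4646",
                                 "bipolar disorder", "positively associated"))
add_annotation(store, Annotation("PMID-PLACEHOLDER", "rs4646",
                                 "autism", "not associated"))
```

With this keying, the second pass over the paper adds a record instead of
overwriting the first one.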

2. Unrelated terms.
   There are often phenotypes, genes and variants which are NOT related to
the statement in question, e.g.
"We investigated rs4646 and rs13345345 as well as variants of gene p53 to
common flu and headaches".
  Instructions should ask curators not to label any phenotypes, genes or
variants which are not directly
related to the statement supported by the summary choice. A further benefit:
we could then learn synonyms by
assuming that all genes and phenotypes marked as such in a single abstract
are related.

3. Quoted support.
   I think it will be crucially important for the next round of automation
to have what we call "predicate support"
labeled. For example: "our findings support that rs4646 is strongly
associated with the risk of breast cancer".
I'd label "is strongly associated with" as predicate support. Once the
sentence is parsed by an automatic parser
and both rs4646 and "breast cancer" are identified as objects of the
relation, one would be able to train the
NLP system using these mark-up instances.
   So, to this end, in my annotations I have been using the "Sporadic
observation" mark-up in place of "predicate support",
but it'd be good to include this in instructions and provide a special
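
To make the idea concrete, here is a minimal sketch of how the example
sentence could be stored as labeled character spans for later NLP training.
The label names ("variant", "predicate_support", "phenotype") and the
label_span helper are assumptions for illustration, not BioNotate's actual
mark-up scheme.

```python
sentence = ("our findings support that rs4646 is strongly associated "
            "with the risk of breast cancer")


def label_span(text, phrase, label):
    """Return a labeled character span for the first occurrence of phrase."""
    start = text.find(phrase)
    if start < 0:
        raise ValueError(f"phrase not found: {phrase!r}")
    return {"label": label, "start": start,
            "end": start + len(phrase), "text": phrase}


# Hypothetical labels: the two objects of the relation plus the predicate.
spans = [
    label_span(sentence, "rs4646", "variant"),
    label_span(sentence, "is strongly associated with", "predicate_support"),
    label_span(sentence, "breast cancer", "phenotype"),
]
```

Each span can be recovered from the sentence by slicing with its start and
end offsets, which is the form most sequence-labeling tools expect.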

4. A bug.
   Using Chrome on Mac OS X: if I try to add a new paper via PMID, type
in the number and, instead of clicking the "Add" button, just press <Enter>,
I get sent to the BioNotate page with the abstract of the last paper on the
page.

5. Double negations. Drugs.
    There are cases when a variant is associated with an "improved
reaction to anti-cancer drug X". One could mark these up as
"positively associated with phenotype" in conjunction with marking up
"improved reaction to anti-cancer drug X" as a phenotype. I am not sure what
a good way to handle this is. Also, we might want to mark up medications and
think about "mediated relationships", where a variant mediates a relationship
between two other fluents.
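
As a rough sketch of the two options above, the record below keeps the
medication in its own optional field, so the same structure covers both the
flat "drug response as phenotype" mark-up and a mediated relationship. The
Association type and its field names are hypothetical, and "variant-X" is a
placeholder.

```python
from typing import NamedTuple, Optional


class Association(NamedTuple):
    """Hypothetical record for a variant/phenotype statement."""
    variant: str
    phenotype: str
    summary: str
    drug: Optional[str] = None  # set only when a medication is marked up


# Option 1: the whole drug response is marked up as the phenotype.
flat = Association("variant-X", "improved reaction to anti-cancer drug X",
                   "positively associated")

# Option 2: the medication is marked up separately from the response.
mediated = Association("variant-X", "improved reaction",
                       "positively associated", drug="anti-cancer drug X")
```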
