[GET-dev] PGP info

Ruth McCole rmccole at genetics.med.harvard.edu
Mon May 9 16:55:29 EDT 2011


Many thanks, Madeleine.

I'm going to be working with the Complete Genomics 'free' genomes as
well as the PGP genomes. It would be great for me to have the other files
(CNVs, etc.) for the first 10 PGP genomes. I'm not really bothered about
build 37 versus 36.

Ruth

On 5/9/11 4:50 PM, Madeleine Ball wrote:
>> 1. I have not received a GET-dev digest since March, have I missed some?
> No, traffic has just been low lately. I admit I neglect posting
> updates; I'll try to post more since I know that someone is paying
> attention.
>
>> 2. What is the difference between PGP 1-10 (with indel and coverage) and the
>> other three? Are there scripts within get-evidence that check through
>> different complete genomics files to compile this 'with indels and coverage'
>> file?
> The other three were from Illumina data (or maybe Knome for 11 & 12?)
> donated a while ago that lacked both indel data and coverage.
>
> The with-indel-and-coverage is just using the CGI var file. In this
> case we're using the word "coverage" to mean "confidently called as
> matching reference" (rather than "read depth"). There's a translator
> that's automatically run on an uploaded CGI var file:
> https://github.com/madprime/get-evidence/blob/master/server/conversion/cgivar_to_gff.py
>
> You likely already know about this, but there's also a bunch of
> genomes available from Complete Genomics' website:
> http://www.completegenomics.com/sequence-data/download-data/
>
>> 3. Is CNV and SD evidence available for the PGP genomes? It seems to me that
>> only 'short' indels (100bp or so) are available, from the look of the
>> entries in the right-most column of these .gff files.
> You're right, data for that other stuff is in different files, which
> weren't uploaded to GET-Evidence. At the moment you only have the
> source file that GET-Evidence used. I think Sasha has been meaning to
> get the full data onto an FTP somewhere. We've also been trying to get
> a redone build 37 version for the ten genomes (you might want to wait
> for that improved data).
>
> - Madeleine
>
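
To make the "coverage" definition above concrete (positions confidently
called as matching reference, rather than read depth), here is a minimal
sketch in Python. It is not taken from cgivar_to_gff.py. It assumes a
tab-separated var file whose columns are locus, ploidy, allele, chromosome,
begin, end, varType (0-based, half-open coordinates) and that reference
calls carry the varType "ref"; the function name and column indices are
hypothetical. It also ignores zygosity: every "ref" row is counted, even
when the other allele at the same locus is a variant.

```python
from collections import defaultdict

# Assumed column positions in a CGI var file (not taken from cgivar_to_gff.py).
CHROM_COL, BEGIN_COL, END_COL, VARTYPE_COL = 3, 4, 5, 6


def ref_called_bases(var_lines):
    """Return {chromosome: number of bases in rows whose varType is 'ref'}."""
    totals = defaultdict(int)
    for line in var_lines:
        line = line.rstrip("\n")
        # Skip blank lines, '#' metadata lines, and the '>' column header.
        if not line or line.startswith("#") or line.startswith(">"):
            continue
        fields = line.split("\t")
        if len(fields) <= VARTYPE_COL:
            continue  # malformed or truncated row
        if fields[VARTYPE_COL] != "ref":
            continue  # variants and no-calls do not count as "coverage"
        begin, end = int(fields[BEGIN_COL]), int(fields[END_COL])
        totals[fields[CHROM_COL]] += end - begin
    return dict(totals)


if __name__ == "__main__":
    import sys

    # Usage: python ref_coverage.py path/to/var-file.tsv
    with open(sys.argv[1]) as handle:
        for chrom, bases in sorted(ref_called_bases(handle).items()):
            print(chrom, bases)
```

Because the function accepts any iterable of lines, it can be pointed at an
open file handle or at a small list of strings when checking it by hand.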


-- 
Ruth McCole PhD
Postdoctoral Researcher
Wu Lab

Department of Genetics
Harvard Medical School
77 Avenue Louis Pasteur
Boston, MA 02115





