[Bioc-devel] Bioc needs better support for variants

Martin Morgan mtmorgan at fhcrc.org
Fri Apr 15 16:19:02 CEST 2011

On 04/15/2011 06:05 AM, Vincent Carey wrote:
> I will comment on my limited view and progress.  I need to work from
> an exemplar.  I committed cheung2010 in the experimental data archive
> (devel only).  This relates to PMID 20856902, genetics of expression
> in immortalized B cells.
> There are 147 individuals with HapMap phase 3 genotypes and hgfocus
> arrays (:-( ), but about 45 have RNA-seq data in GEO.  fastq is available
> with the SRAtools fastq-dump and you can get the sra data reasonably
> quickly using ascp.  I will eventually make a sample from their
> RNA-seq data available in this package to look at SNP-driven
> allele-specific expression and other aspects of SNP-dependent
> expression regulation.
> Probably there is DNA-seq data out there on these Coriell cell lines
> but for the moment I will be looking at the chip-based SNPs and
> imputation on those.  Better representations for 8 million SNPs per
> sample would probably come in handy, but breaking them up by
> chromosome in SnpMatrix instances is OK so far.  I think we have to
> recognize that in any of these paradigms discrete calls are often not
> going to cut it, and uncertainty representations will be important.
> VCF representations of indels in 1000 genomes are available, but I
> don't know that we have good tools for importing and modeling those.
> That is another exemplar that should be considered.
> On Fri, Apr 15, 2011 at 7:16 AM, Michael Lawrence
> <lawrence.michael at gene.com>  wrote:
>> Hi guys,
>> Congrats on the release. For this next one, one focus, in my opinion, should
>> be on analyzing variants in the context of sequencing data. This includes
>> infrastructure for things like calling variants (in DNA and RNA), as well as
>> determining their effects (e.g., coding and splicing changes). It would be
>> good if we could come up with a plan. If we had one, we could commit some
>> resources here to the problem.
>> Is anyone willing to help out on this? What do you guys think?

We could certainly play a role in annotating variants and in supporting 
interfaces to established third-party formats. The representation of 
variants is obviously also relevant, since it overlaps with the 
IRanges / Biostrings infrastructure. Martin

>> Thanks,
>> Michael
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel

Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793
