[Bioc-devel] Bioc needs better support for variants
mtmorgan at fhcrc.org
Fri Apr 15 16:19:45 CEST 2011
On 04/15/2011 06:05 AM, Vincent Carey wrote:
> I will comment on my limited view and progress. I need to work from
> an exemplar. I committed cheung2010 in the experimental data archive
> (devel only). This relates to PMID 20856902, genetics of expression
> in immortalized B cells.
> There are 147 individuals with HapMap phase 3 genotypes and hgfocus
> arrays (:-( but about 45 have RNA-seq data in GEO. FASTQ is available
> via the SRA Toolkit's fastq-dump, and you can get the SRA data reasonably
> quickly using ascp. I will eventually make a sample from their
> RNA-seq data available in this package to look at SNP-driven
> allele-specific expression and other aspects of SNP-dependent
> expression regulation.
> Probably there is DNA-seq data out there on these Coriell cell lines
> but for the moment I will be looking at the chip-based SNPs and
> imputation on those. Better representations for 8 million SNPs per
> sample would probably come in handy, but breaking them up by
> chromosome in SnpMatrix instances is OK so far. I think we have to
> recognize that in any of these paradigms discrete calls are often not
> going to cut it, and uncertainty representations will be important.
> VCF representations of indels in 1000 genomes are available, but I
> don't know that we have good tools for importing and modeling those.
> Another exemplar that should be considered.
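To make the VCF import question a bit more concrete, here is a minimal sketch in plain Python (standard library only) of reading the fixed columns of a VCF data line and flagging indels by comparing REF and ALT lengths. The names `Variant`, `parse_vcf_line` and `is_indel` and the two records are made up for illustration; this is not an existing Bioconductor interface, and it ignores headers, genotype columns and most of the format.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int          # 1-based position, as in VCF
    ref: str
    alts: List[str]   # one entry per ALT allele


def parse_vcf_line(line: str) -> Variant:
    """Parse the first five fixed columns of a tab-delimited VCF data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]
    return Variant(chrom, int(pos), ref, alt.split(","))


def is_indel(v: Variant) -> bool:
    """True if any sequence ALT allele differs in length from REF.

    Symbolic alleles such as <DEL> are skipped; a real importer would
    need to handle them explicitly.
    """
    return any(len(a) != len(v.ref) for a in v.alts if not a.startswith("<"))


# Hypothetical records, not taken from 1000 Genomes.
snp = parse_vcf_line("1\t10177\t.\tA\tC\t100\tPASS\t.")
deletion = parse_vcf_line("1\t10230\t.\tAC\tA\t100\tPASS\t.")

print(is_indel(snp), is_indel(deletion))
```

Modeling is the harder part: once the records are parsed, the indel still needs a representation that carries both its coordinates and its alleles, which is where range and sequence containers would come in.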
> On Fri, Apr 15, 2011 at 7:16 AM, Michael Lawrence
> <lawrence.michael at gene.com> wrote:
>> Hi guys,
>> Congrats on the release. For this next one, one focus, in my opinion, should
>> be on analyzing variants in the context of sequencing data. This includes
>> infrastructure for things like calling variants (in DNA and RNA), as well as
>> determining their effects (e.g., coding and splicing changes). It would be
>> good if we could come up with a plan. If we had one, we could commit some
>> resources here to the problem.
>> Is anyone willing to help out on this? What do you guys think?
We could certainly play a role in annotation of variants and support for
interfacing with established 3rd party formats. Obviously also the
representation of variants that overlap with IRanges / Biostrings
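
As a rough illustration of what "overlap" means here, the sketch below treats variants and features as closed 1-based intervals and reports which variant hits which feature. It is plain Python rather than the IRanges API; `overlaps`, `find_overlaps` and the coordinates are assumptions made up for the example, and the quadratic scan is only meant to show the idea.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # closed interval (start, end), 1-based


def overlaps(a: Range, b: Range) -> bool:
    """Two closed intervals overlap if each starts before the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_overlaps(variants: List[Range], features: List[Range]) -> List[Tuple[int, int]]:
    """Return (variant index, feature index) pairs for every overlap."""
    return [
        (i, j)
        for i, v in enumerate(variants)
        for j, f in enumerate(features)
        if overlaps(v, f)
    ]


# Hypothetical coordinates: a SNP, a short deletion, and an intergenic SNP.
variants = [(105, 105), (198, 201), (400, 400)]
exons = [(100, 200), (300, 350)]

hits = find_overlaps(variants, exons)
print(hits)
```

A SNP is a width-one range, while an indel spans several positions and can straddle a feature boundary, as the second variant does here.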
>> Bioc-devel at r-project.org mailing list
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Telephone: 206 667-2793