[Bioc-devel] Bioc needs better support for variants

Vincent Carey stvjc at channing.harvard.edu
Fri Apr 15 15:05:56 CEST 2011


I will comment on my limited view and progress.  I need to work from
an exemplar.  I committed cheung2010 to the experimental data archive
(devel only).  This relates to PMID 20856902, genetics of expression
in immortalized B cells.

There are 147 individuals with HapMap phase 3 genotypes and hgfocus
arrays :-( but about 45 have RNA-seq data in GEO.  FASTQ is available
via fastq-dump from the SRA toolkit, and you can get the SRA data
reasonably quickly using ascp.  I will eventually make a sample from
their RNA-seq data available in this package to look at SNP-driven
allele-specific expression and other aspects of SNP-dependent
expression regulation.

Probably there is DNA-seq data out there on these Coriell cell lines,
but for the moment I will be looking at the chip-based SNPs and
imputation on those.  Better representations for 8 million SNPs per
sample would probably come in handy, but breaking them up by
chromosome in SnpMatrix instances is OK so far.  I think we have to
recognize that in any of these paradigms discrete calls are often not
going to cut it, and uncertainty representations will be important.
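
To make the point about uncertainty concrete, here is a minimal
sketch in Python.  It is illustrative only and is not how SnpMatrix
stores anything; the function names and the 0.9 threshold are
assumptions made up for this example.  Given posterior probabilities
for the three genotypes (AA, AB, BB), it contrasts a thresholded
discrete call with the expected B-allele dosage, which keeps the
uncertainty instead of discarding it.

```python
from typing import Optional, Sequence


def expected_dosage(probs: Sequence[float]) -> float:
    """Expected number of B alleles given (P(AA), P(AB), P(BB))."""
    p_aa, p_ab, p_bb = probs
    total = p_aa + p_ab + p_bb
    if total <= 0:
        raise ValueError("genotype probabilities must sum to a positive value")
    # Normalise in case the three values do not sum exactly to 1.
    return (p_ab + 2.0 * p_bb) / total


def hard_call(probs: Sequence[float], threshold: float = 0.9) -> Optional[int]:
    """Return 0, 1 or 2 (B-allele count) if one genotype reaches the
    threshold after normalisation; otherwise return None (no call).
    The 0.9 default is an arbitrary illustrative choice."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("genotype probabilities must sum to a positive value")
    best = max(range(3), key=lambda g: probs[g])
    return best if probs[best] / total >= threshold else None


if __name__ == "__main__":
    confident = (0.01, 0.02, 0.97)   # clearly BB
    ambiguous = (0.05, 0.50, 0.45)   # AB vs BB is unresolved
    for probs in (confident, ambiguous):
        print(probs, hard_call(probs), expected_dosage(probs))
```

For the ambiguous genotype the discrete call is simply missing,
while the dosage still carries usable information for a downstream
expression model.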

VCF representations of indels in 1000 genomes are available, but I
don't know that we have good tools for importing and modeling those.
That is another exemplar that should be considered.
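
As a rough idea of what importing indels from VCF involves, here is a
small Python sketch.  It is not an existing Bioconductor tool; the
function names are hypothetical and the records in the example are
made up.  It relies only on the standard tab-separated VCF columns
(CHROM, POS, ID, REF, ALT, ...) and classifies each ALT allele by
comparing its length with REF.

```python
from typing import List, Tuple


def classify_allele(ref: str, alt: str) -> str:
    """Classify one ALT allele relative to REF by sequence length."""
    # Symbolic alleles, breakends and missing values are not handled here.
    if alt in (".", "*") or alt.startswith("<") or "[" in alt or "]" in alt:
        return "other"
    if len(ref) == 1 and len(alt) == 1:
        return "snv"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "other"  # same length > 1, e.g. a multi-nucleotide change


def parse_vcf_line(line: str) -> List[Tuple[str, int, str, str, str]]:
    """Split one VCF data line into (chrom, pos, ref, alt, kind),
    one tuple per ALT allele."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alts = fields[:5]
    return [(chrom, int(pos), ref, alt, classify_allele(ref, alt))
            for alt in alts.split(",")]


if __name__ == "__main__":
    # Made-up records for illustration only.
    lines = [
        "20\t1234567\t.\tGTC\tG,GTCT\t50\tPASS\t.",
        "20\t14370\t.\tG\tA\t29\tPASS\t.",
    ]
    for line in lines:
        for record in parse_vcf_line(line):
            print(record)
```

Modeling is the harder part: a real importer would also need the
genotype columns and left-normalised coordinates, which this sketch
ignores.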

On Fri, Apr 15, 2011 at 7:16 AM, Michael Lawrence
<lawrence.michael at gene.com> wrote:
> Hi guys,
>
> Congrats on the release. For this next one, one focus, in my opinion, should
> be on analyzing variants in the context of sequencing data. This includes
> infrastructure for things like calling variants (in DNA and RNA), as well as
> determining their effects (e.g., coding and splicing changes). It would be
> good if we could come up with a plan. If we had one, we could commit some
> resources here to the problem.
>
> Is anyone willing to help out on this? What do you guys think?
>
> Thanks,
> Michael