[BioC] Call for comments on analyzing aCGH data with huge number of probes on a single chromosome

Fri Apr 4 18:34:31 CEST 2008

I routinely use process control methods for analyzing
aCGH data. I began using them with Nimblegen data,
where the number of probes overwhelmed the available R
code, which was designed for significantly less dense
arrays.

Process control can run through Nimblegen data in a
matter of minutes for a single chromosome, and in a
few hours for a large number of samples (I use SAS for
this, however).

Basically the expression level for a copy number of 2
can be considered 'in control' and any amplified or
deleted region 'out of control'. These methods have
been developed and applied very productively over the
last 50 years.
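
As a rough illustration of the in control / out of
control idea (this is not the SAS procedure mentioned
above), here is a minimal Python sketch of a
Shewhart-style chart with a simple run rule. It assumes
the values are per-probe signals ordered along the
chromosome and centered near the two-copy level. The
name flag_out_of_control, the 3-sigma limits, the
MAD-based sigma, and the minimum run length are all
illustrative assumptions, not taken from the manuscript.

```python
import statistics


def flag_out_of_control(values, k=3.0, min_run=5):
    """Return (start, end) index pairs (end exclusive) for runs of at least
    `min_run` consecutive probes that fall outside center +/- k*sigma on the
    same side. Center and sigma are estimated robustly from the data."""
    center = statistics.median(values)
    # 1.4826 * MAD approximates the standard deviation for normal noise.
    # Assumption: most probes are in control, so the MAD is not zero.
    mad = statistics.median([abs(v - center) for v in values])
    sigma = 1.4826 * mad
    upper = center + k * sigma
    lower = center - k * sigma

    def side(v):
        # +1 above the upper limit, -1 below the lower limit, 0 in control
        if v > upper:
            return 1
        if v < lower:
            return -1
        return 0

    runs = []
    start = None
    current = 0
    for i, v in enumerate(values):
        s = side(v)
        if s != current:
            # A run just ended at i; keep it if it was long enough.
            if current != 0 and i - start >= min_run:
                runs.append((start, i))
            start, current = i, s
    # Close a run that reaches the end of the chromosome.
    if current != 0 and len(values) - start >= min_run:
        runs.append((start, len(values)))
    return runs
```

This is a single pass over the probes after two median
computations, which is why this style of method scales
to millions of probes per chromosome.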

The second step is to pick out the regions called as
'special cause' (process control jargon) and score
each one by the ratio of the MSE of the called region
to the MSE of the adjoining in-control regions, where
the MSE is calculated around the expression level for
a normal 2 copies.
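
The scoring step can be sketched in Python as follows,
again only as an illustration. It assumes the values
are in a list, that the two-copy level is 0 (as for
log2 ratios), and that a fixed number of probes on
either side of the called region stands in for the
'adjoining in control regions'. The names mse_about
and region_score and the flank parameter are made up
for this example.

```python
def mse_about(values, center=0.0):
    """Mean squared deviation of `values` around a fixed center
    (here, the assumed two-copy level)."""
    return sum((v - center) ** 2 for v in values) / len(values)


def region_score(values, start, end, flank=50, center=0.0):
    """Ratio of the MSE inside values[start:end] to the MSE of up to
    `flank` probes on each side. Assumes the flanking probes are in control."""
    region = values[start:end]
    left = values[max(0, start - flank):start]
    right = values[end:end + flank]
    neighbours = left + right
    if not region or not neighbours:
        raise ValueError("need a non-empty region and at least one flanking probe")
    denom = mse_about(neighbours, center)
    if denom == 0:
        # Flanks sit exactly on the center; the ratio is unbounded.
        return float("inf")
    return mse_about(region, center) / denom
```

A ratio near 1 means the called region looks no
different from its neighbours; a large ratio points to
a gain or loss that stands out from the local noise.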

Would be happy to send a manuscript if you email me at
the address below. 

Thanks
Bill Shannon, PhD
Associate Professor of Biostatistics in Medicine
Washington University School of Medicine
wshannon at wustl.edu

--- pingzhao Hu <phu at sickkids.ca> wrote:

> 
> Sean,
> Thanks!
> The goal is to identify copy number variation from
> normal human samples.
> I have tried CBS, cghFLasso
> (http://biostatistics.oxfordjournals.org/cgi/content/abstract/kxm013v1),
> our own method
> (http://biostatistics.oxfordjournals.org/cgi/content/abstract/kxl035v1),
> and other methods.
> 
> Pingzhao
> 
> 
> At 11:45 AM 4/4/2008, Sean Davis wrote:
> >On Fri, Apr 4, 2008 at 11:38 AM, pingzhao Hu <phu at sickkids.ca> wrote:
> > >
> > >  Hi All,
> > >  I have a question about analyzing aCGH data with a huge number of
> > >  probes on a single chromosome.
> > >  We have a set of customized NimbleGen aCGH human sample data. Each
> > >  sample has 40 million probes. Even a single chromosome has >3M probes.
> > >
> > >  I tried some R-based and Matlab-based aCGH analysis software to
> > >  analyze just a single chromosome in a single sample using our
> > >  supercomputer, but with no luck. Some software just shows error
> > >  messages (though it works fine for small data sets), and some cannot
> > >  complete the analysis even after 1-2 days of CPU time.
> > >
> > >  I am wondering whether anyone on the list has experience in
> > >  analyzing aCGH data at such a scale.
> > >  If you have, can you share some of your experience with me?
> > >
> > >  Would it be a good idea to first divide the chromosome into small
> > >  pieces (say each piece has 10,000 probes) and then run the algorithm
> > >  on each piece of the chromosome?
> >
> >What are the goals of the analysis?  What types of samples (cancer,
> >comparative genomics, normal DNA)?  And what methods have you tried?
> >
> >Sean
> 
> 
> 
> ========================================
> Pingzhao Hu
> Statistical Analysis Facility
> The Centre for Applied Genomics (TCAG)
> The Hospital for Sick Children Research Institute
> MaRS Centre - East Tower
> 101 College Street, Room 15-705
> Toronto, Ontario, M5G 1L7, Canada
> Tel.: (416) 813-7654 x6016
> Email: phu at sickkids.ca
> Web: http://www.tcag.ca/statisticalAnalysis.html
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor