[BioC] Call for comments on analyzing aCGH data with huge number of probes on a single chromosome
pingzhao Hu
phu at sickkids.ca
Fri Apr 4 21:48:14 CEST 2008
Sean,
Thanks, this is really helpful!
I just tested the chromosome with 3.5M probes in a single sample; it
took less than 20 minutes to get the job done.
Dr. Shannon, thank you very much for your useful comments as well!
Have a great weekend.
Pingzhao
At 12:35 PM 4/4/2008, Sean Davis wrote:
>On Fri, Apr 4, 2008 at 12:09 PM, pingzhao Hu <phu at sickkids.ca> wrote:
> >
> > Sean,
> > Thanks!
> > The goal is to identify copy number variation from normal human samples.
> > I have tried CBS, cghFLasso
> > (http://biostatistics.oxfordjournals.org/cgi/content/abstract/kxm013v1),
> > our own method
> > (http://biostatistics.oxfordjournals.org/cgi/content/abstract/kxl035v1),
> > and other methods.
>
>You probably have a few options. First, you could try "smoothing" the
>data by using a moving window average or some such thing to reduce
>noise and reduce the number of probes. I think NimbleGen does this
>for data that they give back to customers when they run CGH as a
>service. With the reduced-dimensionality data, you could then apply
>your method of choice. Obviously, you lose resolution doing this.
>Another alternative is an algorithm called "stepgram" developed by
>Doron Lipson. It is used in the CGHAnalytics commercial package
>available from Agilent (where it is called ADM-1). It is also
>available as a Windows executable from here:
>
>http://bioinfo.cs.technion.ac.il/stepgram/
>
>I have an R package that uses that algorithm that, unfortunately, I am
>not allowed to distribute. That said, it is by far the fastest
>algorithm that I have tested for CGH analysis. For comparison, for
>200k probes, Stepgram runs in 4 seconds, aCGH in about 50 seconds,
>DNAcopy (CBS) and GLAD in about 400 seconds.
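
The moving-window smoothing suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `window_average`, its parameters, and the choice of averaging probe positions along with log ratios are hypothetical and not taken from NimbleGen's pipeline or from any package named in this thread. With the default `step` equal to `window`, the windows do not overlap, so the number of points drops by roughly a factor of `window`.

```python
from statistics import fmean


def window_average(positions, log_ratios, window=10, step=None):
    """Average consecutive probes in windows along one chromosome.

    positions and log_ratios are parallel sequences sorted by position.
    step defaults to window (non-overlapping windows); a smaller step
    gives overlapping windows and therefore less data reduction.
    Hypothetical helper, not part of any package named in the thread.
    """
    if len(positions) != len(log_ratios):
        raise ValueError("positions and log_ratios must have equal length")
    if window < 1:
        raise ValueError("window must be at least 1")
    step = window if step is None else step
    if step < 1:
        raise ValueError("step must be at least 1")

    smoothed = []
    for start in range(0, len(positions), step):
        # The last window may hold fewer than `window` probes.
        end = min(start + window, len(positions))
        smoothed.append(
            (fmean(positions[start:end]), fmean(log_ratios[start:end]))
        )
    return smoothed


if __name__ == "__main__":
    pos = [100, 200, 300, 400, 500, 600, 700]
    ratio = [0.0, 0.2, 0.1, 1.0, 1.2, 0.8, 0.5]
    for mean_pos, mean_ratio in window_average(pos, ratio, window=3):
        print(f"{mean_pos:8.1f} {mean_ratio:6.3f}")
```

The smoothed (position, log ratio) pairs can then be passed to whichever segmentation method is preferred. Any breakpoint can only be located to within one window, which is the loss of resolution mentioned above.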
>
>Hope that helps,
>
>Sean
>
>
> > Pingzhao
> >
> >
> > At 11:45 AM 4/4/2008, Sean Davis wrote:
> > >On Fri, Apr 4, 2008 at 11:38 AM, pingzhao Hu <phu at sickkids.ca> wrote:
> > > >
> > > > Hi All,
> > > > I have a question about analyzing aCGH data with a huge number of
> > > > probes on a single chromosome.
> > > > We have a set of customized NimbleGen aCGH data from human samples.
> > > > Each sample has 40 million probes. Even a single chromosome has >3M probes.
> > > >
> > > > I tried some R-based and Matlab-based aCGH analysis software to
> > > > analyze just a single chromosome in
> > > > a single sample using our supercomputer, but had no success. Some
> > > > programs just show error messages (though they work fine for small
> > > > data sets), and others cannot complete the analysis even after
> > > > 1-2 days of CPU time.
> > > >
> > > > I am wondering whether anyone on the list has experience
> > > > analyzing aCGH data at such a scale.
> > > > If you have, could you share some of your experience with me?
> > > >
> > > > Would it be a good idea to first divide the chromosome into small
> > > > pieces (say, each piece has 10,000 probes) and then run the algorithm
> > > > on each piece of the chromosome?
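
The divide-and-run idea in the question above can be sketched as follows. This is a minimal illustration, not an implementation of any method discussed in the thread: `chunk_ranges`, `segment_in_chunks`, and the `segment_fn` callback are hypothetical names, and `one_segment` is a trivial stand-in that reports each piece as a single segment. A real segmenter (CBS, for example) would have to be wrapped to return `(start, end, mean)` tuples with chunk-relative indices.

```python
def chunk_ranges(n_probes, chunk_size=10_000):
    """Return (start, end) index pairs, end exclusive, covering n_probes."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size, n_probes))
        for start in range(0, n_probes, chunk_size)
    ]


def segment_in_chunks(log_ratios, segment_fn, chunk_size=10_000):
    """Run segment_fn on each piece and shift results to chromosome indices.

    segment_fn is an assumed callback: it takes a list of log ratios and
    returns (start, end, mean) tuples with indices relative to that piece.
    """
    segments = []
    for lo, hi in chunk_ranges(len(log_ratios), chunk_size):
        for start, end, mean in segment_fn(log_ratios[lo:hi]):
            segments.append((lo + start, lo + end, mean))
    return segments


def one_segment(values):
    """Placeholder segmenter: treats the whole piece as one segment."""
    return [(0, len(values), sum(values) / len(values))]


if __name__ == "__main__":
    data = [0.0] * 10 + [1.0] * 5
    print(chunk_ranges(len(data), chunk_size=10))
    print(segment_in_chunks(data, one_segment, chunk_size=10))
```

At 10,000 probes per piece, a chromosome with 3.5M probes becomes 350 pieces. A segment that spans a piece boundary is cut in two at that boundary, so neighbouring pieces with similar means would need to be merged afterwards.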
> > >
> > >What are the goals of the analysis? What types of samples (cancer,
> > >comparative genomics, normal DNA)? And what methods have you tried?
> > >
> > >Sean
> >
> >
> >
> > ========================================
> > Pingzhao Hu
> > Statistical Analysis Facility
> > The Centre for Applied Genomics (TCAG)
> > The Hospital for Sick Children Research Institute
> > MaRS Centre - East Tower
> > 101 College Street, Room 15-705
> > Toronto, Ontario, M5G 1L7, Canada
> > Tel.: (416) 813-7654 x6016
> > Email: phu at sickkids.ca
> > Web: http://www.tcag.ca/statisticalAnalysis.html
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> >