[BioC] segmentation aCGH data

Sean Davis sdavis2 at mail.nih.gov
Wed Oct 10 18:15:52 CEST 2007


jhs1jjm at leeds.ac.uk wrote:
> Hi Sean,
> 
> As it's 2-colour, I'm looking at relative amounts, so wouldn't that mean
> I wouldn't see copy number variants? Would they not be in both my
> samples? I was also pondering the advantages of using R and
> Bioconductor vs., say, Agilent's z-score, for the purposes of my
> discussion. Is the simple answer just that it is a more flexible
> approach to these matters? Also, if possible, could you expand a bit on
> the single-probes argument?

If using Agilent CGHAnalytics, you will probably want to use ADM-1, not
z-score.  For the 44k arrays, a threshold of around 6 is probably
appropriate.  For the 244k arrays, something closer to 10 or 11 is more
appropriate.  ADM-1 is exquisitely sensitive to single probes that are
extreme values.  These may represent real signal, or may be noise.
There is no way to tell without validation, in my opinion.  However, if
there are two or more probes behaving similarly, then you can be more
assured of real biology.  The real biology could be directly
disease-related or not.  The ones that are not are copy number variants
(although there is now plenty of evidence that copy number variants can
be disease-associated, as well).  When using high-resolution oligo
arrays, you will need to become familiar with copy number polymorphisms
and the databases for annotating them.  CGHAnalytics contains a built-in
catalog of them.
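
To make the single-probe point concrete, here is a minimal sketch of separating calls supported by two or more probes from isolated single-probe calls.  It is not ADM-1 and not anything CGHAnalytics does internally.  It assumes the probes are already sorted by genomic position on one chromosome, reads "behaving similarly" as consecutive probes beyond the cutoff in the same direction, and uses a plain log-ratio cutoff (0.5 here, an arbitrary illustrative value, not the ADM-1 thresholds mentioned above).  The function names and the example values are hypothetical.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (start, end) probe indices, end exclusive


def find_aberrant_runs(log_ratios: List[float], cutoff: float) -> List[Run]:
    """Return maximal runs of consecutive probes whose log ratio is beyond
    +/- cutoff in the same direction. Assumes probes are position-sorted."""
    runs: List[Run] = []
    start = 0
    sign = 0  # 0 = within cutoff, 1 = gain, -1 = loss
    for i, value in enumerate(log_ratios):
        if value >= cutoff:
            current = 1
        elif value <= -cutoff:
            current = -1
        else:
            current = 0
        if current != sign:
            if sign != 0:
                runs.append((start, i))  # close the run that just ended
            start = i
            sign = current
    if sign != 0:
        runs.append((start, len(log_ratios)))  # run reaching the last probe
    return runs


def split_by_support(runs: List[Run], min_probes: int = 2) -> Tuple[List[Run], List[Run]]:
    """Split runs into those with at least min_probes probes and the rest."""
    supported = [r for r in runs if r[1] - r[0] >= min_probes]
    singletons = [r for r in runs if r[1] - r[0] < min_probes]
    return supported, singletons


# Made-up log2 ratios for illustration only (not real array data).
example = [0.02, -0.05, 0.9, 0.01, 0.03, -0.8, -0.7, -0.9, 0.04, 0.6, 0.7]
supported, singletons = split_by_support(find_aberrant_runs(example, cutoff=0.5))
print("multi-probe calls:", supported)
print("single-probe calls (need validation):", singletons)
```

The single-probe calls are kept rather than discarded, since they may still be real signal; the split only marks which calls would need validation before being trusted.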

As for R/Bioc versus commercial packages, that will be dictated by the
questions you want to ask.  We find that we routinely need and want to
ask questions that are not easily answered by commercial packages.  That
said, a good visualization tool for CGH is HIGHLY useful, and there are
now several available.

Sean


