[BioC] pint (& segmentation)

Leo Lahti lmlahti at cis.hut.fi
Sat May 8 13:19:29 CEST 2010


Dear Raquel,

Thanks for the report. The current pint version has been designed for and 
tested with aCGH probe-level data. The framework should be applicable to 
segmented data as well. We are now working on this, and will let you know 
as soon as the issue has been fixed.

Note that segmentation as preprocessing is not necessarily required. The 
methods (pCCA/pSimCCA etc.) detect the strongest shared signal of the 
probes within the investigated chromosomal region. This corresponds to 
automatic segmentation with explicit modeling assumptions. Ordinary 
segmentation approaches are likely to lose information when summarizing 
individual probe-level observations into a single segment-level value; 
pint tries to avoid such information loss. Segmentation might help to 
avoid overfitting when sample size (number of arrays) is particularly 
small (compared to the number of probes within the region); otherwise we 
would recommend operating directly on probe-level observations with the 
pSimCCA method, which has proved robust to small sample sizes in our 
experiments.
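
To illustrate the information loss mentioned above, here is a minimal 
sketch in Python. It is not pint code: the data values are made up, and 
averaging over the region is only a crude stand-in for a real 
segmentation algorithm, which would also estimate breakpoints. It shows 
that once each probe is replaced by the segment-level value, all probes 
in the region become identical within each sample.

```python
from statistics import fmean

# Hypothetical log2 ratios for one chromosomal region:
# rows are probes, columns are samples (arrays).
probe_level = [
    [0.10, 0.80, -0.20],
    [0.30, 1.10, -0.10],
    [0.20, 0.60, -0.40],
    [0.00, 0.90, -0.30],
]


def segment_level(region):
    """Replace every probe value by the per-sample mean over the region.

    This is an assumption made for illustration; real segmentation
    methods also decide where the segment boundaries lie.
    """
    n_samples = len(region[0])
    means = [fmean(row[j] for row in region) for j in range(n_samples)]
    # Every probe in the region receives the same segment-level profile.
    return [means[:] for _ in region]


segmented = segment_level(probe_level)

# Count how many distinct probe profiles remain before and after.
distinct_before = len({tuple(row) for row in probe_level})
distinct_after = len({tuple(row) for row in segmented})
print(distinct_before, distinct_after)
```

After this kind of summarization the probe-specific variation within the 
region is gone, which is why many genes end up with identical values in 
segmented data.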

The currently implemented models (pPCA/pFA/pCCA/pSimCCA) assume 
approximately normally distributed data (log2 fold change values) for both 
gene expression and copy number measurements. As a standard quality check, 
one should also confirm that the technical biases between the arrays are 
minimized before the analysis.
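
As a rough sketch of such a check (again plain Python rather than pint 
code, with made-up intensities and hypothetical function names), one can 
compute the log2 fold changes and look at their sample skewness and 
excess kurtosis. Both are close to zero for approximately normal data. 
This is only a quick screen under those assumptions, not a formal test 
of normality.

```python
import math
from statistics import fmean


def log2_fold_change(sample, reference):
    """Log2 ratio of sample intensity to reference intensity, per probe."""
    return [math.log2(s / r) for s, r in zip(sample, reference)]


def skewness(values):
    """Sample skewness (third standardized moment); 0 for symmetric data."""
    m = fmean(values)
    m2 = fmean((v - m) ** 2 for v in values)
    m3 = fmean((v - m) ** 3 for v in values)
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; 0 for a normal distribution."""
    m = fmean(values)
    m2 = fmean((v - m) ** 2 for v in values)
    m4 = fmean((v - m) ** 4 for v in values)
    return m4 / m2 ** 2 - 3.0


# Hypothetical raw intensities for four probes on one array.
sample = [200.0, 400.0, 100.0, 50.0]
reference = [100.0, 100.0, 100.0, 100.0]

ratios = log2_fold_change(sample, reference)
print(ratios, skewness(ratios), excess_kurtosis(ratios))
```

With real data one would run this per array (and per data type) on 
thousands of probes; values far from zero suggest that a transformation 
or further normalization is needed before fitting the models.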


with best regards

Leo Lahti

Aalto University School of Science and Technology tel: +358 (0)9 470 25116
Department of Information and Computer Science    email: leo.lahti at tkk.fi
P.O. Box 15400, FI-00076 Aalto, FINLAND           http://www.cis.hut.fi/lmlahti



Date: Fri, 07 May 2010 10:16:32 +0200
From: Raquel Martinez Garcia <rmartinezg at cnio.es>
To: Mailing List Bioconductor <bioconductor at stat.math.ethz.ch>
Subject: pint

Hello!

Thanks Mr. Carey for your reply.

Some genes have the same values because the data are segmented.
Now I need to know whether the pCCA and pSimCCA methods have problems
working with segmented data, or whether they fail for another reason.

Thanks in advance
Raquel

-- 
********************************************

Raquel Martinez Garcia, Graduate Student
Gastrointestinal Cancer Clinical Research Unit
& Structural Computational Biology Group
Spanish National Cancer Research Center, CNIO
Melchor Fernandez Almagro, 3.
28029 Madrid, Spain.
Phone: +34 91 732 80 00 #3015
rmartinezg at cnio.es
http://www.cnio.es


