[BioC] gains and losses via mode shifting

Oscar Rueda omrueda at cnio.es
Wed Jul 2 16:17:39 CEST 2008


Yes, I have always thought that was the more plausible biological reason.
The other (statistical) possibility is that the noise distribution for
normal-copy probes is not Gaussian, so depending on its shape a single
smoothed mean might not be the best summary.
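
To make the biological explanation concrete (a heterogeneous mixture of
cells, as raised in the quoted message below), here is a minimal Python
sketch of the log-ratio one might expect for a segment measured on a mixture
of subclones. The function name expected_log2_ratio, its parameters and the
example fractions are illustrative assumptions, not part of any package
discussed in this thread. It also assumes an idealized array without noise
or signal compression, with log-ratios centred on the baseline ploidy.

```python
import math


def expected_log2_ratio(subclones, baseline_copies=4.0):
    """Expected log2 ratio of a segment in a mixture of cell populations.

    subclones: list of (fraction, copy_number) pairs; fractions must sum to 1.
    baseline_copies: copy number that maps to a log-ratio of zero after
        centring (4 for the hypothetical tetraploid tumor assumed here).
    """
    total_fraction = sum(fraction for fraction, _ in subclones)
    if not math.isclose(total_fraction, 1.0):
        raise ValueError("subclone fractions must sum to 1")
    # The array measures the average copy number over all cells in the sample.
    mean_copies = sum(fraction * copies for fraction, copies in subclones)
    return math.log2(mean_copies / baseline_copies)


# Hypothetical case 1: every cell has lost one copy of the chromosome.
pure_loss = expected_log2_ratio([(1.0, 3)])
# Hypothetical case 2: only half of the cells carry the loss.
mixed_loss = expected_log2_ratio([(0.5, 3), (0.5, 4)])

print(f"pure loss:  {pure_loss:.3f}")
print(f"mixed loss: {mixed_loss:.3f}")
```

Under these assumptions the mixed sample gives a segment mean of about
-0.193, between zero and the roughly -0.415 of a pure single-copy loss, so
a whole-chromosome level need not correspond to an integer copy number.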

Oscar


On Wed, 02 Jul 2008 15:57:16 +0200, Benjamin Otto  
<b.otto at uke.uni-hamburg.de> wrote:

> Hmm, maybe you can help me understand that point a little better. I'm
> still not sure I really understand what I see in this sample.
>
> Let me assume, even if it might not be true, that we are talking about
> tetraploid tumor cells. I take tetraploids because they give a somewhat
> wider range of loss levels, so the level changes might not always be so
> clear. From a technical point of view, if I don't have any gains or
> losses, then I would expect all the segment means to be on one level,
> right? That's because the oligos for all chromosomes are distributed
> over the whole chip, so any noise should apply to all chromosomes
> equally. There should be no bias "per complete chromosome" in terms of
> the physical position of oligos on the chip, hybridization quality or
> affinity, or dye effects. All of these should apply equally to all
> chromosomes. How can I observe clear shifts right on the border between
> chromosomes, even if they are small, that would not correspond to a
> biological difference in copy number? Why should the break be right
> between the individual chromosomes? Is there a systematic technical
> effect that can result in such profiles?
>
> The only thing that comes to my mind is a heterogeneous mixture of cells
> that have different copy numbers for certain chromosomes. A segment mean
> would then not correspond to a defined number of copies but to something
> in between. But is there another explanation?
>
>
> Best regards,
>
> Benjamin
>
>
>
>
> -----Original Message-----
> From: Oscar Rueda [mailto:omrueda at cnio.es]
> Sent: Wednesday, July 02, 2008 11:10 AM
> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
> Subject: Re: AW: AW: [BioC] gains and losses via mode shifting
>
> Well, setting aside biological reasons for having these two modes, from a
> statistical point of view there is no problem in having two normal levels.
> In the case of Gaussian mixtures, this could occur if, for example, the
> distribution of the normal probes had negative kurtosis, so that two
> normal distributions would be needed to model it. In the case of DNAcopy
> it is not so clear, because it is just a smoothing method, but what I
> would do is consider both levels as normal levels if mergeLevels does not
> merge them.
>
> Best,
>
> Oscar M. Rueda
> Structural Computational Biology Group
> Spanish National Cancer Centre (CNIO)
> Madrid, SPAIN.
>
>
>
>
> On Tue, 01 Jul 2008 18:17:39 +0200, Benjamin Otto
> <b.otto at uke.uni-hamburg.de> wrote:
>
>> I'm not sure whether changing to one of these methods will solve my
>> problem. Here is one of the samples I mean. The first picture shows the
>> CBS segmentation. The second displays the segments only on the left and
>> the density distribution of the segments on the right. The segments are
>> colored black at their original level and red after shifting by the
>> mode of the highest peak of the distribution.
>>
>> However, the distribution is what troubles me! I do agree with Sean that
>> usually the lower mode seems preferable. But this distribution looks
>> nearly mirrored about the y-axis. Have a look at the log-ratios and the
>> segments. Even if you merge some of the smaller segments that lie close
>> together, you will end up with a similar distribution of segments on
>> both sides of the x-axis.
>>
>> Or do I misjudge the power of the methods you mentioned?
>>
>> If higher picture quality is needed, send me a note.
>>
>> Thanks for your replies so far. :)
>>
>> Best regards,
>>
>> Benjamin
>>
>>
>> -----Original Message-----
>> From: Oscar Rueda [mailto:omrueda at cnio.es]
>> Sent: Tuesday, July 01, 2008 5:13 PM
>> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
>> Subject: Re: AW: [BioC] gains and losses via mode shifting
>>
>> Well, smoothed means are difficult to translate into alterations such as
>> 'loss' or 'gain'. They are on the scale of the log-ratios, so they depend
>> a lot on the variability of each array. I wouldn't expect the same levels
>> across a set of arrays, even if they are normalized. The main problem with
>> methods such as DNAcopy is that you only have a smoothed mean, but not a
>> measure of the precision of that mean. You can use the MergeLevels
>> algorithm (Willenbrock and Fridlyand 2005) to reduce the number of
>> possible smoothed means in a hypothesis-test fashion until you only have
>> levels for 'loss', 'normal' and 'gain', but this approach does not always
>> produce good results.
>> Methods based on hidden Markov models, such as the aCGH package, BioHMM or
>> our package RJaCGH, use hidden states and Gaussian distributions to
>> represent copy numbers. Even in this case, each state does not have to
>> correspond to a different biological copy number, because we are fitting a
>> mixture of normal distributions, and if the normal probes have a skewed
>> distribution we will need several components to model that distribution.
>> But in this case we can use the means and the variances of these states to
>> infer whether they are well above zero (in which case we could classify
>> them as gains) or well below zero (in which case we could classify them as
>> losses). This is what the relabelStates() algorithm in our RJaCGH package
>> does.
>>
>> Hope this helps,
>>
>> Oscar M. Rueda
>> Structural Computational Biology Group
>> Spanish National Cancer Centre (CNIO)
>> Madrid, SPAIN.
>>
>>
>>
>> On Tue, 01 Jul 2008 12:54:59 +0200, Benjamin Otto
>> <b.otto at uke.uni-hamburg.de> wrote:
>>
>>> The log-ratios are loess-normalized with limma, and the
>>> smoothing/segmentation is done with DNAcopy.
>>>
>>>
>>> The problem is that some of the samples seem to belong to maniac tumors.
>>> The intriguing point for some samples is not really chromosomes 1-3 (I
>>> only use them as a kind of clue) but rather that I observe two possible
>>> baselines, which show nearly comparable peaks in my density function.
>>> Each of them looks as if it could be the real zero line, but I don't know
>>> which one. If I used a criterion like 2*SD(50% quantile) for the
>>> detection of gains or losses, then the shift direction would make a
>>> difference.
>>>
>>>
>>> Benjamin
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Oscar Rueda [mailto:omrueda at cnio.es]
>>> Sent: Tuesday, July 01, 2008 11:46 AM
>>> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
>>> Subject: Re: [BioC] gains and losses via mode shifting
>>>
>>> Dear Benjamin,
>>>
>>> I'm not sure I understand your problem correctly, but are your samples
>>> normalized to have the same median?
>>>
>>> Oscar M. Rueda
>>> Structural Computational Biology Group
>>> Spanish National Cancer Centre (CNIO)
>>> Madrid, SPAIN.
>>>
>>> On Mon, 30 Jun 2008 13:02:46 +0200, Benjamin Otto
>>> <b.otto at uke.uni-hamburg.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> After the segmentation of CGH data, in some papers the results are
>>>> frequently shifted by the density mode; to be more precise, the mode of
>>>> the highest peak is used. However, this procedure depends on the
>>>> condition that there is clearly one prominent peak dominating the
>>>> density function.
>>>>
>>>> Currently, in some of my samples, I have the problem of two prominent
>>>> peaks flanking the y-axis, which makes the decision about the correct
>>>> shift direction a difficult one. Moreover, in some cases a shift in one
>>>> direction seems obvious, in other cases a shift in the other direction
>>>> seems preferable, and in a third group the preference is not quite
>>>> clear. But in all groups the segmentation profile in chromosomes 1-3 is
>>>> nearly identical, which suggests that I observe the same gain or loss
>>>> (depending on the shift direction) in all these samples.
>>>>
>>>> Does anyone have an idea how to assess this problem and how to solve
>>>> it? Is there another frequently used procedure besides density mode
>>>> shifting for such data?
>>>>
>>>> I have pictures of some samples displaying the problem, but they are too
>>>> big for the mailing list. Is there an official repository I can upload
>>>> them to?
>>>>
>>>> Thanks in advance, best regards,
>>>>
>>>> Benjamin
>>>>
>>>> ======================================
>>>> Benjamin Otto
>>>> University Hospital Hamburg-Eppendorf
>>>> Institute For Clinical Chemistry
>>>> Martinistr. 52
>>>> D-20246 Hamburg
>>>>
>>>> Tel.: +49 40 42803 1908
>>>> Fax.: +49 40 42803 4971
>>>> ======================================
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los
>>> ficheros adjuntos, pueden contener información protegida para el uso
>>> exclusivo de su destinatario. Se prohíbe la distribución, reproducción  
>>> o
>>> cualquier otro tipo de transmisión por parte de otra persona que no sea
>>> el
>>> destinatario. Si usted recibe por error este correo, se ruega
>>> comunicarlo al
>>> remitente y borrar el mensaje recibido.
>>> **CONFIDENTIALITY NOTICE** This email communication and any attachments
>>> may
>>> contain confidential and privileged information for the sole use of the
>>> designated recipient named above. Distribution, reproduction or any
>>> other
>>> use of this transmission by any party other than the intended recipient
>>> is
>>> prohibited. If you are not the intended recipient please contact the
>>> sender
>>> and delete all copies.
>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>
>
>
>
>
>



-- 
Oscar M. Rueda
Structural Computational Biology Group
Spanish National Cancer Centre (CNIO)
Madrid, SPAIN.



