[BioC] gains and losses via mode shifting

Sean Davis sdavis2 at mail.nih.gov
Wed Jul 2 21:41:39 CEST 2008


Oscar and Benjamin,

I do not think one needs to attribute the two modes to a statistical
issue such as a non-normal noise distribution.  There are at least two
perfectly plausible biological interpretations of this situation.

1)  Aneuploidy of a large proportion of the genome (but homogeneous population)
2)  Tissue heterogeneity (two or more distinct populations with
different copy number profiles)
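
As a rough illustration of how either interpretation could produce the
pattern discussed in this thread, here is a minimal Python sketch. It
assumes a diploid reference, noise-free log2 ratios, and a normalization
that simply centers the length-weighted mean log2 ratio at zero. The
function names and the example fractions are hypothetical and are not
taken from DNAcopy or any other package.

```python
import math


def raw_log2_ratio(populations, reference_copies=2.0):
    """Expected log2 ratio at one locus for a mixture of cell populations.

    populations: list of (cell_fraction, copy_number) pairs whose
    fractions sum to 1. A homogeneous sample is a single (1.0, n) pair.
    """
    total = sum(fraction for fraction, _ in populations)
    if not math.isclose(total, 1.0):
        raise ValueError("cell fractions must sum to 1")
    # The array measures the average copy number over all cells.
    mean_copies = sum(fraction * copies for fraction, copies in populations)
    return math.log2(mean_copies / reference_copies)


def centered_levels(segments):
    """Segment levels after centering on the length-weighted mean.

    segments: list of (genome_fraction, populations) pairs whose genome
    fractions sum to 1. This centering is an assumed stand-in for
    whatever normalization was actually applied.
    """
    raw = [(genome, raw_log2_ratio(pops)) for genome, pops in segments]
    center = sum(genome * level for genome, level in raw)
    return [level - center for _, level in raw]


# Interpretation 1: homogeneous population, half the genome at 2 copies
# and half at 3 copies.
aneuploid = centered_levels([(0.5, [(1.0, 2)]), (0.5, [(1.0, 3)])])

# Interpretation 2: at one region, half the cells carry 2 copies and
# half carry 3 copies.
mixed = raw_log2_ratio([(0.5, 2), (0.5, 3)])
pure_gain = raw_log2_ratio([(1.0, 3)])

print("aneuploid levels after centering:", aneuploid)
print("mixed-population level:", mixed, "pure 3-copy level:", pure_gain)
```

Under these assumptions, the homogeneous aneuploid sample gives two
levels placed symmetrically around zero, and the mixed population gives
a level between the pure two-copy and three-copy values.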

Sean




On Wed, Jul 2, 2008 at 10:31 AM, Benjamin Otto
<b.otto at uke.uni-hamburg.de> wrote:
> Good point. But if I assume a statistical effect, what would I expect
> for expression arrays of the same sample? Would the same noise
> distribution fit those data? In other words: would this type of
> distribution be a feature of the sample or a feature of the technique
> itself (aCGH-specific)?
>
> Benjamin
>
>
>
> -----Original Message-----
> From: Oscar Rueda [mailto:omrueda at cnio.es]
> Sent: Wednesday, July 02, 2008 4:18 PM
> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
> Subject: Re: AW: AW: AW: [BioC] gains and losses via mode shifting
>
> Yes, I have always thought that was the more plausible biological reason.
> The other (statistical) possibility is that the noise distribution for
> normal-copy probes is not Gaussian, so depending on its shape a single
> smoothed mean might not be the best summary.
>
> Oscar
>
>
> On Wed, 02 Jul 2008 15:57:16 +0200, Benjamin Otto
> <b.otto at uke.uni-hamburg.de> wrote:
>
>> Hmm, maybe you can help me understand that point a little better. I'm
>> still not sure I really understand what I see in this sample.
>>
>> Let me assume, even if it might not be true, that we are talking about
>> tetraploid tumor cells. I take tetraploids in order to have a somewhat
>> wider range of loss levels, so that the level changes might not always
>> be so clear. From a technical point of view, if I don't have any gains
>> or losses, then I would expect all the segment means to be on one
>> level, right? That's because the oligos for all chromosomes are
>> distributed over the whole chip, so any noise should apply to all
>> chromosomes equally. There should be no bias "per complete chromosome"
>> in terms of physical position of the oligos on the chip, hybridization
>> quality or affinity, or dye effects. All of these should apply equally
>> to all chromosomes. How can I observe clear shifts right at the border
>> between chromosomes, even if they are small, that would not correspond
>> to a biological difference in copy number? Why should the break fall
>> exactly between individual chromosomes? Is there a systematic technical
>> effect that can result in such profiles?
>>
>> The only thing that occurs to me is a heterogeneous mixture of cells
>> that have different copy numbers for certain chromosomes. A segment
>> mean would then not correspond to a defined number of copies but to
>> something in between. But is there another explanation?
>>
>>
>> Best regards,
>>
>> Benjamin
>>
>>
>>
>>
>> -----Original Message-----
>> From: Oscar Rueda [mailto:omrueda at cnio.es]
>> Sent: Wednesday, July 02, 2008 11:10 AM
>> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
>> Subject: Re: AW: AW: [BioC] gains and losses via mode shifting
>>
>> Well, setting aside biological reasons to have these two modes, from a
>> statistical point of view there is no problem in having two normal
>> levels. In the case of Gaussian mixtures, this could occur if, for
>> example, the distribution of the normal probes had negative kurtosis,
>> so that two normal distributions would be needed to model it. In the
>> case of DNAcopy it is not so clear, because it is just a smoothing
>> method, but what I would do is consider both levels as normal levels
>> if mergeLevels does not merge them.
>>
>> Best,
>>
>> Oscar M. Rueda
>> Structural Computational Biology Group
>> Spanish National Cancer Centre (CNIO)
>> Madrid, SPAIN.
>>
>>
>>
>>
>> On Tue, 01 Jul 2008 18:17:39 +0200, Benjamin Otto
>> <b.otto at uke.uni-hamburg.de> wrote:
>>
>>> I'm not sure whether changing to one of these methods will solve my
>>> problem. Here is one of the samples I mean. The first picture shows
>>> the CBS segmentation. The second displays the segments only on the
>>> left and the density distribution of the segments on the right. The
>>> segments are colored black at their original level and red after
>>> shifting by the mode of the highest peak of the distribution.
>>>
>>> However, the distribution is what troubles me! I do agree with Sean
>>> that usually the lower mode seems preferable. But this distribution
>>> looks nearly mirrored about the y-axis. Have a look at the log-ratios
>>> and the segments. Even if you merge some of the smaller segments that
>>> lie close together, you will end up with a similar distribution of
>>> segments on both sides of the x-axis.
>>>
>>> Or am I misjudging what the methods you mentioned can do?
>>>
>>> If higher picture quality is needed, send me a note.
>>>
>>> Thanks for your replies so far. :)
>>>
>>> Best regards,
>>>
>>> Benjamin
>>>
>>>
>>> -----Original Message-----
>>> From: Oscar Rueda [mailto:omrueda at cnio.es]
>>> Sent: Tuesday, July 01, 2008 5:13 PM
>>> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
>>> Subject: Re: AW: [BioC] gains and losses via mode shifting
>>>
>>> Well, smoothed means are difficult to translate into alterations such
>>> as 'loss' or 'gain'. They are on the scale of the log-ratios, so they
>>> depend a lot on the variability of each array. I wouldn't expect the
>>> same levels across a set of arrays, even if they are normalized. The
>>> main problem with methods such as DNAcopy is that you only have a
>>> smoothed mean, but not a measure of the precision of that mean. You
>>> can use the MergeLevels algorithm (Willenbrock and Fridlyand 2005) to
>>> reduce the number of possible smoothed means in a hypothesis-test
>>> fashion until you only have levels for 'loss', 'normal' and 'gain',
>>> but this approach does not always produce good results.
>>> Methods based on hidden Markov models, such as the aCGH package,
>>> BIOHMM or our package RJaCGH, use hidden states and Gaussian
>>> distributions to represent copy numbers. Even in this case, each
>>> state does not have to correspond to a different biological copy
>>> number, because we are fitting a mixture of normal distributions, and
>>> if the normal probes have a skewed distribution we will need several
>>> components to model that distribution. But in this case we can use
>>> the means and the variances of these states to infer whether they are
>>> well above zero (in which case we could classify them as gains) or
>>> well below zero (in which case we could classify them as losses).
>>> This is what our algorithm relabelStates() in the RJaCGH package does.
>>>
>>> Hope this helps,
>>>
>>> Oscar M. Rueda
>>> Structural Computational Biology Group
>>> Spanish National Cancer Centre (CNIO)
>>> Madrid, SPAIN.
>>>
>>>
>>>
>>> On Tue, 01 Jul 2008 12:54:59 +0200, Benjamin Otto
>>> <b.otto at uke.uni-hamburg.de> wrote:
>>>
>>>> The log-ratios are loess normalized with limma and the
>>>> smoothing/segmentation is done with DNAcopy.
>>>>
>>>>
>>>> The problem is that some of the samples seem to belong to maniac
>>>> tumors. The intriguing point for some samples is not really
>>>> chromosomes 1-3 (I only use them as a kind of clue), but rather that
>>>> I observe two possible baselines which show nearly comparable peaks
>>>> in my density function. Each of them looks as if it could be the real
>>>> zero line, but I don't know which one. If I used some criterion like
>>>> 2*SD(50% quantile) for detection of gains or losses, then the shift
>>>> direction would make a difference.
>>>>
>>>>
>>>> Benjamin
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Oscar Rueda [mailto:omrueda at cnio.es]
>>>> Sent: Tuesday, July 01, 2008 11:46 AM
>>>> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
>>>> Subject: Re: [BioC] gains and losses via mode shifting
>>>>
>>>> Dear Benjamin,
>>>>
>>>> I'm not sure I understand your problem correctly, but are your
>>>> samples normalized to have the same median?
>>>>
>>>> Oscar M. Rueda
>>>> Structural Computational Biology Group
>>>> Spanish National Cancer Centre (CNIO)
>>>> Madrid, SPAIN.
>>>>
>>>> On Mon, 30 Jun 2008 13:02:46 +0200, Benjamin Otto
>>>> <b.otto at uke.uni-hamburg.de> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> In some papers, the results of CGH segmentation are frequently
>>>>> shifted by the density mode. To be more precise, the mode of the
>>>>> highest peak is used. However, this procedure depends on the
>>>>> condition that there is clearly one prominent peak dominating the
>>>>> density function.
>>>>>
>>>>> Currently, in some of my samples, I have the problem of two
>>>>> prominent peaks flanking the y-axis, which makes the decision about
>>>>> the correct shift direction a difficult one. Moreover, in some of
>>>>> the cases a shift in one direction seems obvious, in other cases a
>>>>> shift in the other direction seems preferable, and in a third group
>>>>> the preference is not quite clear. But in all groups the
>>>>> segmentation profile in chromosomes 1-3 is nearly identical, which
>>>>> suggests that I observe the same gain or loss (depending on the
>>>>> shift direction) in all these samples.
>>>>>
>>>>> Does anyone have an idea how to assess this problem and how to solve
>>>>> it? Is there another frequently used procedure besides density mode
>>>>> shifting for such data?
>>>>>
>>>>> I do have pictures of some samples displaying the problem, but they
>>>>> are too big for the mailing list. Is there an official repository I
>>>>> can upload them to?
>>>>>
>>>>> Thanks in advance, best regards,
>>>>>
>>>>> Benjamin
>>>>>
>>>>> ======================================
>>>>> Benjamin Otto
>>>>> University Hospital Hamburg-Eppendorf
>>>>> Institute For Clinical Chemistry
>>>>> Martinistr. 52
>>>>> D-20246 Hamburg
>>>>>
>>>>> Tel.: +49 40 42803 1908
>>>>> Fax.: +49 40 42803 4971
>>>>> ======================================
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los
>>>> ficheros adjuntos, pueden contener información protegida para el uso
>>>> exclusivo de su destinatario. Se prohíbe la distribución, reproducción
>>>> o
>>>> cualquier otro tipo de transmisión por parte de otra persona que no sea
>>>> el
>>>> destinatario. Si usted recibe por error este correo, se ruega
>>>> comunicarlo al
>>>> remitente y borrar el mensaje recibido.
>>>> **CONFIDENTIALITY NOTICE** This email communication and any attachments
>>>> may
>>>> contain confidential and privileged information for the sole use of the
>>>> designated recipient named above. Distribution, reproduction or any
>>>> other
>>>> use of this transmission by any party other than the intended recipient
>>>> is
>>>> prohibited. If you are not the intended recipient please contact the
>>>> sender
>>>> and delete all copies.
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
> --
> Oscar M. Rueda
> Structural Computational Biology Group
> Spanish National Cancer Centre (CNIO)
> Madrid, SPAIN.
>
>
>
>
>

