[BioC] Combining replicate spots in CGH data

Ramon Diaz-Uriarte rdiaz at cnio.es
Thu Dec 7 15:31:35 CET 2006


On Thursday 07 December 2006 13:55, João Fadista wrote:
> Dear Ramon,
>
> Thanks for the insights about the replicate spots.
>
> About the RJaCGH package, I would like to know what are the main features
> of your heterogeneous HMM algorithm. I am asking this because I would like
> to compare it with the only other heterogeneous HMM algorithm that I know
> that was made for CGH analysis.
>
> This algorithm is implemented in the snapCGH package and is called BioHMM.
> It incorporates the distance between clones into the model, assigning a
> higher probability of state change to clones that are a larger distance
> apart on a chromosome.
>
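
To make the distance idea quoted above concrete, here is a minimal sketch in Python. It is not BioHMM's (or RJaCGH's) actual parameterization: the functional form 1 - exp(-d/scale), the `scale` parameter, and the function name `transition_matrix` are assumptions chosen only for illustration. The only point is that the probability of leaving a state grows with the distance between adjacent clones.

```python
import math

def transition_matrix(base, distance, scale=1e6):
    """Blend "no change" (identity) with a base transition matrix.

    base     : row-stochastic matrix (list of lists), approached for distant clones
    distance : distance between two adjacent clones (hypothetical unit: bp)
    scale    : hypothetical length scale; larger values mean slower decay
    """
    # Weight in [0, 1): 0 for overlapping clones, approaching 1 for distant ones.
    w = 1.0 - math.exp(-distance / scale)
    k = len(base)
    return [[w * base[i][j] + (1.0 - w) * (1.0 if i == j else 0.0)
             for j in range(k)]
            for i in range(k)]

base = [[0.8, 0.2], [0.3, 0.7]]          # made-up two-state example
near = transition_matrix(base, 1e4)      # clones 10 kb apart
far = transition_matrix(base, 5e6)       # clones 5 Mb apart
```

With these made-up numbers, the off-diagonal entries of `near` are smaller than those of `far`, and every row still sums to 1.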

We use a Bayesian model fitted with MCMC and reversible jump, and incorporate 
uncertainty via Bayesian Model Averaging. 

There are several differences from BioHMM. First, because RJaCGH uses MCMC, 
BioHMM is a lot faster. In exchange, RJaCGH provides posterior probabilities 
of alteration. Second, we use reversible jump (instead of an AIC-based 
approach, as in BioHMM) to deal with the unknown number of hidden states. I'd 
say these are the main differences. There are also some differences in 
how the non-homogeneous part is implemented, but I'd say these are minor 
compared to the previous ones.
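
As a rough sketch of what Bayesian Model Averaging over the number of hidden states means here (in Python, and not RJaCGH's code): each model contributes its per-clone probability of alteration, weighted by the posterior probability of that model. With reversible jump, that weight is typically estimated as the fraction of MCMC iterations spent in each model. The function name, the dictionary layout, and all numbers below are made up for illustration.

```python
def bma_alteration_probability(model_posteriors, per_model_probs):
    """Average per-clone probabilities of alteration over models.

    model_posteriors : {number_of_states: posterior probability of that model}
    per_model_probs  : {number_of_states: [P(altered | model, data) per clone]}
    """
    total = sum(model_posteriors.values())  # normalize in case weights are counts
    n_clones = len(next(iter(per_model_probs.values())))
    averaged = [0.0] * n_clones
    for k, weight in model_posteriors.items():
        for i, p in enumerate(per_model_probs[k]):
            averaged[i] += (weight / total) * p
    return averaged

# Made-up example: two candidate models, two clones.
model_posteriors = {2: 0.25, 3: 0.75}
per_model_probs = {2: [0.0, 1.0], 3: [0.4, 0.8]}
averaged = bma_alteration_probability(model_posteriors, per_model_probs)
```

In this toy example the averaged probabilities are 0.25*0.0 + 0.75*0.4 = 0.3 for the first clone and 0.25*1.0 + 0.75*0.8 = 0.85 for the second, so no single number of states has to be picked.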

Further details, comparisons with BioHMM (and other methods), etc, are 
provided in the tech. report available from COBRA 
(http://biostats.bepress.com/cobra/ps/art9/) or from my web page 
(http://www.ligarto.org/rdiaz/Papers/rjhmm-report-plus-sup-mat.pdf).


Best,

R.


>
> Best regards
>
> João Fadista
> Ph.d. student
>
>
> Danish Institute of Agricultural Sciences
> Research Centre Foulum
> Dept. of Genetics and Biotechnology
> Blichers Allé 20, P.O. BOX 50
> DK-8830 Tjele
>
> Phone:   +45 8999 1900
> Direct:  +45 8999 8999
>
> E-mail:  Joao.Fadista at agrsci.dk
> Web:	   http://www.agrsci.org
>
>
> -----Original Message-----
> From: Ramon Diaz-Uriarte [mailto:rdiaz at cnio.es]
> Sent: Thursday, December 07, 2006 12:18 PM
> To: bioconductor at stat.math.ethz.ch
> Cc: João Fadista
> Subject: Re: [BioC] Combining replicate spots in CGH data
>
> On Wednesday 06 December 2006 17:12, João Fadista wrote:
> > Dear all,
> >
> > I was wondering if there are methods for combining replicate
> > spots other than the average or the median. I am asking this in
> > connection with CGH data analysis because I do not know how, or if, we
> > can take advantage of the genomic structure of array CGH data when
> > combining replicate spots.
> >
> > For the sake of argument, here are two hypothetical examples:
> > - Combine replicate spots differently depending on the region of the
> >   chromosome or genome they are in;
> > - Or give more weight to spots that we know are more reliable.
> >
> > Something like this if you know what I mean.
>
> Dear Joao,
>
> This is nothing elaborate; just a couple of thoughts.
>
> 1. I assume you mean true replicate spots. In other words, these are the
> exact same DNA piece, and they map to exactly the same locations in the
> chromosome.
>
> 2. Ideally, I'd like a method that can deal with replicate spots without
> even asking you to take the mean or the median. One problem I find with
> means or medians is that, if you do not have the exact same number of
> replicates for all locations, then you are estimating a value that has
> different variances over different locations.
>
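
A small worked example of the variance point in item 2 above, in Python. It assumes independent replicate spots with a common per-spot variance, so that the variance of the mean of n spots is sigma^2 / n. The clone names, replicate counts, and the value of `sigma2` are made up.

```python
def variance_of_mean(sigma2, n_replicates):
    """Variance of the mean of n independent spots, each with variance sigma2."""
    if n_replicates < 1:
        raise ValueError("need at least one replicate")
    return sigma2 / n_replicates

# Hypothetical replicate counts per location (clone names are made up).
replicates_per_location = {"cloneA": 1, "cloneB": 2, "cloneC": 4}
sigma2 = 0.04  # assumed common per-spot variance of the log ratios

variances = {name: variance_of_mean(sigma2, n)
             for name, n in replicates_per_location.items()}
```

Under these assumptions the averaged values have variances of 0.04, 0.02 and 0.01, so a downstream method that assumes a single variance per state sees three different ones.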
> I think (non-homogeneous) HMMs and related techniques are suited for
> dealing with an arbitrary (and varying) number of replicate spots: at
> location "t" you happen to have more than one observation, and you are
> fitting a model where those observed log ratios come from an emission
> function, and so on. By not taking means/medians/whatever, you do not
> violate assumptions related to the variance of the emission functions. In
> other words, conditional on being in state "k", your log ratios are, say,
> ~ N(mu, sigma).
>
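
A minimal sketch of the emission idea above, in Python and not taken from RJaCGH: conditional on the hidden state, each replicate log ratio at a location contributes its own Normal term, so any number of replicates can enter the likelihood without being averaged first. It assumes replicates are independent given the state and treats `sigma` as a standard deviation (the N(mu, sigma) notation above does not say which). The function names are made up.

```python
import math

def normal_logpdf(y, mu, sigma):
    """Log density of N(mu, sigma^2) at y; sigma is a standard deviation."""
    return (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (y - mu) ** 2 / (2.0 * sigma ** 2))

def emission_loglik(replicates, mu, sigma):
    """Log-likelihood of all replicate log ratios at one location,
    given the state's emission parameters (mu, sigma)."""
    return sum(normal_logpdf(y, mu, sigma) for y in replicates)

# Made-up data: one location with a single spot, another with three.
loc_single = [0.52]
loc_triple = [0.48, 0.55, 0.60]

gain_state = (0.5, 0.1)     # hypothetical "gain" state: mu, sigma
normal_state = (0.0, 0.1)   # hypothetical "no change" state

ll_gain = emission_loglik(loc_triple, *gain_state)
ll_normal = emission_loglik(loc_triple, *normal_state)
```

Here `ll_gain` is larger than `ll_normal`, and the location with three spots simply contributes three terms instead of one; the per-spot variance assumption is never altered.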
>
> (I'll admit we have a "hidden agenda" with our RJaCGH package :-)
>
> R.
>
> > Best regards
> >
> > João Fadista
> > Ph.d. student
>

-- 
Ramón Díaz-Uriarte
Bioinformatics 
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://ligarto.org/rdiaz
PGP KeyID: 0xE89B3462
(http://ligarto.org/rdiaz/0xE89B3462.asc)


