[BioC] Re : Cox Model
J.J.Goeman at lumc.nl
Thu Feb 14 12:10:14 CET 2008
Dear Eleni,
If you are interested in prediction of survival with (a subset of) your 18000 genes, you may want to have a look at the "penalized" package on CRAN (http://cran.us.r-project.org/src/contrib/Descriptions/penalized.html) or other packages there that do penalized estimation.
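As a rough sketch of what a penalized Cox fit could look like, here is a small example on simulated data. It assumes the "penalized" and "survival" packages are installed; the object names genes, time and relapse simply follow the original post, the data are made up, and the value of lambda1 is arbitrary rather than tuned.

```r
library(survival)   # provides Surv()
library(penalized)  # provides penalized() and its coefficients() method

set.seed(1)
n <- 80    # patients, as in the original post
p <- 200   # simulated genes (kept far below 18000 so the sketch runs quickly)

# Simulated expression matrix: one row per tumor, one column per gene
genes <- matrix(rnorm(n * p), nrow = n, ncol = p,
                dimnames = list(NULL, paste0("g", seq_len(p))))

# Simulated survival times whose hazard depends on the first two genes only
time <- rexp(n, rate = 0.1 * exp(0.8 * genes[, 1] - 0.8 * genes[, 2]))
# Simulated event indicator: 1 = relapse observed, 0 = censored
relapse <- rbinom(n, size = 1, prob = 0.7)

# L1-penalized (lasso) Cox regression; lambda1 = 5 is an arbitrary choice
fit <- penalized(Surv(time, relapse), penalized = genes,
                 lambda1 = 5, model = "cox", trace = FALSE)

# Genes whose coefficients were not shrunk to exactly zero
selected <- coefficients(fit, "nonzero")
print(names(selected))
```

In practice lambda1 would be chosen by cross-validation rather than fixed by hand; the package offers optL1() for that purpose. The fit never forms an unpenalized model on all covariates at once, which is what fails when there are far more genes than patients.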
Jelle
> -----Original Message-----
> From: Eleni Christodoulou [mailto:elenichri at gmail.com]
> Sent: 13 February 2008 13:22
> To: phguardiol at aol.com
> Cc: rdiaz at cnio.es; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Re : Cox Model
>
> Hi,
>
> Thanks for the replies. I will probably try to perform
> survival analysis on each of the genes to get gene-wise
> p-values and then select the most significant (the ones that
> are below a certain p-value) and proceed to a full Cox
> regression using the significant genes. Do you think that
> this makes sense?
>
> Thanks a lot,
> Eleni
>
> On Feb 13, 2008 2:11 PM, <phguardiol at aol.com> wrote:
>
> > Hi,
> > wouldn't it make sense to first reduce the dimensionality of the
> > data before undertaking such a survival analysis? Certainly, some
> > of your genes have similar expression profiles across samples...?
> > Best,
> > Philippe Guardiola
> >
> >
> > -----Original Message-----
> > From: Ramon Diaz-Uriarte <rdiaz at cnio.es>
> > To: bioconductor at stat.math.ethz.ch
> > Cc: Eleni Christodoulou <elenichri at gmail.com>
> > Sent: Wed, 13 February 2008 11:23
> > Subject: Re: [BioC] Cox Model
> >
> > Dear Eleni,
> >
> > You are trying to fit a model with 18000 covariates but only 80
> > samples (of which, at most, only 80 are not censored). Just doing it
> > the way you are trying to do it is unlikely to work or make much
> > sense...
> >
> > You might want to take a look at the work of Torsten Hothorn and
> > colleagues on survival ensembles, with implementations in the R
> > package mboost, and their work on random forests for survival data
> > (see R package party). Some of this functionality is also accessible
> > through our web-based tool SignS (http://signs.bioinfo.cnio.es),
> > which uses the above packages.
> >
> > Depending on your exact question, you might also want to look at the
> > approach of Jelle Goeman, for testing whether sets of genes (e.g.,
> > your complete set of 18000 genes) are related to the outcome of
> > interest (survival in your case). Goeman's approach is available in
> > the globaltest package from BioC.
> >
> > Hope this helps,
> >
> > R.
> >
> > On Wednesday 13 February 2008 08:10, Eleni Christodoulou wrote:
> > > Hello BioC-community,
> > >
> > > It's been a week now that I am struggling with the implementation
> > > of a Cox model in R. I have 80 cancer patients, so 80 time
> > > measurements and 80 relapse-or-not measurements (the censoring
> > > indicator: 1 if relapsed over the examined period, 0 if not). My
> > > microarray data contain around 18000 genes. So I have the
> > > expressions of 18000 genes in each of the 80 tumors (matrix
> > > 80*18000). I would like to build a Cox model in order to retrieve
> > > the most significant genes (according to the p-value). The command
> > > that I am using is:
> > >
> > > test1 <- list(time,relapse,genes)
> > > coxph( Surv(time, relapse) ~ genes, test1)
> > >
> > > where time is a vector of size 80 containing the times, relapse is
> > > a vector of size 80 containing the relapse values and genes is a
> > > matrix 80*18000. When I give the coxph command I get an error
> > > saying that it cannot allocate a vector of size 2.7Mb (in Windows).
> > > I also tried Linux and then I receive an error that maximum memory
> > > is reached. I increased the memory by initializing R with the
> > > command:
> > >
> > > R --min-vsize=10M --max-vsize=250M --min-nsize=1M --max-nsize=200M
> > >
> > > I think it cannot get better than that because if I try for example
> > > max-vsize=300 the memory capacity is stored as NA.
> > >
> > > Does anyone have any idea why this happens and how I can overcome
> > > it?
> > >
> > > I would be really grateful if you could help!
> > > It has been bothering me a lot!
> > >
> > > Thank you all,
> > > Eleni
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at stat.math.ethz.ch
> > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > Search the archives:
> > > http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
> >
> > --
> > Ramón Díaz-Uriarte
> > Statistical Computing Team
> > Centro Nacional de Investigaciones Oncológicas (CNIO)
> > (Spanish National Cancer Center)
> > Melchor Fernández Almagro, 3
> > 28029 Madrid (Spain)
> > Fax: +-34-91-224-6972
> > Phone: +-34-91-224-6900
> > http://ligarto.org/rdiaz
> > PGP KeyID: 0xE89B3462
> > (http://ligarto.org/rdiaz/0xE89B3462.asc)
> >
>
> [[alternative HTML version deleted]]
>
>
>