[BioC] how to combine microarray data and phenotype data into a least squares analysis?

James W. MacDonald jmacdon at med.umich.edu
Tue Sep 30 14:25:48 CEST 2008

Hi Martin,

Martin Bonke wrote:
> Hello everyone,
> I am looking for some help in setting up an analysis protocol in R for my
> microarray dataset. My knowledge of R is still somewhat rudimentary, but,
> having worked with it for about half a year now I do understand the basics
> and can get most of the packages that I've needed to work. However, the past
> week I've been stumped on a certain analysis that I would like to perform on
> my results.
> My dataset consists of microarrays of RNAi experiments that affect the cell
> cycle. Part of the results is a phenotypic analysis, where I have the
> percentages of cells in the different stages of the cell cycle. Now I would
> like to link this phenotypic data with the microarray data and find out
> whether the expression of genes is linked with a certain stage of the cell
> cycle. So, currently I have the matrix of all my microarray data, where the
> columns are the experiments, and the rows are the genes, and the values are
> their log-fold differences compared to wild type. I also have vectors that
> contain for each experiment the percentage of cells in a specific stage of
> the cell cycle (a vector for G1, one for G2, etc).
> Now I am quite at a loss on how to link these two together. It was suggested
> that I use a least squares analysis, and I've been trying to make lsfit() work
> for my data, but so far without luck. The documentation for these functions
> is generally rather hard for me to understand, and I have not managed to find
> descriptive guides on how to do something like this, probably because I am
> not really sure what this kind of analysis is called.

I doubt you want to use lsfit() for this analysis. More likely you
simply want to fit a linear model to your data. You could use lm() for
this (note that the help page for lsfit() itself points you to lm() as
usually preferable), but it would be much easier to use limma, which
fits the same linear model to every gene in your matrix in one call.
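
For illustration only, here is a minimal sketch of the lm() route in base R.
The data are simulated stand-ins, and the names expr and g1 are assumptions
rather than anything from Martin's actual workspace. It relies on the fact
that lm() accepts a matrix response with one column per gene, so the
genes-by-experiments matrix has to be transposed first.

```r
# Simulated stand-in data (hypothetical): 5 genes x 8 RNAi experiments.
set.seed(1)
g1 <- c(35, 42, 50, 58, 61, 66, 70, 77)   # % of cells in G1, one per experiment
expr <- matrix(rnorm(5 * 8, sd = 0.2), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("exp", 1:8)))
# Make gene1 track the G1 percentage so there is something to find.
expr["gene1", ] <- expr["gene1", ] + 0.05 * (g1 - mean(g1))

# lm() treats each column of a matrix response as a separate regression,
# so transpose to get experiments in rows and genes in columns.
fit <- lm(t(expr) ~ g1)

# coef() returns a matrix: rows "(Intercept)" and "g1", one column per gene.
slopes <- coef(fit)["g1", ]

# Per-gene p-values for the g1 slope, taken from the per-response summaries.
pvals <- sapply(summary(fit), function(s) coef(s)["g1", "Pr(>|t|)"])
```

Each slope is the estimated change in log-fold difference per percentage
point of cells in G1. With real data the p-values would still need a
multiple-testing correction, for example p.adjust(pvals, method = "BH").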

I'm not sure if there is an example of a 'standard' linear model (that
is, a regression on a continuous covariate) in the limma User's Guide,
but the only real difference between most of the examples there and what
you will want to do is to put your cell cycle percentages into the design
matrix directly as numeric covariates, rather than converting them to
factors first.
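
As a rough sketch of what that could look like in limma, assuming a
genes-by-experiments matrix called expr and a numeric vector g1 of G1
percentages (both names, and the simulated data, are hypothetical):

```r
library(limma)

# Simulated stand-in data (hypothetical): 50 genes x 8 RNAi experiments.
set.seed(1)
g1 <- c(35, 42, 50, 58, 61, 66, 70, 77)   # % of cells in G1, one per experiment
expr <- matrix(rnorm(50 * 8, sd = 0.2), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("exp", 1:8)))
# Make gene1 track the G1 percentage so there is something to find.
expr["gene1", ] <- expr["gene1", ] + 0.05 * (g1 - mean(g1))

# g1 is left numeric, so the design matrix gets an intercept column and
# a single slope column named "g1" (no factor levels involved).
design <- model.matrix(~ g1)

fit <- lmFit(expr, design)   # gene-wise linear models
fit <- eBayes(fit)           # moderated t-statistics

# Table of all genes for the g1 slope; logFC here is the estimated slope.
top <- topTable(fit, coef = "g1", number = Inf)
```

The same pattern would apply to the G2 vector or any other stage: swap the
covariate in model.matrix() and pass the matching coefficient name to
topTable().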

Read the limma User's Guide and let us know if you still have questions.



> I hope that someone out here understands what I am trying to do and perhaps
> can give me a hint or two on what I should be looking into. 
> Many thanks in advance.
> Martin

James W. MacDonald, M.S.
Hildebrandt Lab
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
