[BioC] How to analyze my hg 1.1 st array
James W. MacDonald
jmacdon at uw.edu
Thu Nov 7 17:14:23 CET 2013
On Wednesday, November 06, 2013 8:22:12 PM, Jerry Cholo wrote:
> Hi everyone,
> I am new to using oligo with Bioconductor. I have some basic questions about
> how to analyze my hg 1.1 st array. I used the following commands:
> celFiles <- list.celfiles()
> Data <- read.celfiles(celFiles)
> ppData <- rma(Data)
> expData <- exprs(ppData);
> write.csv(expData, file = "MyData.csv");
> When I looked at the boxplots of ppData and expData, I noticed
> that ppData was nicely normalized and showed a completely normal
> distribution whereas expData had huge outliers.
The only difference between the boxplots of ppData and expData is that in
the first instance you were only plotting 10,000 rows of your expression
data, whereas in the second instance you plotted all of the data.
> I) Which one is the output data? ppData, or expData?
I don't know what you mean by 'output data'. The ppData object is an
ExpressionSet that contains your summarized expression values, along
with other data describing the experiment, whereas expData is simply
the matrix of expression values you got from the ExpressionSet.
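To make the relationship concrete, here is a small sketch that builds a toy ExpressionSet by hand with Biobase. The probeset names, sample names, and group labels are invented for illustration; in your case rma() has already built the object for you.

```r
library(Biobase)

# Toy expression matrix: 5 hypothetical probesets x 4 hypothetical samples.
set.seed(1)
mat <- matrix(rnorm(20, mean = 7), nrow = 5,
              dimnames = list(paste0("probe", 1:5), paste0("sample", 1:4)))

# Made-up sample annotation; row names must match the matrix column names.
pheno <- data.frame(group = c("ctrl", "ctrl", "trt", "trt"),
                    row.names = colnames(mat))

eset <- ExpressionSet(assayData = mat,
                      phenoData = AnnotatedDataFrame(pheno))

# exprs() pulls the plain matrix back out; pData() returns the sample table.
all.equal(exprs(eset), mat)
pData(eset)
```

So the matrix is one piece of the ExpressionSet, and the ExpressionSet keeps that piece together with the sample information.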
> II) Should I apply limma on expData or ppData?
The limma package can use either. This is covered in detail in both the
limma User's Guide and the help page for lmFit(). I would
recommend using the ExpressionSet, as it is designed specifically to
contain these sorts of data, whereas a matrix is, well, just a matrix.
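A minimal sketch of both routes, again on simulated data: the two-group layout, the sample names, and the design are assumptions for the example, not your experiment. It fits the same model once from the ExpressionSet and once from the bare matrix and checks that the coefficients agree.

```r
library(Biobase)
library(limma)

# Simulated data: 100 hypothetical probesets, 3 control and 3 treated samples.
set.seed(1)
mat <- matrix(rnorm(600, mean = 7), nrow = 100,
              dimnames = list(paste0("probe", 1:100), paste0("sample", 1:6)))
pheno <- data.frame(group = factor(rep(c("ctrl", "trt"), each = 3)),
                    row.names = colnames(mat))
eset <- ExpressionSet(assayData = mat, phenoData = AnnotatedDataFrame(pheno))

# Design matrix built from the sample table stored in the ExpressionSet.
design <- model.matrix(~ group, data = pData(eset))

# lmFit() accepts either the ExpressionSet or the matrix extracted from it.
fit_eset <- eBayes(lmFit(eset, design))
fit_mat  <- eBayes(lmFit(exprs(eset), design))

# Same numbers either way.
all.equal(fit_eset$coefficients, fit_mat$coefficients)

# Top probesets for the second coefficient (treated versus control).
topTable(fit_eset, coef = 2, number = 5)
```

The results match, but with the ExpressionSet the sample information travels with the data, which makes it harder to get the design and the columns out of step.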
> III) How could I prepare the data for limma? May I use a .csv file to
> start limma analysis?
Again, covered in the limma User's Guide. All Bioconductor packages
come with vignettes, which are intended to show general workflows, as
well as help pages for every function you might need to use. I would
recommend perusing both.
While you could hypothetically use a .csv file to start the limma
analysis, I can't see why you would want to. There is no profit in
reading a bunch of data into R, processing it, then writing it to disk
only to read it back in again for the next step.
The underlying principle behind Bioconductor is to give people a
coherent framework of data structures that are intended both to hold
these sorts of data and to let one process those data seamlessly,
without having to do a bunch of extra steps.
James W. MacDonald, M.S.
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099