[BioC] Affy PhenoData

James W. MacDonald jmacdon at uw.edu
Wed Apr 18 20:50:15 CEST 2012


Hi Himanshu,

On 4/18/2012 12:27 PM, hsharm03 at students.poly.edu wrote:
> Dear all,
> I have 4 samples from an HT 430mgpm array plate. Three of them are wild-type replicates and 1 is a tumor condition with no replicates. I do not have any other information about how the experiment was conducted, and I cannot figure out how to create the phenoData for it. When I create the expression set from this data using the rma function of the affy library, I can see the sample names as they were, along with the sample numbers 1, 2, 3, 4. As I understand it, all four samples are treated as different, and when I do the differential expression analysis using limma, I try to create the model.matrix using the following command:
>
> design<- model.matrix(~sample, pData(eset))
>
> But as I understand it, this treats the data as 4 samples with 1 condition each. Am I understanding it correctly?

Yes, you are understanding it correctly. But this leads me to a separate 
point, below.
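
To see this concretely, here is a minimal sketch in base R. It assumes
pData(eset) holds a single 'sample' column numbered 1 to 4; the data
frame 'pd' below is a made-up stand-in for illustration, not your actual
phenoData.

```r
# Hypothetical stand-in for pData(eset): one row per array, numbered 1 to 4
pd <- data.frame(sample = 1:4)

# 'sample' as a numeric column: an intercept plus one slope term
design_numeric <- model.matrix(~sample, pd)

# 'sample' as a factor: one level per array, hence one column per array
design_factor <- model.matrix(~factor(sample), pd)

print(design_numeric)
print(design_factor)
```

Either way each array carries its own value, so nothing in this design
groups the three wild types together. In the factor version there are as
many columns as arrays, which leaves nothing over to estimate
variability.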

> If so, what should I do to get the differential expression of genes in the tumor as compared to the 3 wild-type replicates that I have?

The simple answer is that you should use the correct input to 
model.matrix(), designed for your experiment. I realize that is a vague 
and wholly unsatisfying answer, but you have arrived at the point that 
all long-term R users reach, when they either decide to figure stuff 
out themselves or become disillusioned and give up.

It would be simple for me to tell you exactly what you need to do for 
this step in your analysis. And then answer the next question, and the 
one after that. But that helps no one.

If you are really going to analyze your own data (not recommended IMO, 
but that's the beauty of Open Source software - we get both the rope and 
the tree, and are free to hang ourselves) you will have to learn how to 
figure out both what you should be doing, and how to do it.

So the best advice I can offer is to recommend that you find a local 
statistician to help with your analysis. Barring that, you should 
closely read the limma User's Guide, probably 'An Introduction to R', 
and certainly look at ?formula, ?model.matrix, and ?factor. The Bioconductor 
Case Studies contain many useful examples. There are also any number of 
presentations given over the years that you could find via the BioC 
website, or with good Google skills.

Best,

Jim


>
> I am very new to this field, so I am not sure how to proceed.
> Any help will be much appreciated.
> Thanks,
> Himanshu Sharma.

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


