[Bioc-devel] [BioC] read.phenoData vs read.AnnotatedDataFrame

Thu Aug 9 03:12:52 CEST 2007

Hi all,

I am taking this discussion from the main list to bioc-devel. I think 
Alice has made a good point in describing a confusing behaviour of 
ReadAffy (trying to be smart) that we might want to fix.

Best wishes
   Wolfgang

------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber

Johnstone, Alice wrote:
> For interest's sake, I have found out why I wasn't getting my expected
> results when using read.AnnotatedDataFrame.
> It turns out the error was in the ReadAffy command, where I specified
> the filenames to be read from my AnnotatedDataFrame object.  There was
> a typo: a capital N ($FileName) rather than the lowercase n
> ($Filename) used in my target file... whoops.  However, this meant the
> filenames argument was ignored without any error message(!), and
> instead of using the information in the AnnotatedDataFrame object
> (which included the filenames, but not in alphabetical order) it read
> the .cel files in alphabetical order from the working directory.  Hence
> the files were given the wrong labels (the labels follow the order of
> the AnnotatedDataFrame object) and my comparisons were confused,
> without it being obvious why or where.
> Our solution: after correcting $Filename, pass the filenames through
> as.character so that each file is assigned to the correct target, now
> that we are using read.AnnotatedDataFrame rather than read.phenoData.
> 
> Data<-ReadAffy(filenames=as.character(pData(pd)$Filename),phenoData=pd)
> 
> Hurrah!
> 
> It may be beneficial to others if, when the filenames argument isn't
> specified, the filenames were read from the phenoData object whenever
> it includes them.
> 
> Thanks!
> 
> -----Original Message-----
> From: Martin Morgan [mailto:mtmorgan at fhcrc.org] 
> Sent: Thursday, 26 July 2007 11:49 a.m.
> To: Johnstone, Alice
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
> 
> Hi Alice --
> 
> "Johnstone, Alice" <Alice.Johnstone at esr.cri.nz> writes:
> 
>> Using R 2.5.0 and Bioconductor I have been following code to analyse
>> Affymetrix expression data: 2 treatments vs control.  The original
>> code was run last year and used the read.phenoData command; however,
>> with the newer version I get these warnings:
>>
>> Warning messages:
>> read.phenoData is deprecated, use read.AnnotatedDataFrame instead
>> The phenoData class is deprecated, use AnnotatedDataFrame (with
>> ExpressionSet) instead
>>  
>> I use the read.AnnotatedDataFrame command, but by the end of the
>> analysis the comparison of the treatments to the controls gets mixed
>> up compared to what you get using the original read.phenoData, i.e. it
>> looks like the 3 groups get labelled wrongly and so the comparisons
>> are different (but they can still be matched up).
>> My questions are,
>> 1) do you need to set up your target file differently when using 
>> read.AnnotatedDataFrame - what is the standard format?
> 
> I can't quite tell where things are going wrong for you, so it would
> help if you can narrow down where the problem occurs.  I think
> read.AnnotatedDataFrame should be comparable to read.phenoData. Does
> 
>> pData(pd)
> 
> look right? What about
> 
>> pData(Data)
> 
> and
> 
>> pData(eset.rma)
> 
> ? It's not important but pData(pd)$Target is the same as pd$Target.
> Since the analysis is on eset.rma, it probably makes sense to use the
> pData from there to construct your design matrix
> 
>> targs<-factor(eset.rma$Target)
>> design<-model.matrix(~0+targs)
>> colnames(design)<-levels(targs)
> 
> Does design look right?
> 
>> I have three columns sample, filename and target.
>> 2) do you need to use a different model matrix to what I have?  
>> 3) do you use a different command for making the contrasts?
> 
> Depends on the question! If you're performing the same analysis as last
> year, then the model matrix and contrasts have to be the same!
> 
>> I have included my code below if that is of any assistance.
>> Many Thanks!
>> Alice
>>  
>>  
>>  
>> ##Read data
>> pd<-read.AnnotatedDataFrame("targets.txt",header=T,row.name="sample")
>> Data<-ReadAffy(filenames=pData(pd)$FileName,phenoData=pd)
>> ##normalisation
>> eset.rma<-rma(Data)
>> ##analysis
>> targs<-factor(pData(pd)$Target)
>> design<-model.matrix(~0+targs)
>> colnames(design)<-levels(targs)
>> fit<-lmFit(eset.rma,design)
>> cont.wt<-makeContrasts("treatment1-control","treatment2-control",
>>                        levels=design)
>> fit2<-contrasts.fit(fit,cont.wt)
>> fit2.eb<-eBayes(fit2)
>> testconts<-classifyTestsF(fit2.eb,p.value=0.01)
>> topTable(fit2.eb,coef=2,n=300)
>> topTable(fit2.eb,coef=1,n=300)
>>
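
The silent failure described in this thread comes from ordinary R
behaviour: `$` with a misspelled column name returns NULL rather than
raising an error, so ReadAffy never receives the intended filenames.
The sketch below reproduces that in base R only and shows one possible
guard. The data frame `targets` is made-up illustration data standing
in for pData(pd), `get_filenames` is a hypothetical helper (not part of
affy or Biobase), and the factor columns are an assumption suggested by
the as.character() workaround above.

```r
# Made-up stand-in for pData(pd); base R only, no Bioconductor needed.
targets <- data.frame(
  Filename = c("c.cel", "a.cel", "b.cel"),   # deliberately not alphabetical
  Target   = c("control", "treatment1", "treatment2"),
  stringsAsFactors = TRUE   # assumption: columns arrive as factors
)

# A misspelled column name gives NULL, not an error.
is.null(targets$FileName)   # TRUE

# Hypothetical helper: stop with a clear message if the column is missing,
# otherwise return the filenames as a character vector in row order.
get_filenames <- function(pdata, column = "Filename") {
  if (!column %in% names(pdata)) {
    stop("column '", column, "' not found; available columns: ",
         paste(names(pdata), collapse = ", "))
  }
  as.character(pdata[[column]])
}

files <- get_filenames(targets)
```

With a real AnnotatedDataFrame the same check could be applied before
reading the data, for example
ReadAffy(filenames=get_filenames(pData(pd)), phenoData=pd), so that a
typo such as $FileName stops the script instead of silently falling
back to the .cel files in the working directory.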