[Bioc-devel] [BioC] read.phenoData vs read.AnnotatedDataFrame

Laurent Gautier lgautier at gmail.com
Thu Aug 9 08:24:58 CEST 2007


Good that you are bringing this here to be solved.

I also remain in favour of educating users to use commands like "list.files"
rather than having ReadAffy do looks-like-convenient-ouch-that-was-my-foot
things.
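
As a rough illustration of the "list.files" approach, here is a minimal
sketch in base R only. The temporary directory, the file names and the
"targets" table are all hypothetical stand-ins; nothing here calls ReadAffy.

```r
# Hypothetical setup: a temporary directory standing in for the working
# directory, holding three empty files named like CEL files.
cel_dir <- tempfile("cel_demo")
dir.create(cel_dir)
invisible(file.create(file.path(cel_dir, c("a.cel", "b.cel", "c.cel"))))

# Hypothetical target table; note that its order is not alphabetical.
targets <- data.frame(
  Filename = c("c.cel", "a.cel", "b.cel"),
  Target   = c("control", "treatment1", "treatment2"),
  stringsAsFactors = FALSE
)

# List the files explicitly instead of relying on a default.
on_disk <- list.files(cel_dir, pattern = "\\.cel$", ignore.case = TRUE)

# Every file named in the target table must exist on disk.
missing_files <- setdiff(targets$Filename, on_disk)
stopifnot(length(missing_files) == 0)

# Build the paths in the order of the target table, not directory order.
files_to_read <- file.path(cel_dir, targets$Filename)
```

The point of the sketch is that the user decides which files are read and
in which order, so the labels in the target table cannot drift away from
the files.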

IMHO, an empty "filenames" should just not be accepted by the function.
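
To make that concrete, a minimal sketch of such a check in base R.
"check_filenames" is a hypothetical helper, not part of affy, and the
"targets" table is made up; the sketch only shows how a mistyped column
name yields NULL and how a guard could reject it.

```r
# Hypothetical helper (not an affy function): refuse empty or malformed
# file name vectors instead of silently falling back to a default.
check_filenames <- function(filenames) {
  if (is.null(filenames) || length(filenames) == 0) {
    stop("'filenames' is empty; check the column name in the target table")
  }
  # Columns read from a text file may arrive as factors.
  if (is.factor(filenames)) {
    filenames <- as.character(filenames)
  }
  if (!is.character(filenames) || anyNA(filenames)) {
    stop("'filenames' must be a character vector without missing values")
  }
  filenames
}

# Hypothetical target table with a lowercase n in "Filename".
targets <- data.frame(Filename = c("c.cel", "a.cel"),
                      stringsAsFactors = FALSE)

# The mistyped column name (capital N) gives NULL without any message.
typo_value <- targets$FileName

# With the guard, the typo becomes an error that can be caught.
result <- tryCatch(check_filenames(typo_value),
                   error = function(e) "rejected")

# The correctly spelled column passes through unchanged.
ok <- check_filenames(targets$Filename)
```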


L.

2007/8/9, Wolfgang Huber <huber at ebi.ac.uk>:
> Hi all,
>
> I am taking this discussion from the main list to bioc-devel. I think
> Alice has made a good point in describing a confusing behaviour of
> ReadAffy (trying to be smart) that we might want to fix.
>
> Best wishes
>    Wolfgang
>
> ------------------------------------------------------------------
> Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber
>
>
> Johnstone, Alice wrote:
> > For interest's sake, I have found out why I wasn't getting my expected
> > results when using read.AnnotatedDataFrame.
> > It turns out the error was in the ReadAffy command, where I specified
> > the filenames to be read from my AnnotatedDataFrame object.  There was a
> > typo, with a capital N ($FileName) rather than a lowercase n
> > ($Filename) as in my target file... whoops.  However, this meant the
> > filenames argument was ignored without any error message(!), and instead
> > of using the information in the AnnotatedDataFrame object (which
> > included the filenames, but not in alphabetical order) it read the .cel
> > files in alphabetical order from the working directory. Hence the files
> > were given the wrong labels (taken from the order of the
> > AnnotatedDataFrame object) and my comparisons were confused, without it
> > being obvious why or where.
> > Our solution: pass the filenames through as.character so that the
> > assignment of file to target is correct (after correcting $Filename),
> > now that we are using read.AnnotatedDataFrame rather than
> > read.phenoData.
> >
> > Data<-ReadAffy(filenames=as.character(pData(pd)$Filename),phenoData=pd)
> >
> > Hurrah!
> >
> > It may be beneficial to others if, when the filenames argument isn't
> > specified, the filenames were read from the phenoData object, provided
> > it includes them.
> >
> > Thanks!
> >
> > -----Original Message-----
> > From: Martin Morgan [mailto:mtmorgan at fhcrc.org]
> > Sent: Thursday, 26 July 2007 11:49 a.m.
> > To: Johnstone, Alice
> > Cc: bioconductor at stat.math.ethz.ch
> > Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
> >
> > Hi Alice --
> >
> > "Johnstone, Alice" <Alice.Johnstone at esr.cri.nz> writes:
> >
> >> Using R 2.5.0 and Bioconductor I have been following code to analyse
> >> Affymetrix expression data: 2 treatments vs control.  The original
> >> code was run last year and used the read.phenoData command; however,
> >> with the newer version I get the following warnings:
> >> Warning messages:
> >> read.phenoData is deprecated, use read.AnnotatedDataFrame instead
> >> The phenoData class is deprecated, use AnnotatedDataFrame (with
> >> ExpressionSet) instead
> >>
> >> I use the read.AnnotatedDataFrame command, but when it comes to the
> >> end of the analysis the comparison of the treatments to the controls
> >> gets mixed up compared to what you get using the original
> >> read.phenoData, i.e. it looks like the 3 groups get labelled wrongly
> >> and so the comparisons are different (but they can still be matched up).
> >> My questions are,
> >> 1) do you need to set up your target file differently when using
> >> read.AnnotatedDataFrame - what is the standard format?
> >
> > I can't quite tell where things are going wrong for you, so it would
> > help if you can narrow down where the problem occurs.  I think
> > read.AnnotatedDataFrame should be comparable to read.phenoData. Does
> >
> >> pData(pd)
> >
> > look right? What about
> >
> >> pData(Data)
> >
> > and
> >
> >> pData(eset.rma)
> >
> > ? It's not important but pData(pd)$Target is the same as pd$Target.
> > Since the analysis is on eset.rma, it probably makes sense to use the
> > pData from there to construct your design matrix
> >
> >> targs<-factor(eset.rma$Target)
> >> design<-model.matrix(~0+targs)
> >> colnames(design)<-levels(targs)
> >
> > Does design look right?
> >
> >> I have three columns sample, filename and target.
> >> 2) do you need to use a different model matrix to what I have?
> >> 3) do you use a different command for making the contrasts?
> >
> > Depends on the question! If you're performing the same analysis as last
> > year, then the model matrix and contrasts have to be the same!
> >
> >> I have included my code below if that is of any assistance.
> >> Many Thanks!
> >> Alice
> >>
> >>
> >>
> >> ##Read data
> >> pd<-read.AnnotatedDataFrame("targets.txt",header=T,row.name="sample")
> >> Data<-ReadAffy(filenames=pData(pd)$FileName,phenoData=pd)
> >> ##normalisation
> >> eset.rma<-rma(Data)
> >> ##analysis
> >> targs<-factor(pData(pd)$Target)
> >> design<-model.matrix(~0+targs)
> >> colnames(design)<-levels(targs)
> >> fit<-lmFit(eset.rma,design)
> >> cont.wt<-makeContrasts("treatment1-control","treatment2-control",levels=design)
> >> fit2<-contrasts.fit(fit,cont.wt)
> >> fit2.eb<-eBayes(fit2)
> >> testconts<-classifyTestsF(fit2.eb,p.value=0.01)
> >> topTable(fit2.eb,coef=2,n=300)
> >> topTable(fit2.eb,coef=1,n=300)
> >>
>
> _______________________________________________
> Bioc-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>




