[BioC] Order in which ReadAffy() and read.affybatch()

Adaikalavan Ramasamy ramasamy at cancer.org.uk
Fri Mar 18 20:04:55 CET 2005


See comments below.

On Fri, 2005-03-18 at 08:26 -0800, Hrishikesh Deshmukh wrote:
> Hello All,
> 
> I have questions about the order in which ReadAffy()
> and read.affybatch() reads in affy CEL files. I need

Alphabetically, but the behaviour may vary between Windows and Linux due
to case sensitivity.
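
To see why relying on the order is fragile, here is a small sketch in base R. The file names are made up and no CEL files are read; I am only assuming that the reading functions end up with an alphabetically sorted listing, as described above.

```r
# Hypothetical file names; no CEL files are read here.
files <- c("2.CEL", "10.CEL", "1.CEL", "b.CEL", "A.CEL", "a.cel")

# method = "radix" always sorts in the C locale (byte order), so the result
# is the same on every platform: names are compared character by character
# ("10.CEL" before "2.CEL") and upper-case letters come before lower-case.
c_order <- sort(files, method = "radix")
print(c_order)

# The default method uses the collation rules of the current locale, which
# may place upper and lower case differently from the line above.
print(sort(files))
```

Note that "10.CEL" sorts before "2.CEL" either way, so numbered files rarely come back in numeric order.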

> this piece of information because i want to label the
> arrays when i look at hist() and boxplot(). I want to

This is a dangerous practice, as you would be assuming that filenames are
read alphabetically. If you work on multiple operating systems, this might
be a nightmare.

Besides, since the filenames are used as the column names in ReadAffy,
you do not need to care about the order in which it reads the files.

raw <- ReadAffy()
head( exprs( raw ) )

       a.CEL    b.CEL    c.CEL   d.CEL
[1,]    253.8    335.8    176.5   238.3
[2,]  19607.3  19437.5  11239.5 20985.5
[3,]    218.0    275.3    169.5   263.5
[4,]  20284.5  19956.8  11324.8 21180.5
[5,]     87.5     94.8    100.3    78.5
[6,]    224.5    237.8    186.5   165.8

Then you can strsplit() the column names or match() them to something
else.
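
A minimal sketch of both ideas, using a mock matrix in place of exprs( raw ). The values, the annotation table `pheno` and its columns are hypothetical; only the column names follow the output above.

```r
# Mock expression matrix standing in for exprs(raw); the values are made up.
x <- matrix(1:8, nrow = 2,
            dimnames = list(NULL, c("a.CEL", "b.CEL", "c.CEL", "d.CEL")))

# Hypothetical annotation table, deliberately in a different order.
pheno <- data.frame(file  = c("d.CEL", "b.CEL", "a.CEL", "c.CEL"),
                    group = c("treated", "control", "control", "treated"),
                    stringsAsFactors = FALSE)

# Strip the extension: split on the literal "." and keep the first piece.
ids <- sapply(strsplit(colnames(x), ".", fixed = TRUE), `[`, 1)

# Look up each column's row in the annotation table by file name,
# so the labels follow the columns whatever order the files came in.
idx <- match(colnames(x), pheno$file)
labels <- paste(ids, pheno$group[idx], sep = ":")
print(labels)
```

The `labels` vector can then be passed as the names for boxplot() or as the text for legend().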


> make sure that right labels (filenames) are displayed
> for its corresponding lines/boxplots. 
> 
> Is there a book specifically on BioC, this would be a
> big help?
> 
> In general on what basis does one accept/reject arrays
> from a pool of replicates! The hist() and boxplot()
> shows clearly that all the arrays (replicates) do not
> show the same "behaviour".

This is before preprocessing, right? There could be systematic noise
that preprocessing algorithms can handle. I think people usually reject
on the basis of biological evidence such as housekeeping genes, RNA
degradation plots or eye-balling the chip.


> Here are the code fragments:
> library(affy)
> library(hgu95av2cdf)
> library(hgu95av2probe)
> library(matchprobes)
> data(hgu95av2probe)
> summary(hgu95av2probe)
> file.names<-c("1.CEL",  "2.CEL",  "3.CEL",  "4.CEL", 
> "5.CEL","6.CEL","7.CEL",  "8.CEL",  "9.CEL", 
> "10.CEL", 
> "11.CEL","12.CEL","13.CEL","14.CEL","15.CEL","16.CEL","17.CEL")
> M<-read.affybatch(filenames=file.names,
> description=NULL,notes="",compress=F,   
> m.mask=F,rm.outliers=F,rm.extra=F,verbose=T)

Why not just do ReadAffy() ? It will return the filenames as column
names. 

> hist(M)
> legend(12,1.2,sampleNames(M),col=1:17,lty=1:17) 

Interesting. Why do I get a density plot when I call hist() on an
AffyBatch object?
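
If the legend and the lines disagree, one option is to draw the density lines yourself so that both use the same colours and line types. This is only a base R sketch on simulated values; it does not reproduce what hist() does internally for this class, and the column names are made up.

```r
set.seed(1)
# Mock log-intensities for three arrays; the values are simulated.
x <- cbind(a.CEL = rnorm(200, 8),
           b.CEL = rnorm(200, 8.5),
           c.CEL = rnorm(200, 9))

# Define the line styles once and reuse them for both lines and legend.
cols <- seq_len(ncol(x))
ltys <- seq_len(ncol(x))

# One density estimate per column, plus common axis limits.
dens <- lapply(seq_len(ncol(x)), function(j) density(x[, j]))
xlim <- range(sapply(dens, function(d) range(d$x)))
ylim <- c(0, max(sapply(dens, function(d) max(d$y))))

# Empty plot first, then one line per array in column order.
plot(xlim, ylim, type = "n", xlab = "log intensity", ylab = "density")
for (j in seq_along(dens)) lines(dens[[j]], col = cols[j], lty = ltys[j])

# The legend takes its labels from the column names and the same styles.
legend("topright", legend = colnames(x), col = cols, lty = ltys)
```

Because the labels, colours and line types all come from the same column index, they cannot drift apart.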

> When i run the legend line i see hist() displays
> different "lines" and legend does not match correctly!
> 
> Thanks in advance.
> Hrishi
> 