[BioC] File format for single channel analysis of Agilent microarray data with Limma?
Gordon K Smyth
smyth at wehi.EDU.AU
Mon May 28 11:51:29 CEST 2012
Dear Parisa,
The problem is not to do with the file format. The problem is almost
certainly that you are trying to read data files from different GEO series
that contain different numbers of rows (probes), and read.maimages() does
not allow you to do that.
As the help page for read.maimages says, "All image analysis files being
read are assumed to contain data for the same genelist in the same order."
Does it make sense to combine the different GEO series? Did they all use
exactly the same Agilent array? If it does make sense, but the data files
contain data for different sets of probe, then it is up to you read the
series into R separately, then to make decisions about which probes can be
matched up across series and which cannot.
Best wishes
Gordon
-------------- original message -----------------
[BioC] File format for single channel analysis of Agilent microarray data
with Limma?
Parisa Razaz Parisa.Razaz at icr.ac.uk
Sun May 27 20:09:13 CEST 2012
Hi Guido,
Thank you for getting back to me. I am also using data downloaded from GEO
and have now incorporated your suggestion of "agilent.median" when using
the read.maimages function. However the problem now appears to be with
loading files from different series (when using the read.maimages
function). Particular combinations of series work and others don't, with
those that don't giving the same error message as before. I thought that
this may be a size limit issue, but the combined number of samples for
some of the series that don't work together is smaller at times than those
that do. Do you have any idea why this might be and how I would get around
it?
Thanks,
Parisa
________________________________________
From: Hooiveld, Guido [Guido.Hooiveld at wur.nl]
Sent: 23 May 2012 16:52
To: bioconductor at r-project.org; Parisa Razaz
Subject: RE: [BioC] File format for single channel analysis of Agilent
microarray data with Limma?
Hi Parisa,
I also once struggled with reading in some Agilent singe channel arrays
(that I downloaded from GEO; GSE27784), but for me these line of codes
worked (in particularly note that the 2nd line is different than the one
that is given on the website you linked to; specifically the statement
source="agilent.median"):
HTH,
Guido
>
> targets <- readTargets("targets_GSE27784.txt", row.names="Name")
> e.raw <- read.maimages(targets$FileName, source="agilent.median",
green.only=TRUE)
Read GSM686624_251486829200_S01_GE1_105_Jan09_1_1.txt
Read GSM686625_251486829201_S01_GE1_105_Jan09_1_2.txt
Read GSM686626_251486829328_S01_GE1_105_Jan09_1_3.txt
Read GSM686627_251486829200_S01_GE1_105_Jan09_1_2.txt
Read GSM686628_251486829200_S01_GE1_105_Jan09_1_4.txt
Read GSM686629_251486829201_S01_GE1_105_Jan09_1_4.txt
Read GSM686630_251486829328_S01_GE1_105_Jan09_1_4.txt
Read GSM686631_251486829328_S01_GE1_105_Jan09_1_1.txt
Read GSM686632_251486829328_S01_GE1_105_Jan09_1_2.txt
Read GSM686633_251486829200_S01_GE1_105_Jan09_1_3.txt
Read GSM686634_251486829201_S01_GE1_105_Jan09_1_3.txt
Read GSM686635_251486829201_S01_GE1_105_Jan09_1_1.txt
>
> #Background correction using normexp + offset
> e.raw2 <- backgroundCorrect(e.raw, method="normexp", offset=50)
Array 1 corrected
Array 2 corrected
Array 3 corrected
Array 4 corrected
Array 5 corrected
Array 6 corrected
Array 7 corrected
Array 8 corrected
Array 9 corrected
Array 10 corrected
Array 11 corrected
Array 12 corrected
>
> # Perform quantile normalization
> expr.data <- normalizeBetweenArrays(e.raw2, method="quantile")
>
> #Use the avereps function to average replicate spots.
> E.avg <- avereps(expr.data, ID=expr.data$genes$ProbeName)
>
>
> # Alternatively, perform background correction using the negative
control probes + quantile normalization
> table(e.raw$genes$ControlType)
-1 0 1
153 43379 1486
> bg.corr <- neqc(e.raw, status=e.raw$genes$ControlType, negctrl=-1,
regular=0)
>
> E.avg <- avereps(bg.corr, ID=bg.corr$genes$ProbeName)
>
---------------------------------------------------------
Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
Wageningen University
Biotechnion, Bomenweg 2
NL-6703 HD Wageningen
the Netherlands
tel: (+)31 317 485788
fax: (+)31 317 483342
email: guido.hooiveld at wur.nl
internet: http://nutrigene.4t.com
http://scholar.google.com/citations?user=qFHaMnoAAAAJ
http://www.researcherid.com/rid/F-4912-2010
-----Original Message-----
From: bioconductor-bounces at r-project.org [mailto:bioconductor-bounces
at r-project.org] On Behalf Of Parisa [guest]
Sent: Wednesday, May 23, 2012 15:51
To: bioconductor at r-project.org; parisa.razaz at icr.ac.uk
Subject: [BioC] File format for single channel analysis of Agilent
microarray data with Limma?
Hi,
I am following the protocol outlined here for analysis of single channel
Agilent microarray data:
http://matticklab.com/index.php?title=Single_channel_analysis_of_Agilent_microarray_data_with_Limma
I keep getting the following error message when using Limma's
read.maimages function to load my data into an RGList object:
Error in RG[[a]][, i] <- obj[, columns[[a]]] :
number of items to replace is not a multiple of replacement length
I think this may be due to my Agilent raw data txt files being in the
wrong format. I am having difficulty finding an example Agilent feature
extraction raw data txt file online to compare it to. A link to a screen
shot of one of the files I am using is below. I would appreciate if
someone could let me know if it is in the correct format, and if not then
what format it should be in to prevent the above error message from coming
up.
Thank you,
Parisa
http://www4.picturepush.com/photo/a/8322602/img/8322602.png
-- output of sessionInfo():
> sessionInfo()R version 2.13.1 (2011-07-08)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] limma_3.8.3
______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}
More information about the Bioconductor
mailing list