[BioC] Reading Agilent files using read.maimages
Gaj Stan (BIGCAT)
Stan.Gaj at BIGCAT.unimaas.nl
Wed Aug 15 15:49:44 CEST 2007
Hi Steve,
It seems that most of your columns are also present in a standard
Agilent output file, apart from the "H_C1 A_mock A and B/" prefix.
You appear to have all the columns needed for the analysis. (I'll leave
out the prefix below to keep things readable; it may be a good idea to
remove it from your files as well.)
- r/gMeanSignal (Mean Signal intensity for red/green channel)
- r/gBGMeanSignal (Mean Background Signal Intensity for red/green
channel)
- r/gMedianSignal (Median Signal r/g)
- r/gBGMedianSignal (Median Background Signal r/g)
- r/gBGUsed (Background estimated by a specific algorithm called
spatial detrending. This value is usually lower than the
r/gBGMe(di)anSignal values and, in my humble opinion, is the one to use
for background subtraction, if you intend to do that.)
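Removing the "H_C1 A_mock A and B/" prefix from a file could be done
roughly as follows. This is only a sketch in base R: the function name
strip_sample_prefix is made up for illustration, and it assumes that the
column header is the first line of the file. The small input file is
invented so that the example runs on its own.

```r
# Hypothetical helper: remove a fixed sample prefix from the header line.
# Assumes the column header is the FIRST line of the file.
strip_sample_prefix <- function(infile, outfile, prefix) {
  lines <- readLines(infile)
  # fixed = TRUE treats the prefix as literal text, not as a regex
  lines[1] <- gsub(prefix, "", lines[1], fixed = TRUE)
  writeLines(lines, outfile)
  invisible(outfile)
}

# Tiny made-up file, just to show the effect
infile <- tempfile(fileext = ".txt")
writeLines(c(
  "Reporter name\tH_C1 A_mock A and B/gMeanSignal\tH_C1 A_mock A and B/rMeanSignal",
  "probe_1\t120\t340"
), infile)

outfile <- tempfile(fileext = ".txt")
strip_sample_prefix(infile, outfile, "H_C1 A_mock A and B/")

# check.names = FALSE keeps the column names exactly as written
cleaned <- read.delim(outfile, check.names = FALSE)
colnames(cleaned)
```

The original file is left untouched; only the copy written to outfile
has the shortened column names.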
So what I usually do is the following:

  library(limma)

  # Read the targets file (experimental description); see the limma
  # user guide for more information on this
  targets <- readTargets("description.txt", sep="\t", quote="\"")

  # Read the image files; datapath is the directory containing them
  Agilent.RG <- read.maimages(targets$FileName, source="agilent",
      path=datapath, names=targets$Description,
      columns=list(R="rMeanSignal", G="gMeanSignal",
                   Rb="rBGUsed", Gb="gBGUsed",
                   Rb.real="rBGMeanSignal", Gb.real="gBGMeanSignal"),
      annotation=c("FeatureNum", "Row", "Col", "ProbeName",
                   "ControlType", "GeneName", "Description",
                   "SystematicName"))

This way I can compare both background values (estimated: Agilent.RG$Rb,
Agilent.RG$Gb; actually measured: Agilent.RG$Rb.real,
Agilent.RG$Gb.real) during my quality control checks.
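A background comparison of this kind might look like the sketch below.
The RG object is only a stand-in built from made-up numbers (a plain
list with Rb and Rb.real matrices, one column per array), so that the
example runs without any data files; with real data you would use
Agilent.RG instead. The array names are invented too.

```r
set.seed(1)
n_spots <- 100
arrays  <- c("array1", "array2")   # invented array names

# Stand-in for Agilent.RG: made-up red-channel background values
RG <- list(
  Rb      = matrix(rnorm(n_spots * 2, mean = 40, sd = 3), n_spots, 2,
                   dimnames = list(NULL, arrays)),
  Rb.real = matrix(rnorm(n_spots * 2, mean = 50, sd = 5), n_spots, 2,
                   dimnames = list(NULL, arrays))
)

# Per-array median of (measured background - estimated background)
median_diff <- apply(RG$Rb.real - RG$Rb, 2, median)
median_diff

# Side-by-side boxplots of both background estimates, all arrays pooled
boxplot(cbind(estimated = as.vector(RG$Rb),
              measured  = as.vector(RG$Rb.real)),
        ylab = "red background intensity")
```

The same lines work for the green channel with Gb and Gb.real.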
As Sean mentioned, the normalization behind the r/gProcessedSignal
columns depends on the scanner type and the software used for image
conversion. If the original Feature Extraction software (and an Agilent
scanner) was used, I would guess a LOESS algorithm (for within-array
normalisation) followed by scaling to a reference value.
If I recall correctly (I keep forgetting the minor details), the
processed signal (e.g. for red) is calculated as

  ( rMeanSignal - rBGUsed ) --> LOESS normalization --> scaling
  --> rProcessedSignal

I think each version of the Feature Extraction software comes with a
built-in manual where these procedures are explained more clearly. I
would suggest reading that.
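To make the order of the steps concrete, here is a rough sketch in base
R. It is not Agilent's algorithm and not limma's normalizeWithinArrays
either: it uses stats::loess as a stand-in for the LOESS step, works on
simulated intensities (all numbers are made up), and leaves out the
final scaling step because I do not remember its details.

```r
set.seed(42)
n <- 500

# Simulated raw signals with a constant dye bias in red (made-up numbers)
true_signal <- 2^runif(n, 6, 14)
rBGUsed     <- rep(50, n)
gBGUsed     <- rep(45, n)
rMeanSignal <- true_signal * 1.3 * 2^rnorm(n, 0, 0.2) + rBGUsed
gMeanSignal <- true_signal * 2^rnorm(n, 0, 0.2) + gBGUsed

# Step 1: background subtraction
r <- rMeanSignal - rBGUsed
g <- gMeanSignal - gBGUsed

# Step 2: intensity-dependent (LOESS) correction of the log ratio
M <- log2(r) - log2(g)          # log2 ratio red/green
A <- (log2(r) + log2(g)) / 2    # average log2 intensity
fit    <- loess(M ~ A, span = 0.3, degree = 1)
M.norm <- M - fitted(fit)       # remove the fitted trend

# Step 3 (scaling to a reference value) is deliberately omitted here

c(before = mean(M), after = mean(M.norm))
```

After the correction the log ratios should be centred close to zero
across the whole intensity range.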
The ratio of rProcessedSignal to gProcessedSignal is then calculated
and transformed to a log10 scale! (Note: Bioconductor usually uses a
log2 scale, so don't compare the Agilent LogRatio directly with the
log ratios (M-values) you get from, for instance, the limma package in
R.)
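Converting between the two scales is a one-liner, since
log2(x) = log10(x) / log10(2). A small sketch (the function name
log10_to_log2 and the example values are just for illustration):

```r
# Convert a log10 ratio (as in Agilent's LogRatio column) to a log2 ratio
log10_to_log2 <- function(log10_ratio) {
  log10_ratio / log10(2)
}

# Example: a 2-fold and an 8-fold change in red over green
agilent_logratio <- log10(c(2, 8))
log2_ratio <- log10_to_log2(agilent_logratio)
log2_ratio
```

So a LogRatio of about 0.30 in the Agilent file corresponds to a log2
ratio of 1, i.e. a 2-fold change.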
I hope this clarifies things a bit.
-- Stan
-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Steve
Taylor
Sent: 15 August 2007 10:03
To: Sean Davis
Cc: Bioconductor
Subject: Re: [BioC] Reading Agilent files using read.maimages
Hi Sean,
>>A typical header for one of these raw files is
>>
>>CompositeSequence Identifier Database ebi.ac.uk:Database:embl
>>Database ebi.ac.uk:Database:ensembl Database ebi.ac.uk:Database:locus
>>Database ebi.ac.uk:Database:refseq Database ebi.ac.uk:Database:tigr_thc
>>Database www.chem.agilent.com:Database:agc
>>Database www.chem.agilent.com:Database:agp
>>Feature coordinates: metaColumn metaRow column row
>>Reporter control type Reporter group Reporter identifier Reporter name
>>Reporter sequence type
>>H_C1 A_mock A and B/FEATURES H_C1 A_mock A and B/FeatureNum
>>H_C1 A_mock A and B/gbpri H_C1 A_mock A and B/gp
>>H_C1 A_mock A and B/sp H_C1 A_mock A and B/ProbeUID
>>H_C1 A_mock A and B/ControlType H_C1 A_mock A and B/ProbeName
>>H_C1 A_mock A and B/GeneName H_C1 A_mock A and B/SystematicName
>>H_C1 A_mock A and B/Description H_C1 A_mock A and B/LogRatio
>>H_C1 A_mock A and B/LogRatioError H_C1 A_mock A and B/PValueLogRatio
>>H_C1 A_mock A and B/gSurrogateUsed H_C1 A_mock A and B/rSurrogateUsed
>>H_C1 A_mock A and B/gIsFound H_C1 A_mock A and B/rIsFound
>>H_C1 A_mock A and B/gProcessedSignal H_C1 A_mock A and B/rProcessedSignal
>>H_C1 A_mock A and B/gProcessedSigError H_C1 A_mock A and B/rProcessedSigError
>>H_C1 A_mock A and B/gNumPixOLHi H_C1 A_mock A and B/rNumPixOLHi
>>H_C1 A_mock A and B/gNumPixOLLo H_C1 A_mock A and B/rNumPixOLLo
>>H_C1 A_mock A and B/gNumPix H_C1 A_mock A and B/rNumPix
>>H_C1 A_mock A and B/gMeanSignal H_C1 A_mock A and B/rMeanSignal
>>H_C1 A_mock A and B/gMedianSignal H_C1 A_mock A and B/rMedianSignal
>>H_C1 A_mock A and B/gPixSDev H_C1 A_mock A and B/rPixSDev
>>H_C1 A_mock A and B/gBGNumPix H_C1 A_mock A and B/rBGNumPix
>>H_C1 A_mock A and B/gBGMeanSignal H_C1 A_mock A and B/rBGMeanSignal
>>H_C1 A_mock A and B/gBGMedianSignal H_C1 A_mock A and B/rBGMedianSignal
>>H_C1 A_mock A and B/gBGPixSDev H_C1 A_mock A and B/rBGPixSDev
>>H_C1 A_mock A and B/gNumSatPix H_C1 A_mock A and B/rNumSatPix
>>H_C1 A_mock A and B/gIsSaturated H_C1 A_mock A and B/rIsSaturated
>>H_C1 A_mock A and B/PixCorrelation H_C1 A_mock A and B/BGPixCorrelation
>>H_C1 A_mock A and B/gIsFeatNonUnifOL H_C1 A_mock A and B/rIsFeatNonUnifOL
>>H_C1 A_mock A and B/gIsBGNonUnifOL H_C1 A_mock A and B/rIsBGNonUnifOL
>>H_C1 A_mock A and B/gIsFeatPopnOL H_C1 A_mock A and B/rIsFeatPopnOL
>>H_C1 A_mock A and B/gIsBGPopnOL H_C1 A_mock A and B/rIsBGPopnOL
>>H_C1 A_mock A and B/IsManualFlag
>>H_C1 A_mock A and B/gBGSubSignal H_C1 A_mock A and B/rBGSubSignal
>>H_C1 A_mock A and B/gBGSubSigError H_C1 A_mock A and B/rBGSubSigError
>>H_C1 A_mock A and B/BGSubSigCorrelation
>>H_C1 A_mock A and B/gIsPosAndSignif H_C1 A_mock A and B/rIsPosAndSignif
>>H_C1 A_mock A and B/gPValFeatEqBG H_C1 A_mock A and B/rPValFeatEqBG
>>H_C1 A_mock A and B/gNumBGUsed H_C1 A_mock A and B/rNumBGUsed
>>H_C1 A_mock A and B/gIsWellAboveBG H_C1 A_mock A and B/rIsWellAboveBG
>>H_C1 A_mock A and B/IsUsedBGAdjust
>>H_C1 A_mock A and B/gBGUsed H_C1 A_mock A and B/rBGUsed
>>H_C1 A_mock A and B/gBGSDUsed H_C1 A_mock A and B/rBGSDUsed
>>H_C1 A_mock A and B/IsNormalization
>>H_C1 A_mock A and B/gDyeNormSignal H_C1 A_mock A and B/rDyeNormSignal
>>H_C1 A_mock A and B/gDyeNormError H_C1 A_mock A and B/rDyeNormError
>>H_C1 A_mock A and B/DyeNormCorrelation H_C1 A_mock A and B/ErrorModel
>
>
> This is not an Agilent Raw Data file, I do not think. The column
> names are similar, but ArrayExpress has significantly changed the
> file from its original format. That said, the columns with "LogRatio",
> "rProcessedSignal" and "gProcessedSignal" are the columns of interest
> that have already been background corrected and, typically, a
> normalization method applied (not sure which one without some more
> description of the scanner settings).
>
>
>Ok. Thanks. That's useful information. In the protocols section of AE
>it says 'Default settings'
>(http://www.ebi.ac.uk/aerep/details?class=MAGE.Experiment_protocols&criteria=Experiment%3D921408317&contextClass=MAGE.Protocol&templateName=Protocol.vm).
>If that means it has been normalised I will have a look at LogRatio,
>rProcessedSignal and gProcessedSignal, though it would be nice to know
>how it had been processed...
>
>
>>Does this look correct? How do I get access to the intensities, for
>>example to do a boxplot?
>
>
> I'm not sure if the files loaded correctly, given my comments above.
> RG$R and RG$G contain the Red and Green intensities, if it loaded
> correctly.
>
That's what I thought. Thanks for the advice,
Steve