[BioC] arrayQualityMetrics problem with Agilent 2-colour

Audrey Kauffmann ak.bergonie at gmail.com
Fri Jun 25 17:59:12 CEST 2010


Hi Edwin,

To run the outlier detection on your distance matrix the same way it
is done in the arrayQualityMetrics package (1.5*IQR), you need to
compute the sum of the distances for each sample and then call
boxplot.stats on those sums with the default parameters (coef = 1.5).
The outliers are listed in the "out" component of the list that
boxplot.stats returns. Note that your object d holds correlations, so
the distance matrix to sum over is 1 - d.
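
A minimal sketch of those steps, using base R only. This is not the
arrayQualityMetrics code itself: the simulated matrix x, its column
names and the noise levels are made up for illustration and stand in
for your cbind(RG$G, RG$R) on a log scale.

```r
# Simulated stand-in data (assumption): 8 "arrays" sharing one probe-level
# signal, with array8 made much noisier than the rest.
set.seed(1)
n_probes <- 200
signal <- rnorm(n_probes, mean = 8, sd = 2)
x <- sapply(1:8, function(i) signal + rnorm(n_probes, sd = 0.3))
colnames(x) <- paste0("array", 1:8)
x[, 8] <- signal + rnorm(n_probes, sd = 3)

# Distance matrix: 1 - Pearson correlation between arrays
d <- 1 - cor(x, method = "pearson")

# Sum of the distances for each sample
s <- colSums(d)

# boxplot.stats uses coef = 1.5 by default, i.e. the 1.5 * IQR rule
bs <- boxplot.stats(s)

# Names of the samples whose summed distance is listed in 'out'
outliers <- names(s)[s %in% bs$out]
print(outliers)
```

With your own objects you would skip the simulation and start from
s <- colSums(1 - d).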

However, I am not sure it would be as effective with Pearson
correlation as it is with the L1 distance that we use in
arrayQualityMetrics. It is worth comparing the two, I guess.
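
A rough sketch of such a comparison, again on simulated data. It
assumes the L1 distance is taken as the mean absolute difference
between two arrays, which may differ in detail from what the package
computes. The helper flag_outliers, the matrix x and the constant
shift applied to sample8 are hypothetical names and values chosen for
illustration.

```r
# Simulated stand-in data (assumption): 8 samples sharing one log2 signal;
# sample8 gets a constant intensity shift.
set.seed(42)
n_probes <- 500
signal <- rnorm(n_probes, mean = 8, sd = 2)
x <- sapply(1:8, function(i) signal + rnorm(n_probes, sd = 0.3))
colnames(x) <- paste0("sample", 1:8)
x[, 8] <- x[, 8] + 2

# Hypothetical helper: sum the distances per sample, apply the 1.5 * IQR rule
flag_outliers <- function(dmat) {
  s <- colSums(dmat)
  names(s)[s %in% boxplot.stats(s)$out]
}

# Pearson-based distance between samples
d_pearson <- 1 - cor(x, method = "pearson")

# L1-type distance: mean absolute difference between samples
d_l1 <- as.matrix(dist(t(x), method = "manhattan")) / nrow(x)

out_pearson <- flag_outliers(d_pearson)
out_l1 <- flag_outliers(d_l1)
print(list(pearson = out_pearson, l1 = out_l1))
```

Pearson correlation is unchanged by adding a constant to a sample, so
a pure shift like this one is only visible to the L1-type distance.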

I hope that helps,
Audrey

2010/6/24 Edwin Groot <edwin.groot at biologie.uni-freiburg.de>:
> Thanks for the reply Audrey
>
> On Wed, 23 Jun 2010 16:51:08 +0200
>  Audrey Kauffmann <ak.bergonie at gmail.com> wrote:
>> Hi Edwin,
>>
>> arrayQualityMetrics is not really appropriate for that; maybe you can
>> consider using the clustering and heatmap functions outside of the
>> package.
>>
>
> This might be the best solution for the time being. I also tried the
> arrayQuality package, but it keeps asking me for my mouse annotations.
> Unfortunately, I work with neither mice nor humans, but with
> Arabidopsis (those green thingies that photosynthesize).
> Some searching on the Internet (sorry, I lost the source) gave the tip
> of using Pearson correlation for distance measures of arrays. I came up
> with the following code snippet, but I do not know if this is an
> appropriate distance measure for RAW intensity values. The Pearson
> correlation ranges from -1 to 1, with 1 being perfect correlation
> between a pair of arrays.
> What additional code is needed to determine outliers by setting a 1.5 *
> IQR threshold? I have to do something with object d, correct?
> #Generate correlation matrix for green and red channels
>  allGR <- cbind(RG$G,RG$R)
>  d <- cor(allGR, method="pearson")
> #Plot dendrogram, using 1 - correlation directly as the distance
>  plot(hclust(as.dist(1-d)))
> #Plot heatmap of all samples against all
>  heatmap(d, Rowv=NA, Colv=NA, symm=TRUE)
>
>
>> Also, aqm.prepdata is not implemented for an RGList; you would first
>> need to convert your RGList into an NChannelSet.
>>
>
> I have tried this conversion, but I received an error from aqm.heatmap.
>  RGNC <- as(RG, "NChannelSet")
>  RGprep = aqm.prepdata(expressionset=RGNC, do.logtransform = TRUE)
>  hm <- aqm.heatmap(dataprep=RGNC)
> Error in if (n < 2) stop("must have n >= 2 objects to cluster") :
> argument is of length zero
>
> That is too bad. The RGprep object has slots for number of channels and
> for the sample names.
>
>
>> A trick, not recommended but which would work, is to convert your
>> RGList into an ExpressionSet (usually for one channel) in which the R
>> and G channels are each treated as separate chips, then use
>> aqm.prepdata and aqm.heatmap.
>>
>
> Another argument that I have read against this option is that the raw
> intensity I read into my RGList is not expression, and is therefore not
> appropriate for an ExpressionSet. I might open a semantic can of worms,
> but I think the 2-colour specialists consider the data as expression
> only after background subtraction, normalization and output of the
> log-ratios.
>
> Edwin
> --
>> Please let me know if you have more questions regarding this,
>> Audrey
>>
>> 2010/6/23 Edwin Groot <edwin.groot at biologie.uni-freiburg.de>:
>> > Hello all,
>> > Would anyone know of a simple way to plot the heatmap in
>> > arrayQualityMetrics on a per-channel basis, rather than the per-array
>> > basis?
>> >
>> > I have 16 2-colour Agilent arrays read in by the LIMMA package, but
>> > arrayQualityMetrics shows only 16 arrays in the heatmap, rather than
>> > the 32 red and green samples that I would expect. When one array is an
>> > outlier in the heatmap, I do not know which sample (red or green) is at
>> > fault.
>> >
>> > The code that I used is quite simple:
>> >> library(limma)
>> >> library(arrayQualityMetrics)
>> >> targets <- readTargets("qc.exp")
>> >> RG <- read.maimages(targets$FileName, source="agilent")
>> >> arrayQualityMetrics(expressionset=RG, outdir="default", force = TRUE, do.logtransform = TRUE)
>> > The directory 'default' has been created.
>> > KernSmooth 2.23 loaded
>> > Copyright M. P. Wand 1997-2009
>> > (loaded the KernSmooth namespace)
>> > [[1]]
>> > [[2]]
>> >
>> > I also attempted a custom call to aqm.* functions. However, the
>> > preparatory step fails:
>> >> RGprep = aqm.prepdata(expressionset=RG, do.logtransform = TRUE)
>> > Error in function (classes, fdef, mtable)  :
>> >  unable to find an inherited method for function "aqm.prepdata", for signature "RGList"
>> >
>> >> sessionInfo()
>> > R version 2.11.1 (2010-05-31)
>> > i486-pc-linux-gnu
>> >
>> > locale:
>> >  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>> >  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>> >  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>> >  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>> >  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> >
>> > attached base packages:
>> > [1] stats     graphics  grDevices utils     datasets  methods   base
>> >
>> >
>> > other attached packages:
>> > [1] arrayQualityMetrics_2.6.0 affyPLM_1.24.0
>> > [3] preprocessCore_1.10.0     gcrma_2.20.0
>> > [5] affy_1.26.1               Biobase_2.8.0
>> > [7] limma_3.4.2
>> >
>> > loaded via a namespace (and not attached):
>> >  [1] affyio_1.16.0        annotate_1.26.0      AnnotationDbi_1.10.1
>> >  [4] beadarray_1.16.0     Biostrings_2.16.3    DBI_0.2-5
>> >  [7] genefilter_1.30.0    grid_2.11.1          hwriter_1.2
>> > [10] IRanges_1.6.4        lattice_0.18-8       latticeExtra_0.6-11
>> > [13] marray_1.26.0        RColorBrewer_1.0-2   RSQLite_0.9-1
>> > [16] simpleaffy_2.24.0    splines_2.11.1       stats4_2.11.1
>> > [19] survival_2.35-8      tools_2.11.1         vsn_3.16.0
>> > [22] xtable_1.5-6
>> >
>> > Thanks in advance,
>> > Edwin
>> > --
>> > Dr. Edwin Groot, postdoctoral associate
>> > AG Laux
>> > Institut fuer Biologie III
>> > Schaenzlestr. 1
>> > 79104 Freiburg, Deutschland
>> > +49 761-2032945
>> >
>> > _______________________________________________
>> > Bioconductor mailing list
>> > Bioconductor at stat.math.ethz.ch
>> > https://stat.ethz.ch/mailman/listinfo/bioconductor
>> > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >
>>
>>
>>
>> --
>> Audrey Kauffmann
>> Bergonie Cancer Institute
>> 229 Cours de l'Argonne
>> 33076 Bordeaux
>> France
>> +33.5.56.33.04.53
>
>



-- 
Audrey Kauffmann
Bergonie Cancer Institute
229 Cours de l'Argonne
33076 Bordeaux
France
+33.5.56.33.04.53


