[BioC] Beadarray - problem with BSData object created using 'summarize'

Mark Dunning mark.dunning at gmail.com
Tue Jan 11 17:19:47 CET 2011


Hi Kasia,

This problem is caused by some functionality recently added to
beadarray. In short, the summarize function tries to work out which
sections should be combined and renames the columns accordingly. It
seems there is a bug that confuses the names when multiple chips are
present in the bead-level object.
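
As a rough illustration of where names such as "5398636011_A.1" can
come from, the base R sketch below shows what happens when a vector of
names containing repeats is made unique. This is only an assumption
about the general mechanism, not the actual beadarray code, and the
recycled vector is a made-up example.

```r
# Hypothetical recycled names: two labels repeated, as if a naming step
# had reused the same labels for a second set of sections.
recycled <- rep(c("5398636011_A", "5398636033_B"), times = 2)

# make.unique() leaves the first occurrence alone and appends .1, .2, ...
# to later repeats, which matches the pattern of suffixes reported.
deduped <- make.unique(recycled)
print(deduped)
```

Columns carrying the same base name can therefore hold different
arrays, which would explain why similarly named columns are not
identical.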

As a simple workaround for your data, try

> BSData <- summarize(BLDataCombo, list(greenChannel), useSampleFac = FALSE)

With useSampleFac = FALSE, summarize will not attempt the automatic
grouping and naming of samples, and you should get the column names
you expect. The bug will be fixed in the devel and release versions
of beadarray.
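
To confirm that the column names come out as expected, a small check
along the following lines may help. It is a minimal sketch:
namesMatchSections is a hypothetical helper (not part of beadarray),
the toy vectors stand in for real data, and it assumes the expected
column names are exactly the section names in the same order.

```r
# Hypothetical helper: TRUE only when the summary column names equal the
# section names, in the same order, with no duplicated entries.
namesMatchSections <- function(summaryNames, sections) {
  identical(summaryNames, sections) && anyDuplicated(summaryNames) == 0
}

# Toy stand-ins for sectionNames(BLDataCombo) and colnames(exprs(BSData)).
sections <- c("5398636011_A", "5398636011_B", "5398636033_A", "5398636033_B")
confused <- c("5398636011_A", "5398636033_B", "5398636011_A.1", "5398636033_B.1")

okCheck  <- namesMatchSections(sections, sections)  # names as expected
badCheck <- namesMatchSections(confused, sections)  # names mixed up
print(c(ok = okCheck, bad = badCheck))
```

With the real objects, the call would be
namesMatchSections(colnames(exprs(BSData)), sectionNames(BLDataCombo)).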

Best wishes,

Mark

On Fri, Jan 7, 2011 at 7:02 PM, Kasia Stepien <kasia at cmmt.ubc.ca> wrote:
> Hello!
>
> I am using beadarray v 2.0.2 to analyze beadlevel data for Illumina
> RatRef-12 whole genome gene expression arrays (I have 8 chips, with 12
> samples each, for a total of 96 arrays).
>
> When trying to create a bead summary object using "summarize", the
> section names from the BLData object appear to be recycled (e.g.
> "5398636011_A", "5398636011_A.1", rather than "5398636011_A",
> "5398636011_B", etc.). At first I thought the arrays themselves were
> being reused, but the similarly named objects do not appear to be
> identical (see below).
>
> What could the reason be for this? Is there some argument for
> summarize that I can use to get around this problem?
>
> This is what it looks like for 2 chips, with 12 samples each:
>
>> BLData1 = readIllumina(dir="/home/kasia/kasiadata/5398636011filtered/", useImages=FALSE, illuminaAnnotation="Ratv1")
> Processing section 5398636011_A
> Processing section 5398636011_B
> Processing section 5398636011_C
> Processing section 5398636011_D
> Processing section 5398636011_E
> Processing section 5398636011_F
> Processing section 5398636011_G
> Processing section 5398636011_H
> Processing section 5398636011_I
> Processing section 5398636011_J
> Processing section 5398636011_K
> Processing section 5398636011_L
>> BLData2 = readIllumina(dir="/home/kasia/kasiadata/5398636033filtered/", useImages=FALSE, illuminaAnnotation="Ratv1")
> Processing section 5398636033_A
> Processing section 5398636033_B
> Processing section 5398636033_C
> Processing section 5398636033_D
> Processing section 5398636033_E
> Processing section 5398636033_F
> Processing section 5398636033_G
> Processing section 5398636033_H
> Processing section 5398636033_I
> Processing section 5398636033_J
> Processing section 5398636033_K
> Processing section 5398636033_L
>> BLDataCombo = combine(BLData1, BLData2)
>> is(BLDataCombo)
> [1] "beadLevelData"
>> sectionNames(BLDataCombo)
>  [1] "5398636011_A" "5398636011_B" "5398636011_C" "5398636011_D" "5398636011_E"
>  [6] "5398636011_F" "5398636011_G" "5398636011_H" "5398636011_I" "5398636011_J"
> [11] "5398636011_K" "5398636011_L" "5398636033_A" "5398636033_B" "5398636033_C"
> [16] "5398636033_D" "5398636033_E" "5398636033_F" "5398636033_G" "5398636033_H"
> [21] "5398636033_I" "5398636033_J" "5398636033_K" "5398636033_L"
>
>> myMean = function(x) mean(x, na.rm = TRUE)
>> mySd = function(x) sd(x, na.rm = TRUE)
>> greenChannel = new("illuminaChannel", logGreenChannelTransform, illuminaOutlierMethod, myMean, mySd, "G")
>> BSData <- summarize(BLDataCombo, list(greenChannel))
>> str(exprs(BSData))
>  num [1:23350, 1:24] 9.92 7.42 7.43 7.5 12.34 ...
>  - attr(*, "dimnames")=List of 2
>  ..$ : chr [1:23350] "ILMN_2039396" "ILMN_2040732" "ILMN_2039699"
> "ILMN_2038916" ...
>  ..$ : chr [1:24] "5398636011_A" "5398636033_B" "5398636011_C"
> "5398636033_D" ...
>
>> colnames(exprs(BSData))
>  [1] "5398636011_A"   "5398636033_B"   "5398636011_C"   "5398636033_D"
>  [5] "5398636011_E"   "5398636033_F"   "5398636011_G"   "5398636033_H"
>  [9] "5398636011_I"   "5398636033_J"   "5398636011_K"   "5398636033_L"
> [13] "5398636011_A.1" "5398636033_B.1" "5398636011_C.1" "5398636033_D.1"
> [17] "5398636011_E.1" "5398636033_F.1" "5398636011_G.1" "5398636033_H.1"
> [21] "5398636011_I.1" "5398636033_J.1" "5398636011_K.1" "5398636033_L.1"
>
>> identical(exprs(BSData)[,1],exprs(BSData)[,13])
> [1] FALSE
>
>> head(cbind(exprs(BSData)[,1],exprs(BSData)[,13]))
>                  [,1]      [,2]
> ILMN_2039396  9.917607  9.817626
> ILMN_2040732  7.415167  7.436922
> ILMN_2039699  7.432883  7.423043
> ILMN_2038916  7.504111  7.619327
> ILMN_1374916 12.342863 13.377692
> ILMN_1353986  7.210915  7.211393
>
>
>> sessionInfo()
> R version 2.12.0 (2010-10-15)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] illuminaRatv1BeadID.db_1.8.0 org.Rn.eg.db_2.4.6
> [3] RSQLite_0.9-2                DBI_0.2-5
> [5] AnnotationDbi_1.12.0         beadarray_2.0.2
> [7] Biobase_2.8.0
>
> loaded via a namespace (and not attached):
> [1] KernSmooth_2.23-4 limma_3.4.5       tools_2.12.1
>
>
>
> Thank you!
> Kasia
>
> --
> Kasia Stepien, M.Sc. Candidate
> Department of Medical Genetics
> University of British Columbia
>
