[BioC] Beadarray - problem with BSData object created using 'summarize'

Kasia Stepien kasia at cmmt.ubc.ca
Fri Jan 14 19:33:40 CET 2011


Hi Mark,

Thanks a lot! Nice that it is an easy fix. I also managed to get
around the problem by summarizing the BLData for each chip
independently, then combining them afterwards.
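
A minimal sketch of the "combine afterwards, then check the names" step, in base R only. The two small matrices are made-up stand-ins for the exprs() matrix of each per-chip summary (the probe IDs and section names are borrowed from the output further down, the values are invented), so this only illustrates the row alignment and the column-name check, not beadarray's own combine method.

```r
# Toy stand-ins for exprs() of two per-chip summaries (values are made up).
probes <- c("ILMN_2039396", "ILMN_2040732", "ILMN_2039699")
chip1 <- matrix(1:6, nrow = 3,
                dimnames = list(probes, c("5398636011_A", "5398636011_B")))
# Second chip deliberately lists the probes in a different order.
chip2 <- matrix(7:12, nrow = 3,
                dimnames = list(rev(probes), c("5398636033_A", "5398636033_B")))

# Keep the probes present on both chips and align rows by name, not position.
common <- intersect(rownames(chip1), rownames(chip2))
combined <- cbind(chip1[common, , drop = FALSE],
                  chip2[common, , drop = FALSE])

# The column names should be the original section names: in order, no repeats,
# and no ".1" suffixes.
expected <- c(colnames(chip1), colnames(chip2))
names_ok <- identical(colnames(combined), expected) &&
  anyDuplicated(colnames(combined)) == 0
```

With the real objects, the same check would compare colnames(exprs(BSData)) against sectionNames() of the bead-level data.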

Another bug in the beadarray package that I found when running
summarize, specific to ratv1 arrays, is that summarize looks for the
file 'ratv1BeadLevelMapping.rda', which does not exist in the package's
extdata directory. However, the file 'ratBeadLevelMapping.rda' does, so
we loaded it, renamed the object, and saved it under the ratv1 file name
in the same directory, which works around the problem for now.

> BSData <- summarize(BLDataBsh, list(greenChannel), useSampleFac = FALSE)
No sample factor specified. Summarizing each section separately
Finding list of unique probes in beadLevelData
23401  unique probeIDs found
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
  cannot open compressed file
'/usr/lib64/R/library/beadarray/extdata/ratv1BeadLevelMapping.rda',
probable reason 'No such file or directory'
> load("/usr/lib64/R/library/beadarray/extdata/ratBeadLevelMapping.rda")
> ratv1BeadLevelMapping<-ratBeadLevelMapping
> save(ratv1BeadLevelMapping, file="/usr/lib64/R/library/beadarray/extdata/ratv1BeadLevelMapping.rda")
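
A slightly more defensive version of the same workaround could look like the sketch below. The helper name alias_rda is hypothetical (it is not part of beadarray), and the demonstration runs in a temporary directory with a dummy object rather than in the installed package. For the real files, the directory could be located with system.file("extdata", package = "beadarray") instead of a hard-coded path, and writing there needs write permission on the R library.

```r
# Hypothetical helper: copy the object stored in `src` into `dest`
# under a new object name. Does nothing if `dest` already exists.
alias_rda <- function(src, dest, old_name, new_name) {
  if (!file.exists(src)) stop("source file not found: ", src)
  if (file.exists(dest)) return(invisible(FALSE))
  env <- new.env()
  load(src, envir = env)                      # load into a private environment
  assign(new_name, get(old_name, envir = env), envir = env)
  save(list = new_name, envir = env, file = dest)
  invisible(TRUE)
}

# Demonstration with a dummy mapping object in a temporary directory.
demo_dir <- tempfile("extdata")
dir.create(demo_dir)
ratBeadLevelMapping <- data.frame(id = 1:3)   # dummy content, not the real mapping
src  <- file.path(demo_dir, "ratBeadLevelMapping.rda")
dest <- file.path(demo_dir, "ratv1BeadLevelMapping.rda")
save(ratBeadLevelMapping, file = src)

created <- alias_rda(src, dest, "ratBeadLevelMapping", "ratv1BeadLevelMapping")
```

Loading into a private environment keeps the workspace clean, and the existence checks mean the copy is not silently overwritten once a fixed package version ships the file itself.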

Cheers,
Kasia

On Tue, Jan 11, 2011 at 8:19 AM, Mark Dunning <mark.dunning at gmail.com> wrote:
> Hi Kasia,
>
> This problem was due to some recent functionality added to beadarray.
> In short, the summarize function tries to be clever and work out which
> sections should be combined and renames columns accordingly. It seems
> there was a bug that got the names confused when multiple chips are
> present in the bead-level object.
>
> For a simple fix for your data, if you try
>
>> BSData <- summarize(BLDataCombo, list(greenChannel),useSampleFac = FALSE)
>
> then it will not try this automatic grouping and naming of samples and
> you should get the column names you expect. The bug will be fixed in
> the devel and release versions of beadarray.
>
> Best wishes,
>
> Mark
>
> On Fri, Jan 7, 2011 at 7:02 PM, Kasia Stepien <kasia at cmmt.ubc.ca> wrote:
>> Hello!
>>
>> I am using beadarray v 2.0.2 to analyze beadlevel data for Illumina
>> RatRef-12 whole genome gene expression arrays (I have 8 chips, with 12
>> samples each, for a total of 96 arrays).
>>
>> When trying to create a bead summary object using "summarize", the
>> section names from the BLData object appear to be recycled (e.g.
>> "5398636011_A", "5398636011_A.1", rather than "5398636011_A",
>> "5398636011_B", etc.). At first I thought the arrays themselves were
>> being reused, but the similarly named objects do not appear to be
>> identical (see below).
>>
>> What could the reason be for this? Is there some argument for
>> summarize that I can use to get around this problem?
>>
>> This is what it looks like for 2 chips, with 12 samples each:
>>
>>> BLData1 = readIllumina(dir="/home/kasia/kasiadata/5398636011filtered/", useImages=FALSE, illuminaAnnotation="Ratv1")
>> Processing section 5398636011_A
>> Processing section 5398636011_B
>> Processing section 5398636011_C
>> Processing section 5398636011_D
>> Processing section 5398636011_E
>> Processing section 5398636011_F
>> Processing section 5398636011_G
>> Processing section 5398636011_H
>> Processing section 5398636011_I
>> Processing section 5398636011_J
>> Processing section 5398636011_K
>> Processing section 5398636011_L
>>> BLData2 = readIllumina(dir="/home/kasia/kasiadata/5398636033filtered/", useImages=FALSE, illuminaAnnotation="Ratv1")
>> Processing section 5398636033_A
>> Processing section 5398636033_B
>> Processing section 5398636033_C
>> Processing section 5398636033_D
>> Processing section 5398636033_E
>> Processing section 5398636033_F
>> Processing section 5398636033_G
>> Processing section 5398636033_H
>> Processing section 5398636033_I
>> Processing section 5398636033_J
>> Processing section 5398636033_K
>> Processing section 5398636033_L
>>> BLDataCombo = combine(BLData1, BLData2)
>>> is(BLDataCombo)
>> [1] "beadLevelData"
>>> sectionNames(BLDataCombo)
>>  [1] "5398636011_A" "5398636011_B" "5398636011_C" "5398636011_D" "5398636011_E"
>>  [6] "5398636011_F" "5398636011_G" "5398636011_H" "5398636011_I" "5398636011_J"
>> [11] "5398636011_K" "5398636011_L" "5398636033_A" "5398636033_B" "5398636033_C"
>> [16] "5398636033_D" "5398636033_E" "5398636033_F" "5398636033_G" "5398636033_H"
>> [21] "5398636033_I" "5398636033_J" "5398636033_K" "5398636033_L"
>>
>>> myMean = function(x) mean(x, na.rm = TRUE)
>>> mySd = function(x) sd(x, na.rm = TRUE)
>>> greenChannel = new("illuminaChannel", logGreenChannelTransform, illuminaOutlierMethod, myMean, mySd, "G")
>>> BSData <- summarize(BLDataCombo, list(greenChannel))
>>> str(exprs(BSData))
>>  num [1:23350, 1:24] 9.92 7.42 7.43 7.5 12.34 ...
>>  - attr(*, "dimnames")=List of 2
>>  ..$ : chr [1:23350] "ILMN_2039396" "ILMN_2040732" "ILMN_2039699"
>> "ILMN_2038916" ...
>>  ..$ : chr [1:24] "5398636011_A" "5398636033_B" "5398636011_C"
>> "5398636033_D" ...
>>
>>> colnames(exprs(BSData))
>>  [1] "5398636011_A"   "5398636033_B"   "5398636011_C"   "5398636033_D"
>>  [5] "5398636011_E"   "5398636033_F"   "5398636011_G"   "5398636033_H"
>>  [9] "5398636011_I"   "5398636033_J"   "5398636011_K"   "5398636033_L"
>> [13] "5398636011_A.1" "5398636033_B.1" "5398636011_C.1" "5398636033_D.1"
>> [17] "5398636011_E.1" "5398636033_F.1" "5398636011_G.1" "5398636033_H.1"
>> [21] "5398636011_I.1" "5398636033_J.1" "5398636011_K.1" "5398636033_L.1"
>>
>>> identical(exprs(BSData)[,1],exprs(BSData)[,13])
>> [1] FALSE
>>
>>> head(cbind(exprs(BSData)[,1],exprs(BSData)[,13]))
>>                  [,1]      [,2]
>> ILMN_2039396  9.917607  9.817626
>> ILMN_2040732  7.415167  7.436922
>> ILMN_2039699  7.432883  7.423043
>> ILMN_2038916  7.504111  7.619327
>> ILMN_1374916 12.342863 13.377692
>> ILMN_1353986  7.210915  7.211393
>>
>>
>>> sessionInfo()
>> R version 2.12.0 (2010-10-15)
>> Platform: x86_64-redhat-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] illuminaRatv1BeadID.db_1.8.0 org.Rn.eg.db_2.4.6
>> [3] RSQLite_0.9-2                DBI_0.2-5
>> [5] AnnotationDbi_1.12.0         beadarray_2.0.2
>> [7] Biobase_2.8.0
>>
>> loaded via a namespace (and not attached):
>> [1] KernSmooth_2.23-4 limma_3.4.5       tools_2.12.1
>>
>>
>>
>> Thank you!
>> Kasia
>>
>> --
>> Kasia Stepien, M.Sc. Candidate
>> Department of Medical Genetics
>> University of British Columbia
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>



-- 
Kasia Stepien, M.Sc. Candidate
Department of Medical Genetics
University of British Columbia


