[BioC] summarized expression values from beadarray versus GenomeStudio
Ina Hoeschele
inah at vbi.vt.edu
Tue Apr 12 16:22:45 CEST 2011
Hi Mark and Wei,
thank you very much for your suggestions.
For all of my 8 BSData objects the first dimension is 48,107 probes (47,224 gene probes, 883 control probes). The corresponding dataset produced by GenomeStudio contains 47,320 gene probes and 886 control probes, so I seem to have 96 fewer gene probes and 3 fewer control probes. I do not know why this difference arises, but the numbers do not suggest that anything is seriously wrong.
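One way to find out which probes account for the difference is to compare the probe identifiers of the two summarized data sets directly. The sketch below uses base R only; the two ID vectors and their contents are made up for illustration, and in practice they would be taken from the row names of each data set.

```r
# Hypothetical probe ID vectors (placeholders, not real Illumina IDs).
ids_beadarray    <- c("p1", "p2", "p3", "p5")
ids_genomestudio <- c("p1", "p2", "p3", "p4", "p5", "p6")

# Probes present in one summary but missing from the other.
only_genomestudio <- setdiff(ids_genomestudio, ids_beadarray)
only_beadarray    <- setdiff(ids_beadarray, ids_genomestudio)

# Probes that both summaries report; only these can be compared value by value.
shared <- intersect(ids_beadarray, ids_genomestudio)

length(only_genomestudio)
length(only_beadarray)
length(shared)
```

Looking at the annotation of the probes in `only_genomestudio` may show whether they share a property, for example being unmapped in the annotation used by beadarray.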
I would not be so worried about the discrepancy in values, but since the correlations among (control) samples (on different chips) are so much worse for Bioconductor compared to GenomeStudio (.91-.92 versus .98-.99), something must be going wrong somewhere.
Related to this, for each sample run on a BeadChip, some bead types may have failed. For all samples that are combined in a 'project' in GenomeStudio, bead types that have failed in any of these samples are excluded from the summarized data (unless one checks the impute option). I wonder how this is handled in the summarization in beadarray. Since beadarray deals with a single chip at a time, a project in beadarray would be a single chip. So if beadarray also excludes failed bead types, then different BSData objects (each representing a single chip) may contain different bead types. I need to check whether this might have affected my correlations between control samples from different chips. However, for my first batch of 8 chips, all BSData objects have the same first dimension, which is a bit smaller than the number of summarized probes from GenomeStudio.
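To rule out misaligned bead types as the cause of the lower correlations, the control samples can be matched by probe ID before correlating them. This is a minimal sketch in base R, assuming each sample is available as a numeric vector named by probe ID; `shared_cor`, `chip1` and `chip2` are hypothetical names with made-up values.

```r
# Correlate two samples using only the probes that both contain,
# matched by name rather than by position.
shared_cor <- function(x, y) {
  common <- intersect(names(x), names(y))
  cor(x[common], y[common], use = "complete.obs")
}

# Made-up intensities for a control sample on two chips.
# The chips do not report exactly the same set of probes.
chip1 <- c(p1 = 100, p2 = 250, p3 = 80, p4 = 500)
chip2 <- c(p2 = 240, p3 = 90, p4 = 520, p5 = 60)

shared_cor(chip1, chip2)
```

If the correlation computed this way is clearly higher than the one computed by row position, the two data sets were not in the same probe order.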
Ina
----- Original Message -----
From: "Mark Dunning" <mark.dunning at gmail.com>
To: "Ina Hoeschele" <inah at vbi.vt.edu>
Cc: bioconductor at stat.math.ethz.ch
Sent: Thursday, April 7, 2011 5:33:09 AM
Subject: Re: summarized expression values from beadarray versus GenomeStudio
Hi Ina,
Nothing seems to be wrong with your approach and it should re-create
the BeadStudio intensities. We tried it out on some of our own data
and managed to get very close to the BeadStudio values.
Does the number of observations reported by beadarray and GenomeStudio
agree? What are the dimensions of your BSData object, and are they what
you are expecting? It could be that summarize is incorrectly trying to
combine data from multiple strips.
Best,
Mark
On Mon, Apr 4, 2011 at 11:13 PM, Ina Hoeschele <inah at vbi.vt.edu> wrote:
> Hi Mark et al.,
> I have calculated correlations among the expression vectors of different samples (in particular for a control sample that we use on each BeadChip), both for the expression data that I have processed in Bioconductor using the beadarray package and for the expression data produced by GenomeStudio (selecting quantile normalization). The correlations (especially for the control samples from different chips) are clearly worse for the Bioconductor processed data and I have been trying to track down where I have a problem.
>
> I also have the summarized (bead-type) intensities from GenomeStudio without normalization. I obtain the corresponding summarized values from beadarray with the following code
>
> myMean = function(x) mean(x, na.rm = TRUE)
> mySe = function(x) sd(x, na.rm = TRUE)/sqrt(length(x))
> GreenChannelTransform <- function (BLData, array)
> {
> x = getBeadData(BLData, array = array, what = "Grn")
> return(x)
> }
> greenChannel = new("illuminaChannel",GreenChannelTransform,illuminaOutlierMethod,myMean,mySe,"G")
>
> for (iChip in 1:nChips)
> {
> setwd(Chip.Dir[iChip])
> BLData = readIllumina(useImages=FALSE, illuminaAnnotation="Humanv4")
> BSData <- summarize(BLData,list(greenChannel),useSampleFac=TRUE,sampleFac=NULL,removeUnMappedProbes=TRUE)
> save(BSData,file="BSData.rda")
> rm(BLData); rm(BSData); gc()
> }
>
>
> If the data are summarized in this way using Bioconductor/beadarray, would you not expect the summarized values to be identical to those from GenomeStudio?
>
> I checked the summarized value for one beadtype on the first several sections of chip 1.
> The summary values from GenomeStudio are: 77.93, 159.16, 174.93, 131.05, 484.39
> The summary values from beadarray are: 90.0, 192.0, 188.5, 157.0, 492.0
> (I also calculated the first summary value by hand and come up with 103.36!)
>
> Why are these values different, any hint?
>
> Many thanks as always, Ina
>
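The hand-calculated value of 103.36 in the quoted message is higher than either summarized value, which would be consistent with a plain mean that keeps outlying beads. The sketch below shows how removing beads more than three MADs from the median can lower a bead-type mean. It is an illustration only: `trimmed_mean_mad` is a hypothetical helper, the bead intensities are made up, and whether GenomeStudio and `illuminaOutlierMethod` apply the rule on the same scale is not established in this thread.

```r
# Mean of the values lying within n MADs of the median.
trimmed_mean_mad <- function(x, n = 3) {
  x <- x[!is.na(x)]          # drop missing bead intensities
  med <- median(x)
  spread <- mad(x)           # R's mad() scales by 1.4826 by default
  keep <- abs(x - med) <= n * spread
  mean(x[keep])
}

# Made-up bead intensities for one bead type, with one bright outlier.
beads <- c(70, 75, 78, 80, 82, 85, 400)

mean(beads)              # plain mean, pulled up by the outlier
trimmed_mean_mad(beads)  # mean after dropping the outlier
```

Running both calculations on the raw intensities of the bead type in question would show whether outlier handling explains part of the gap.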