[BioC] read.ilmn function query
Gordon K Smyth
smyth at wehi.EDU.AU
Sun Jul 17 02:39:02 CEST 2011
Hi Natasha,
> Date: Fri, 15 Jul 2011 18:03:59 +0100
> From: Natasha Sahgal <nsahgal at well.ox.ac.uk>
> To: bioconductor at r-project.org
> Subject: [BioC] read.ilmn function query
>
> Dear List,
>
> Normally for Illumina arrays, instead of the functions given in
> the limma user guide (e.g. neqc, read.ilmn etc.), I use:
>
> * read.delim - to load probe profile data and sample table control
> data respectively
> * perform bg correction using the negative control probes from the
> sample table control
> * filter data based on _"detection scores"_
> * normalise data using the _"vsn2"_ function
>
>
> However, as I have just realised that these can be used I have some queries:
>
> 1. Will there be much difference between the quantile normalisation
> in the neqc function (as compared to vsn2 ?)
The neqc() strategy differs from that of vsn not only in normalization,
but also in background correction and variance stabilization. There are,
however, some parallels in the mathematical theory between normexp
background correction and the vsn transformation. How different the
practical results will be, though, I don't know. We compared neqc() to vst
and other strategies that have been proposed in the literature for
Illumina BeadChip data, but vsn wasn't one of those.
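
For reference, the limma route discussed here can be sketched as follows.
This is only a minimal illustration on simulated intensities, not real
BeadChip data: the probe counts, intensity distributions and array names
are all made up, and with real data the EListRaw object would normally
come from read.ilmn() rather than being built by hand.

```r
library(limma)

set.seed(2011)
n_regular  <- 2000   # hypothetical number of regular probes
n_negative <- 200    # hypothetical number of negative control probes
n_arrays   <- 4

# Simulated raw intensities (assumption): negative controls are pure
# background, regular probes are background plus an exponential signal.
background <- function(n) rnorm(n, mean = 100, sd = 10)
E <- rbind(
  matrix(background(n_regular * n_arrays) +
           rexp(n_regular * n_arrays, rate = 1 / 300),
         nrow = n_regular, ncol = n_arrays),
  matrix(background(n_negative * n_arrays),
         nrow = n_negative, ncol = n_arrays)
)
colnames(E) <- paste0("A", seq_len(n_arrays))

# neqc() looks for the probe type in x$genes$Status by default.
genes <- data.frame(
  Status = rep(c("regular", "negative"), c(n_regular, n_negative)),
  stringsAsFactors = FALSE
)
x <- new("EListRaw", list(E = E, genes = genes))

# normexp background correction using the negative controls, then
# quantile normalization and log2 transformation; control probes are
# dropped from the result.
y <- neqc(x)
dim(y)
```

The returned object holds log2 values for the regular probes only, so it
can go straight into the usual lmFit() analysis.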
> 2. How does one interpret the boxplots for the various controls
> (apart from x$genes$Status=="regular")?
> * as the median/mean vary a lot
> * much more for my samples (than the example shown in the user
> guide)
This is a property of your data. If the boxplots vary a lot, then there
must be a lot of variability in your data.
> 3. When filtering: based on the help of read.ilmn
> * The "Detection" column appears to be detection p-value by
> default
> * What does one do if the output is different from the
> GenomeStudio and it gives a "Detection Score" instead??
> o Would: expressed <- apply(y$other$Detection < 0.05,1,any)
> + change to: expressed <- apply(y$other$Detection > 0.95,1,any)
Yes.
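
A small sketch of why the two filters are equivalent, assuming the
"Detection Score" is simply 1 minus the detection p-value (that is an
assumption about the GenomeStudio output, worth checking against your own
files). The score matrix below is made up for illustration and stands in
for y$other$Detection.

```r
# Hypothetical detection scores for 3 probes on 3 arrays
# (assumed convention: score = 1 - detection p-value).
detection_score <- matrix(
  c(0.99, 0.97, 0.20,
    0.50, 0.96, 0.10,
    0.30, 0.40, 0.60),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("probe1", "probe2", "probe3"), c("A1", "A2", "A3"))
)

# Keep probes that are detected on at least one array.
expressed_by_score <- apply(detection_score > 0.95, 1, any)

# The same filter written in terms of p-values.
detection_p    <- 1 - detection_score
expressed_by_p <- apply(detection_p < 0.05, 1, any)

# Both conventions select the same probes.
stopifnot(identical(expressed_by_score, expressed_by_p))
expressed_by_score
```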
> 4. Also, I do not fully understand the estimation of probes expressed
> using the propexpr function
> * one of my samples A7 shows 0.0 (I see that the housekeeping
> gene intensity for this is ~ 200 whereas for others it's
> 1000+), it's a similar case for samples A11 and A12
> o propexpr(x)
>        A1        A2        A7        A8        A3        A4       A11       A12
> 0.3380243 0.4066500 0.0000000 0.4232871 0.3131936 0.3819055 0.1934197 0.2036340
>        A5        A6        A9       A10
> 0.3363844 0.3476216 0.3445201 0.3834617
This seems to flag a possible problem with your sample A7. The regular
probes (the majority of them anyway) are no brighter than background
probes. This could suggest a problem with the RNA extraction, for
example, in this case. The proportion of expressed probes might not be
truly zero, but the spread of intensities must be different from that
usually seen for a good quality array.
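
To illustrate what "no brighter than background" looks like, here is a
rough sketch on simulated data. It is not the estimator used by
propexpr(); it is only a crude check that counts, per array, the fraction
of regular probes above the 95th percentile of the negative controls. The
intensities, probe counts and the choice of the 95th percentile are all
assumptions made for the example.

```r
set.seed(7)
n_regular  <- 2000   # hypothetical number of regular probes
n_negative <- 500    # hypothetical number of negative controls

# Simulated truth (assumption): A1 and A2 are good arrays, A7 has failed
# and carries no signal above background.
true_fraction <- c(A1 = 0.4, A2 = 0.4, A7 = 0)

simulate_array <- function(frac) {
  n_expr <- round(frac * n_regular)
  signal <- c(rexp(n_expr, rate = 1 / 400), rep(0, n_regular - n_expr))
  list(
    regular  = rnorm(n_regular, mean = 100, sd = 10) + signal,
    negative = rnorm(n_negative, mean = 100, sd = 10)
  )
}
arrays <- lapply(true_fraction, simulate_array)

# Fraction of regular probes brighter than most negative controls.
above_background <- sapply(arrays, function(a) {
  mean(a$regular > quantile(a$negative, 0.95))
})
round(above_background, 2)
```

For the failed array the fraction stays close to the 5% expected by
chance, which is the kind of pattern that would lead propexpr() to report
a proportion near zero.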
Best wishes
Gordon
> sessionInfo()
> R version 2.13.0 (2011-04-13)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=en_GB.UTF-8
> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] gdata_2.8.2 limma_3.8.2
>
> loaded via a namespace (and not attached):
> [1] gtools_2.6.2 tools_2.13.0
>
> Many Thanks,
> Natasha