[BioC] read.ilmn function query

Wei Shi shi at wehi.EDU.AU
Sun Jul 17 08:27:27 CEST 2011


Hi Natasha,

	Just adding to Gordon's reply: the "detection" columns in the read.ilmn output are always the same as those in the GenomeStudio/BeadStudio output. The read.ilmn function does not change the original detection p-values or detection scores.
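
Since the values are passed through unchanged, the direction of the filter threshold depends on what was exported. A minimal sketch in base R, using made-up numbers and assuming that a detection score is simply 1 minus the detection p-value (an assumption for illustration, so please check your own export):

```r
# Made-up detection p-values for 4 probes x 3 samples (illustration only).
pval <- matrix(c(0.001, 0.20,  0.60,
                 0.30,  0.40,  0.90,
                 0.04,  0.03,  0.01,
                 0.50,  0.049, 0.70),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("probe", 1:4), c("A1", "A2", "A3")))

# Assumption: the exported detection score is 1 - p-value.
score <- 1 - pval

# Keep a probe if it is detected in at least one sample.
expressed_p <- apply(pval  < 0.05, 1, any)   # file holds detection p-values
expressed_s <- apply(score > 0.95, 1, any)   # file holds detection scores

# Under the assumption above, both filters select the same probes.
identical(expressed_p, expressed_s)
```

With real data the same two lines would be applied to y$other$Detection, choosing the one that matches the kind of value in the file.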

Cheers,
Wei


On Jul 17, 2011, at 10:39 AM, Gordon K Smyth wrote:

> Hi Natasha,
> 
>> Date: Fri, 15 Jul 2011 18:03:59 +0100
>> From: Natasha Sahgal <nsahgal at well.ox.ac.uk>
>> To: bioconductor at r-project.org
>> Subject: [BioC] read.ilmn function query
>> 
>> Dear List,
>> 
>> Normally for Illumina arrays, instead of the functions given in
>> the limma user guide (e.g. neqc, read.ilmn etc.), I use:
>> 
>>   * read.delim - to load probe profile data and sample table control
>>     data respectively
>>   * perform bg correction using the negative control probes from the
>>     sample table control
>>   * filter data based on _"detection scores"_
>>   * normalise data using the _"vsn2"_ function
>> 
>> 
>> However, as I have just realised that these can be used I have some queries:
>> 
>>  1. Will there be much difference between the quantile normalisation
>>     in the neqc function (as compared to vsn2 ?)
> 
> The neqc() strategy is different from that of vsn, not only in terms of normalization, but also in terms of background correction and variance stabilization.  There are some parallels however in the mathematical theory between normexp background correction and the vsn transformation.  How different the practical results will be though, I don't know.  We compared neqc() to vst and other strategies that have been proposed for Illumina BeadChip data in the literature, but vsn wasn't one of those.
> 
>>  2. How does one interpret the boxplots for the various controls
>>     (apart from x$genes$Status=="regular")?
>>         * as the median/mean vary a lot
>>         * much more for my samples (than the example shown in the user
>>           guide)
> 
> This is a property of your data.  If the boxplots vary a lot, then there must be a lot of variability in your data.
> 
>>  3. When filtering: based on the help of read.ilmn
>>         * The "Detection" column appears to be detection p-value by
>>           default
>>         * What does one do if the output is different from the
>>           GenomeStudio and it gives a "Detection Score" instead??
>>               o Would: expressed <- apply(y$other$Detection < 0.05,1,any)
>>                     + change to: expressed <- apply(y$other$Detection > 0.95,1,any)
> 
> Yes.
> 
>>  4. Also, I do not fully understand the estimation of probes expressed
>>     using the propexpr function
>>         * one of my samples A7 shows 0.0 (I see that the housekeeping
>>           gene intensity for this is ~ 200 whereas for others it's
>>           1000+), it's a similar case for samples A11 and A12
>>               o propexpr(x)
>>                        A1        A2        A7        A8        A3        A4       A11       A12
>>                 0.3380243 0.4066500 0.0000000 0.4232871 0.3131936 0.3819055 0.1934197 0.2036340
>>                        A5        A6        A9       A10
>>                 0.3363844 0.3476216 0.3445201 0.3834617
> 
> This seems to flag a possible problem with your sample A7.  The regular probes (the majority of them anyway) are no brighter than background probes.  This could suggest a problem with the RNA extraction, for example, in this case.  The proportion of expressed probes might not be truly zero, but the spread of intensities must be different from that usually seen for a good quality array.
> 
> Best wishes
> Gordon
> 
>> sessionInfo()
>> R version 2.13.0 (2011-04-13)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> 
>> locale:
>> [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>> [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>> [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8
>> [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>> 
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>> 
>> other attached packages:
>> [1] gdata_2.8.2 limma_3.8.2
>> 
>> loaded via a namespace (and not attached):
>> [1] gtools_2.6.2 tools_2.13.0
>> 
>> Many Thanks,
>> Natasha
>> 
>> 
>> 
>> --

