[BioC] vsn2 and print-tips

Mon Mar 10 20:15:27 CET 2008

Dear Hans-Ulrich et al.,

vsn >= 3.4.11 now computes separate offsets for different strata (e.g.
print-tip groups), as suggested by Hans-Ulrich earlier in this thread.
It is available at
http://www.bioconductor.org/packages/2.2/bioc/html/vsn.html

Best wishes
 Wolfgang

------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber

04/03/2008 16:52 Wolfgang Huber wrote:
> Dear Hans-Ulrich,
> 
> thank you for your thoughtful message! The executive summary: this is
> indeed an (unintended) difference between vsn and vsn2, and I will
> update vsn2 before the next release. It only affects applications with
> multiple strata (print-tip groups).
> 
> Background: the error and normalisation model of vsn is invariant under
> an overall scaling of the data: if you multiply all intensities by a
> factor of 10, you will get the same output - except for an overall shift
> on the glog2 scale of log2(10). This makes sense because microarray data
> don't have units and a value of "200" can mean very different things say
> on an Affymetrix genechip and on a custom-made array.
> 
> This explains why there is this 'arbitrary' offset c. It is computed
> through an explicit formula from the b's (i.e. the scale factors),
> so whether your actual data contain instances of large x does not
> directly matter (it may matter indirectly, by affecting how the b's
> are estimated). For x -> infinity, the function glog2(f(b)*x+a)
> approaches log2(x) + log2(f(b)) + log2(2), and c is computed to cancel
> out the last two terms, so that for large x, the net transformation
> resembles log2(x). There is one b for each array and stratum (=print tip
> group). The current implementation of vsn2 computes one single value c
> by taking the mean of log2(f(b)) + log2(2) across all strata and arrays.
> The old vsn computed c from the b's of the first array only, but
> separately for each stratum.
> 
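The effect of the two ways of computing c can be illustrated numerically. The following Python sketch is not vsn's implementation: the scale factors (standing in for f(b)), the additive parameter `a`, and the stratum names are made-up values, and the sign convention c = -(log2(f(b)) + log2(2)) is read off the description above.

```python
import math

def glog2(u):
    """Generalised logarithm to base 2: asinh(u) / log(2)."""
    return math.asinh(u) / math.log(2.0)

def offset(scale):
    """Offset that cancels log2(scale) + log2(2) for large x."""
    return -(math.log2(scale) + 1.0)

# Hypothetical scale factors f(b) for two strata on one array.
scales = {"tip1": 0.01, "tip2": 0.04}
a = 0.5    # hypothetical additive parameter
x = 1e6    # a large raw intensity

# One offset per stratum: both strata end up close to log2(x).
per_stratum = {k: glog2(s * x + a) + offset(s) for k, s in scales.items()}

# One global offset (mean over strata): the strata stay shifted
# against each other by the log2 ratio of their scale factors.
c_global = sum(offset(s) for s in scales.values()) / len(scales)
with_global = {k: glog2(s * x + a) + c_global for k, s in scales.items()}

gap = with_global["tip2"] - with_global["tip1"]
```

With these made-up numbers the per-stratum version brings both strata close to log2(x), while the single global offset leaves `gap` close to log2(0.04 / 0.01) = 2, which is the kind of constant shift between print-tip groups reported in this thread.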
> I had not anticipated that the variation between strata could have such
> an effect, but given your observations, and with more thought about
> it, it does make sense. I will update vsn2 to compute c by averaging
> over the arrays, but separately for each stratum.
> 
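A minimal sketch of the proposed rule (average over arrays, separately for each stratum) might look as follows in Python. The function name `stratum_offsets`, the nested-dictionary layout, and the numbers are hypothetical and chosen for illustration only; the actual vsn2 code may organise this differently.

```python
import math

def stratum_offsets(scales):
    """Return one offset c per stratum, averaged over arrays.

    scales[array][stratum] is assumed to hold the positive scale
    factor f(b) for that array and stratum.
    """
    strata = next(iter(scales.values())).keys()
    n_arrays = len(scales)
    return {
        k: -sum(math.log2(arr[k]) + 1.0 for arr in scales.values()) / n_arrays
        for k in strata
    }

# Hypothetical scale factors for two arrays and two print-tip groups.
scales = {
    "array1": {"tip1": 0.5, "tip2": 2.0},
    "array2": {"tip1": 2.0, "tip2": 8.0},
}
c = stratum_offsets(scales)
```

Here each stratum gets its own c, so a systematic difference in scale between print-tip groups is cancelled within each group rather than averaged away across groups.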
> Best wishes
>  Wolfgang
> 
> ------------------------------------------------------------------
> Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber
> 
> 
> 04/03/2008 16:38 Hans-Ulrich Klein wrote:
>> Dear all,
>>
>> I use the vsn2 method to normalize single-colour arrays with 48 
>> print-tips (25*26 oligos per print-tip). After normalization, the 
>> intensities of the 48 print-tips are in different ranges. The grid of 
>> the print-tips can be seen clearly on false color representations of the 
>> arrays' spatial distributions of feature intensities. However, scale and 
>> location of the intensities of a print-tip do not change across arrays.
>>
>> The man page of vsn2 says:
>> "The data are returned on a glog scale to base 2. More precisely,
>> the transformed data are subject to the transformation
>> glog2(f(b)*x+a) + c, where glog2(u) = log2(u+sqrt(u*u+1)) =
>> asinh(u)/log(2) is called the generalised logarithm, a and b are
>> the fitted model parameters (see references), f is a parameter
>> transformation [4], and the overall constant offset c is computed
>> from b such that for large x the transformation approximately
>> corresponds to the log2 function."
>>
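The two forms of glog2 quoted from the man page, and its behaviour for large arguments, can be checked with a few lines of Python. This is only a numerical illustration using the standard `math` module; the function names are invented here and are not part of vsn.

```python
import math

def glog2_sqrt(u):
    """glog2 written as log2(u + sqrt(u*u + 1))."""
    return math.log2(u + math.sqrt(u * u + 1.0))

def glog2_asinh(u):
    """glog2 written as asinh(u) / log(2)."""
    return math.asinh(u) / math.log(2.0)

# The two expressions agree for non-negative arguments.
for u in (0.0, 0.5, 10.0, 1e4):
    assert math.isclose(glog2_sqrt(u), glog2_asinh(u),
                        rel_tol=1e-9, abs_tol=1e-12)

# For large u, glog2(u) exceeds log2(u) by about log2(2) = 1.
gap_large = glog2_asinh(1e6) - math.log2(1e6)

# At u = 0, glog2 is finite, whereas log2 would diverge.
value_at_zero = glog2_asinh(0.0)
```

The constant gap of about 1 for large arguments is the log2(2) term that, together with the scale-factor term, the offset c is meant to remove.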
>> Maybe there are not enough "large x" in some print-tips due to missing
>> values in my data. I observed that reducing the number of oligos leads
>> to even larger differences in the print-tip offsets. Are there
>> parameters that influence the computation of c? Has anyone else
>> observed this problem? The older "vsn" function does not lead to
>> different print-tip offsets.
>>
>> Regards,
>> Hans-Ulrich
>>
>>
>>
>>
>>  > sessionInfo()
>> R version 2.6.2 (2008-02-08)
>> x86_64-pc-linux-gnu
>>
>> locale:
>> C
>>
>> attached base packages:
>> [1] tools     stats     graphics  grDevices utils     datasets  methods 
>> [8] base    
>>
>> other attached packages:
>> [1] vsn_3.2.1            limma_2.12.0         affy_1.16.0        
>> [4] preprocessCore_1.0.0 affyio_1.6.1         Biobase_1.16.3     
>>
>> loaded via a namespace (and not attached):
>> [1] grid_2.6.2      lattice_0.17-4  rcompgen_0.1-17
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 