[R] Calculating with tolerances (error propagation)

David Winsemius dwinsemius at comcast.net
Fri Sep 10 15:37:00 CEST 2010


On Sep 9, 2010, at 10:57 AM, David Winsemius wrote:

>
> On Sep 9, 2010, at 6:50 AM, Jan private wrote:
>
>> Hello Bernardo,
>>
>> ---------
>> If I understood your problem, this script solves it:
>>
>> q<-0.15 + c(-.1,0,.1)
>> h<-10 + c(-.1,0,.1)
>> 5*q*h
>> [1]  2.475  7.500 12.625
>> ---------
>>
>> OK, this solves the simple example.
>> But what if the example is not that simple? E.g.,
>>
>> P = 5 * q/h
>>
>> Here, to get the maximum tolerances for P, we need to divide the
>> maximum value for q by the minimum value for h, and vice versa.
>> Is there any way to do this automatically, without thinking about
>> every single step?
>>
>> There is a thing called interval arithmetic (I saw it as an Octave
>> package) which would do something like this.
>
> (Sorry for the blank reply posting. Serum caffeine has not yet  
> reached optimal levels.)
>
> Is it possible that interval arithmetic would produce statistically
> incorrect tolerance calculations, and that this is why it has not
> been added to R? Those tolerance intervals are presumably some sort
> of (unspecified) prediction intervals (i.e., ones that contain 95%
> or 63% or some fraction of a large sample), and their combinations
> under mathematical operations are not going to be properly derived
> by c( min(XY), max(XY) ), since those endpoints are not calculated
> with any understanding of how variances of functions of random
> variables combine.
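
To illustrate that worry, here is a minimal hand-rolled sketch of the
worst-case bounds for the poster's P = 5*q/h example (just the manual
pairing of extremes, not an interval-arithmetic package):

 > # Worst-case bounds for P = 5*q/h: the widest ratio pairs the
 > # extremes in opposite directions, min(q)/max(h) and max(q)/min(h).
 > q <- 0.15 + c(-0.1, 0, 0.1)
 > h <- 10 + c(-0.1, 0, 0.1)
 > c(5 * min(q)/max(h), 5 * q[2]/h[2], 5 * max(q)/min(h))
[1] 0.02475248 0.07500000 0.12626263

Those endpoints are hard bounds with no probability statement
attached, which is exactly the concern above.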

There is a function, propagate, in the qpcR package that does
incorporate statistical principles in handling error propagation.
Thanks to the author, Dr. rer. nat. Andrej-Nikolai Spiess, for drawing
it to my attention (and, of course, for writing it). It appears that
it should handle the data situation offered with only minor
modifications. The first example below is very similar to your (more
difficult) ratio problem.
 > install.packages(pkgs = "qpcR", type = "source")
# (binary install did not succeed on my Mac, but installing from
# source produced no errors or warnings)
 > require(qpcR)
 > q <- c(0.15, 0.1)  # mean and standard deviation of q
 > h <- c(10, 0.1)    # mean and standard deviation of h
 > EXPR <- expression(5 * q/h)
 > DF <- cbind(q, h)  # one column per variable: row 1 = mean, row 2 = sd
 > res <- propagate(expr = EXPR, data = DF, type = "stat",
+                   do.sim = TRUE, verbose = TRUE)
 > res$summary
                    Sim Perm        Prop
Mean        0.07500751  NaN  0.07500000
s.d.        0.05001970   NA  0.05000562
Median      0.07445724   NA          NA
MAD         0.04922935   NA          NA
Conf.lower -0.02332498   NA -0.02300922
Conf.upper  0.17475818   NA  0.17300922

(My only suggestion for enhancement would be a print or summary method  
that did not output every single simulated value.)

Three methods of error propagation estimation are included: a) Monte
Carlo simulation, sampling from the data; b) permutation; and c)
Gaussian error propagation calculated via a Taylor series expansion.
There is a rich set of worked examples. It appears to be capable of
meeting a wide variety of challenges.
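
To make the "Prop" column above concrete, here is a minimal sketch of
textbook first-order Taylor (Gaussian) error propagation for
P = 5*q/h, assuming independent errors in q and h. This is just the
standard delta-method formula, not a claim about qpcR's internals:

 > # First-order Taylor propagation for P = 5*q/h, independent errors:
 > # var(P) ~ (dP/dq)^2 * var(q) + (dP/dh)^2 * var(h)
 > q.mean <- 0.15; q.sd <- 0.1
 > h.mean <- 10;   h.sd <- 0.1
 > dPdq <- 5/h.mean                # partial derivative of P wrt q
 > dPdh <- -5 * q.mean/h.mean^2    # partial derivative of P wrt h
 > P.mean <- 5 * q.mean/h.mean
 > P.sd <- sqrt((dPdq * q.sd)^2 + (dPdh * h.sd)^2)
 > c(P.mean, P.sd)
[1] 0.07500000 0.05000562
 > P.mean + c(-1, 1) * qnorm(0.975) * P.sd
[1] -0.02300922  0.17300922

The normal-theory 95% limits reproduce the Conf.lower and Conf.upper
values of the Prop column in the summary above.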

-- 
David.

>
> -- 
> David.
>>
>> I would have thought that tracking how a (measuring) error propagates
>> through a complex calculation would be a standard problem of
>> statistics?
>
> In probability theory, anyway.
>
>> In other words, I am looking for a data type which is a number
>> with a deviation +- somehow attached to it, with binary operators
>> that automatically know how to handle the deviation.
>
> There is a suite of packages that represents theoretical random
> variables and supports mathematical operations on them.
>
> See distrDoc and the rest of that suite.
>
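
As a minimal sketch of that last suggestion (assuming the distr
package from that suite; distribution objects support arithmetic, and
the p()/d()/q()/r() accessors return the corresponding functions):

 > library(distr)
 > Q <- Norm(mean = 0.15, sd = 0.1)
 > H <- Norm(mean = 10, sd = 0.1)
 > S <- Q + H             # distribution of the sum, by convolution
 > r(S)(3)                # three random draws from S
 > q(S)(c(0.025, 0.975))  # central 95% interval of Norm(10.15, sqrt(0.02))

Note that arithmetic between whole distribution objects covers sums
and differences of independent variables; a general ratio like 5*q/h
would need more than this quick sketch.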


David Winsemius, MD
West Hartford, CT


