[R] Numerical precision of hist densities or cumsum?

Paul E Johnson pauljohn at ukans.edu
Thu Oct 18 18:30:33 CEST 2001

Today I wanted to experiment with different distributions an to see the
hazard rates they imply. So I eventually ended up with this, which uses
the hist object's handy $density and the cumsum function in R:

x <- c(rweibull(21000,0.5,0.7))
#create "breaks" vector to go into histogram
#need last break bigger than max(x)
y <- seq(0,max(x)+2)
histx <- hist(x,freq=FALSE,breaks=y)
cumProb<- cumsum(histx$density)
survival <- 1-cumProb
survival<- c(1,survival)  
survival <- survival[1:length(histx$density)]
hazard <- histx$density / survival
retention<- 1-hazard

This works the way I expect, except that there is a little numerical
imprecision in the high end of the values for the cumulative
probability.  The vector cumProb does not have 1.0 as its last element,
but rather something close, like 0.99997 or such. I know the values in
cumProb are correct for the left end of the vector, just not the right
end.  That has implications for the calculation of hazard and retention
curve values.

Do you have any tips for me? 
Paul E. Johnson                       email: pauljohn at ukans.edu
Dept. of Political Science            http://lark.cc.ukans.edu/~pauljohn
University of Kansas                  Office: (785) 864-9086
Lawrence, Kansas 66045                FAX: (785) 864-5700
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch

More information about the R-help mailing list