[R] nls newbie: help approximating Weibull distribution

Douglas Bates bates at stat.wisc.edu
Mon Jul 2 16:53:06 CEST 2001


"Dr. Mirko Luedde" <Mirko.Luedde at computer.org> writes:

> Hi folks, 
> 
> I tried to retain the Weibull distribution using the `nls' function
> and proceeding along the lines of the example provided in the
> `SSweibull' help (at least I thought so):
> 
> 	t <- (1:200)/100
> 	v <- pweibull(t, shape=3, scale=1)
> 	df <- data.frame(Time=t, Value=v)
> 	Asym <- 1.0; Drop <- 1.0; lrc <- 0; pwr <- 1
> 	df.estimate <- nls(Value ~ SSweibull(Time, Asym, Drop, lrc, pwr), data=df)
> 
> But this yields the error message "Error in qr.solve(QR.B, cc) :
> singular matrix `a' in solve".
> 
> Any hints? 

This error is occuring during the calculation of the initial
estimates for the nonlinear least squares fit.  You have an exact
fit.  The convergence criterion used by nls is not appropriate for
such artificial data.  Use traceback() to find where the error occurs.

The larger question is why are you doing this in the first place?  You
have asked on this list about using SSweibull to fit the parameters in
a Weibull probability distribution.  I have responded to you privately
on at least three occasions.  Others have responded to you with copies
to the list.  All of these responses have informed you that SSweibull
is used with nls to fit the Weibull growth curve model.  This is
related to the Weibull probability distribution but it is not the
same.  Using SSweibull and nls is not an appropriate way of fitting
the parameters in the Weibull probability distribution.

For some reason you choose to ignore these responses and continue to
ask on this list about methods of using SSweibull and nls to fit the
parameters in the Weibull probability distribution.  I don't see that
there is much point in me or anyone else responding to you when you
ignore everything we say but I will try one more time.

It is not a good idea use nls to estimate parameters in a probability
distribution.  The nonlinear least squares criterion is appropriate
for observations in which the noise or error term is independent with
constant variance.  Apparently you are planning to fit a curve to
something like an empirical cumulative distribution function, although
in your example you are using an exact cumulative distribution
function.  The assumptions of constant variance and independence are
not even close to being satisfied.  Furthermore, it is hideously
inefficient to fit a four parameter curve to an empirical cumulative
distribution function when you know the values of two of those
parameters.

So, for at least the fourth time, I will tell you that you should not
use nls and SSweibull to fit the parameters in a Weibull probability
distribution.

A much more reasonable approach would be to use maximum likelihood.
Let me illustrate this.  The data from example 4.30 in Jay Devore's
book "Probability and Statistics for Engineering and the Sciences" is
the "lifetimes (hr) of power apparatus insulation under thermal and
electrical stress."

> data(xmp04.30, package = Devore5)
> xmp04.30$lifetime
 [1]  282  501  741  851 1072 1122 1202 1585 1905 2138
> stem(xmp04.30$lifetime)   # a stem-and-leaf plot

  The decimal point is 3 digit(s) to the right of the |

  0 | 3
  0 | 579
  1 | 112
  1 | 69
  2 | 1

The likelihood for the parameters "shape" and "scale" in the Weibull
distribution is the product of the probability densities evaluated at
the observed data and those parameters.  Generally we optimize the
log-likelihood as it tends to be closer to being a quadratic function
of the parameters.  The log-likelihood is the sum of the
log-densities.  To optimize this we use optim, which requires starting
estimates for the parameters.  For starting values we could use shape
= 1 and scale = 1000 for these data.

By default, optim will minimize a function so we define our function
to be the negative of the log-likelihood.

> optim(c(shape = 1, scale = 1000),
        function(x) -sum(dweibull(xmp04.30$lifetime, shape = x[1],
                                  scale = x[2], log = TRUE)))
$par
      shape       scale 
   2.151617 1289.216504 
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



More information about the R-help mailing list