[R] nls: different results if applied to normal or linearized data

Thu Mar 6 09:24:20 CET 2008

On Thu, 6 Mar 2008, Wolfgang Waser wrote:

> Thanks for your comments!
>
>> Yes.  You are fitting by least-squares on two different scales:
>> differences in y and differences in log(y) are not comparable.
>>
>> Both are correct solutions to different problems.  Since we have no idea
>> what 'x' and 'y' are, we cannot even guess which is more appropriate in
>> your context.
>
> I'm fitting metabolic rate data from small fish (oxygen consumption in
> nmol/min vs. body weight in g).
> The b coefficient is the interesting part and is generally somewhere around
> 0.75.
> The one calculated for my data using option (a) is therefore 'better' than
> (b,c), but which one is the correct one to use? Log-transformation of metabolic
> rate data is (was) normally performed to be able to determine a and b by
> simple linear regression (or even on paper).
>
>
>> The two approaches assume two different models.
>>
>>         Model (1) is y = a*x^b + E (where different errors are independent
>>         and identically --- usually normally --- distributed).
>>
>>         Model (2) is y = a*(x^b)*E (and you are usually tacitly assuming
>>         that ln E is normally distributed).
>>
>>         The point estimates of a and b will consequently be different ---
>>         although usually not hugely different.  Their distributional
>>         properties will be substantially different.
>
> So in view of my context (metabolic rate data) would Model (1) be the
> appropriate model to use?

Unlikely for a rate: those are normally viewed as being on a log scale (we
say a rate is doubled, for example).  But a residual analysis will show
if there are departures from assumptions in one or the other model.
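
A minimal sketch of such a residual analysis, using simulated data only
(the sample size, the 'true' a = 100 and b = 0.75, and the error spread are
all assumptions made for illustration, not the poster's measurements).  It
fits both models and plots residuals against fitted values; a funnel shape
in one panel would suggest that model's error assumption is doubtful.

```r
# Simulated data: values are hypothetical, chosen only for illustration.
set.seed(1)
x <- runif(60, min = 0.1, max = 5)            # stand-in for body weight (g)
y <- 100 * x^0.75 * exp(rnorm(60, sd = 0.2))  # multiplicative (log-scale) error

# Model (2): least squares on the log scale, i.e. multiplicative error in y.
fit_log <- lm(log10(y) ~ log10(x))
a_log <- 10^unname(coef(fit_log)["(Intercept)"])
b_log <- unname(coef(fit_log)["log10(x)"])

# Model (1): least squares on the original scale, i.e. additive error in y.
# The log-scale estimates are used as starting values to help convergence.
fit_add <- nls(y ~ a * x^b, start = list(a = a_log, b = b_log))
b_add <- unname(coef(fit_add)["b"])

# The two estimates of b need not agree: they solve different problems.
print(c(b_model1 = b_add, b_model2 = b_log))

# Residuals against fitted values, one panel per model.
op <- par(mfrow = c(1, 2))
plot(fitted(fit_add), resid(fit_add),
     xlab = "fitted y", ylab = "residual", main = "Model (1)")
abline(h = 0, lty = 2)
plot(fitted(fit_log), resid(fit_log),
     xlab = "fitted log10(y)", ylab = "residual", main = "Model (2)")
abline(h = 0, lty = 2)
par(op)
```

With real data, replace the simulated x and y by the measured values; the
plots are only a first check and do not replace a proper model comparison.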

Usual advice: seek local statistical help, for these are conceptual and 
not R issues.

>
>
>>> Dear all,
>>>
>>> I did a non-linear least square model fit
>>>
>>> y ~ a * x^b
>>>
>>> (a) > nls(y ~ a * x^b, start=list(a=1,b=1))
>>>
>>> to obtain the coefficients a & b.
>>>
>>> I did the same with the linearized formula, including a linear model
>>>
>>> log(y) ~ log(a) + b * log(x)
>>>
>>> (b) > nls(log10(y) ~ log10(a) + b*log10(x), start=list(a=1,b=1))
>>> (c) > lm(log10(y) ~ log10(x))
>>>
>>> I expected coefficient b to be identical for all three cases. However,
>>> using my dataset, coefficient b was:
>>> (a) 0.912
>>> (b) 0.9794
>>> (c) 0.9794
>>>
>>> Coefficient a also varied between option (a) and (b), 107.2 and 94.7,
>>> respectively.
>>>
>>> Is this supposed to happen? Which is the correct coefficient b?
>>>
>>> Regards,
>>>
>>> Wolfgang

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595