[R] geometric mean regression

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Mon Jun 6 11:48:36 CEST 2005


On 03-Jun-05 Michael Grant wrote:
> 
> I presume the reference is to the 'geometric mean
> functional regression' or the 'line of organic
> correlation' or 'reduced major axis regression'.  If
> so, this is relatively easy, almost trivial, to
> implement in R.

This somewhat contentious method is indeed trivial to
implement in R. The idea is that if you plot the two
regression lines (y on x, x on y) on the same axes
(y vertical, x horizontal), the slope of the GMR is
the geometric mean of the slopes of these two lines.

Since the slope of the y-on-x line is Sxy/Sxx, and
the slope of the x-on-y line (drawn on the same axes) is
Syy/Sxy, the magnitude of the GMR slope is therefore
sqrt(Syy/Sxx) = sd(y)/sd(x), and its sign is that of Sxy.

All three lines go through the same point, (mean(x),mean(y)).
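
To make the arithmetic concrete, here is a minimal sketch in R. The
simulated data and the helper name gmr_fit are illustrative assumptions,
not taken from any package or from the thread.

```r
# Illustrative data (simulated; not from the original thread)
set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)
y <- 3 + 1.5 * x + rnorm(50, sd = 2)

# Hypothetical helper: GMR slope and intercept
gmr_fit <- function(x, y) {
  slope <- sign(cov(x, y)) * sd(y) / sd(x)  # slope takes the sign of Sxy
  intercept <- mean(y) - slope * mean(x)    # line passes through the means
  c(intercept = intercept, slope = slope)
}

b_yx <- cov(x, y) / var(x)  # slope of the y-on-x line
b_xy <- var(y) / cov(x, y)  # slope of the x-on-y line, drawn on the same axes
fit  <- gmr_fit(x, y)

# The geometric mean of the two slopes agrees with sd(y)/sd(x)
all.equal(sqrt(b_yx * b_xy), unname(abs(fit["slope"])))
```

The sign() factor is there only so that the line slopes downward when
the correlation is negative; the magnitude is always sd(y)/sd(x).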

> Maybe it's in a package, but I never looked.

It hardly needs a package!

> I worked from Helsel's description in his classic water
> resources statistics book. See Chapter 10 here: 
> 
> http://water.usgs.gov/pubs/twri/twri4a3/

The method goes back a lot further than suggested here.
It seems it was proposed in oceanography by H. Sverdrup
in 1916, and very influentially promoted by W.E. Ricker
(e.g. Jnl Fisheries Research Board of Canada, 1973,
vol. 30, 409-434).

> Now, if you are after confidence intervals or
> prediction intervals, I haven't found anything on that
> yet. Seems that I did something a couple of years ago
> by hacking some approximate residuals using the LOC
> line and the data, and then feeding that into the CL
> and PL equations for OLS. (Be advised that I'm not a
> statistician and did that in the spirit of 
> approximation--who knows? :O) )

The uncertainty properties, and indeed the interpretation,
of this method are elusive. You can, of course, resort to
whatever stochastic modelling you choose (including simulation
and bootstrap) to estimate the variability of the slope
sd(y)/sd(x) and of any predictions you may want to make.
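
As one possible sketch of the bootstrap route, assuming the (x, y)
pairs can be treated as an independent sample (the data, the helper
name gmr_slope and the number of resamples B are all illustrative
choices):

```r
set.seed(2)
n <- 40
x <- rnorm(n)
y <- 0.8 * x + rnorm(n, sd = 0.6)  # simulated, for illustration only

# Hypothetical helper: GMR slope with the sign of the covariance
gmr_slope <- function(x, y) sign(cov(x, y)) * sd(y) / sd(x)

B <- 2000
boot_slopes <- replicate(B, {
  i <- sample.int(n, replace = TRUE)  # resample (x, y) pairs together
  gmr_slope(x[i], y[i])
})

gmr_slope(x, y)                         # point estimate
sd(boot_slopes)                         # bootstrap standard error
quantile(boot_slopes, c(0.025, 0.975))  # percentile interval
```

This only quantifies the sampling variability of sd(y)/sd(x); it does
not by itself settle what that ratio is estimating.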

However, the method shows its indeterminate side to the
extent that the relationship between y and x is loose rather
than tight.

At one extreme, where the correlation between x and y = 1,
the two regression lines (y on x and x on y) and the GMR
all coincide. No problem here.

At the other extreme, where there is no correlation, the
GMR method still gives you a definite answer (sd(y)/sd(x))
even though by normal standards there is no relationship
between y and x. In the latter case, the slope of the
GMR depends solely on the two SDs, and we may well ask
what is being estimated here (apart from the ratio of
the SDs).

(Of course, if you go back to the "primitive" definition, you
find yourself evaluating sqrt(0 * inf), which is indeterminate;
and this is a better outcome than sd(y)/sd(x), but still falls
short of telling you directly that y is independent of x).
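
A small simulation illustrates the point. This is a sketch with x and
y generated independently; the sample size and SDs are arbitrary
choices of mine.

```r
set.seed(3)
x <- rnorm(200, sd = 2)
y <- rnorm(200, sd = 5)     # generated independently of x

r    <- cor(x, y)           # near zero
b_yx <- cov(x, y) / var(x)  # y-on-x slope: near zero
b_xy <- var(y) / cov(x, y)  # x-on-y slope on the same axes: very large
gmr  <- sd(y) / sd(x)       # still a "definite" value, near 5/2

c(r = r, b_yx = b_yx, b_xy = b_xy, gmr = gmr)
```

The two regression slopes head towards 0 and infinity as r approaches
0, while their geometric mean stays put at the ratio of the SDs.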

As you approach the r=0 situation, you therefore have to be
mindful that the GMR method will appear to provide a definite
answer to a question which in reality has at best a vague
answer, i.e. there is a major problem of interpretation.

Therefore I would be suspicious of results obtained by "blind"
application of the GMR method which were not accompanied by
a good discussion of grounds why the results can be expected
to be meaningful in the particular case where it has been applied.

The GMR method seems to be well entrenched in the fisheries,
natural resources, and ecology worlds. I suspect that the reasons
for this may be partly "psychological": people are aware that
they are looking for a functional relationship, are put off
(rightly) by the existence of two regression lines, and are
not enthusiastic to tangle with the difficulties (including
the potential indeterminacy) of estimating a linear functional
relationship. The GMR provides a very simple escape route
which, no doubt in many cases, may give you as good a working
answer as you can expect.

Nevertheless, I'm inclined to the view that the linear functional
relationship is usually the best way to go. When the observed
(x,y) points depart from the "true" points on the straight line
by normally distributed amounts, the MLE of the relationship
is well defined provided the ratio of the "departure" variances
is fixed. Therefore it is possible to examine the robustness
of the estimated relationship with respect to variation in the
assumed value of this ratio. To the extent that this is 
acceptably robust within plausible variation of the ratio,
you have an adequate and reliable perspective. Otherwise,
you have to acknowledge that your information is inadequate.
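
A sketch of such a robustness check follows. It assumes normal errors
in both variables and uses the standard closed-form ML slope for a
fixed variance ratio (often called Deming regression; that name, the
helper lfr_slope, the simulated data and the grid of ratios are my own
illustrative choices).

```r
# Hypothetical helper: ML slope of the linear functional relationship,
# given lambda = var(errors in y) / var(errors in x)
lfr_slope <- function(x, y, lambda) {
  sxx <- var(x); syy <- var(y); sxy <- cov(x, y)
  d <- syy - lambda * sxx
  (d + sqrt(d^2 + 4 * lambda * sxy^2)) / (2 * sxy)
}

set.seed(4)
truth <- runif(60, 0, 10)                 # "true" x values on the line
x <- truth + rnorm(60, sd = 1)            # observed with error
y <- 2 + 0.7 * truth + rnorm(60, sd = 1)  # observed with error

lambdas <- c(0.25, 0.5, 1, 2, 4)          # assumed plausible range
slopes  <- sapply(lambdas, function(l) lfr_slope(x, y, l))
data.frame(lambda = lambdas, slope = slopes)

# Setting lambda = var(y)/var(x) reproduces the GMR slope
lfr_slope(x, y, var(y) / var(x))
sign(cov(x, y)) * sd(y) / sd(x)
```

As lambda runs from 0 to infinity the slope moves from the x-on-y
line to the y-on-x line, so the spread of the slope column over the
plausible range of lambda is a direct measure of how much the answer
depends on the assumed ratio.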

The danger of adopting a formulaic solution like GMR is that
it tends to conceal inadequacy of information!

Best wishes,
Ted.

> By coincidence I've been looking at this again
> recently. Maybe bootstrapping....
> 
> Regards,
> Michael Grant
> 
> --- Kjetil Brinchmann Halvorsen <kjetil at acelerate.com>
> wrote:
> 
>> Poizot Emmanuel wrote:
>> 
>> > Hi,
>> >
>> > is it possible to perform a geometric mean
>> regression with R ?
>> > Thanks.
>> >
>> As has been said on this list before, "This is R,
>> there is no if, only 
>> how",
>> 
>> but if you actually wanted to ask how it is
>> possible, it would help if
>> you explained what is "geometric mean regression".
>> 
>> Kjetil
>> 
>> > ------------------------------------------------
>> > Emmanuel Poizot
>> > Cnam/Intechmer
>> > B.P. 324
>> > 50103 Cherbourg Cedex
>> >
>> > Phone (Direct) : (00 33)(0)233887342
>> > Fax : (00 33)(0)233887339
>> > ------------------------------------------------
>> >
>>
>>------------------------------------------------------------------------
>> >
>> >______________________________________________
>> >R-help at stat.math.ethz.ch mailing list
>> >https://stat.ethz.ch/mailman/listinfo/r-help
>> >PLEASE do read the posting guide!
>> http://www.R-project.org/posting-guide.html
>> >
>> 
>> 
>> -- 
>> 
>> Kjetil Halvorsen.
>> 
>> Peace is the most effective weapon of mass
>> construction.
>>                --  Mahdi Elmandjra
>> 

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 06-Jun-05                                       Time: 10:20:01
------------------------------ XFMail ------------------------------



