[R-sig-eco] Using residuals as dependent variables

Fri Jun 22 15:04:48 CEST 2012

Steve and all others who have made suggestions,

Thanks for this. I am busy reading the suggested papers and investigating
various packages.

The reason I wasn't fitting one big model is that I am interested in why some
points don't "conform" to the oceanographic model, which in theory should
explain ~99% of the variance. My reasoning has been: this is the best model
I can make and it should perform very well, but it does not, so why not?

In essence I am trying to tease out what factors stop this model from
performing as it should in theory - the problem is the factors are likely to
be different for most observations.

I have fitted a big model (all oceanographic, socioeconomic etc. variables) and
I get a good fit (AIC-based selection, R-squared 82%), but this doesn't indicate
why the oceanographic model performed poorly, beyond saying it is due
to socioeconomic factors. Furthermore, I would like to be able to say that
socioeconomic factors stop, e.g., Australia from reaching the catch it
theoretically could given the productivity of the surrounding oceans, whilst
political reasons stop Somalia from reaching its theoretical maximum catch,
and this is where I am struggling.
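
One possible way to get a per-country breakdown out of the single big model is
to split each fitted value into one term per predictor, b_j * (x_ij - mean_j).
The sketch below is only an illustration of that idea: the data are simulated,
the variable names ("npp", "governance") are hypothetical, and the small
least-squares solver stands in for whatever fitting routine is actually used.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(X, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def contributions(X, coefs):
    """Per-observation, per-predictor terms b_j * (x_ij - mean_j)."""
    n, p = len(X), len(X[0])
    means = [sum(r[j] for r in X) / n for j in range(p)]
    return [[coefs[j + 1] * (r[j] - means[j]) for j in range(p)] for r in X]


# Simulated stand-in data: 52 "countries", two hypothetical predictors.
random.seed(1)
n = 52
names = ["npp", "governance"]  # hypothetical variable names
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
y = [2.0 + 1.5 * a - 0.8 * b + random.gauss(0, 0.5) for a, b in X]

coefs = ols(X, y)
contrib = contributions(X, coefs)
y_mean = sum(y) / n
fitted = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in X]

# For any one country, mean catch plus its terms reproduces its fitted value.
first = dict(zip(names, contrib[0]))
```

Each row of contrib shows how far each predictor pushes that country above or
below the average fitted catch. When predictors are correlated the split is
descriptive only and should not be read as a causal attribution.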

Chris

-----Original Message-----
From: Steve Brewer [mailto:jbrewer at olemiss.edu] 
Sent: 22 June 2012 13:48
To: Chris Mcowen; r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] Using residuals as dependent variables

Chris,

Another thing to keep in mind is that when you run the regression analysis
using residuals, as opposed to putting all predictors in the multiple
regression from the beginning (oceanographic data and productivity data),
you are in effect inflating the error df for the analysis of catch residuals
against productivity. In the multiple regression approach, one df is removed
from the error df for every predictor variable in the model.
When you run it as two separate analyses, as you propose, the df removed
from the error df in the first analysis (the one with oceanographic data)
are put back into the error df for the second analysis of catch
residuals vs productivity. This is usually not a big deal when the first
analysis contains only one or two predictors and lots of observations. But
when the reverse is true, you're more likely to get a significant
relationship between catch residuals and productivity even when none really
exists.
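
To put rough numbers on the df bookkeeping above, here is a small sketch. It
assumes 52 observations and 6 oceanographic predictors (taken from the original
post; the AIC-selected model may well retain fewer) and a single extra
predictor in the second step. The function name error_df is just illustrative.

```python
def error_df(n_obs, n_predictors):
    """Residual (error) degrees of freedom for a regression with an intercept."""
    return n_obs - n_predictors - 1


n = 52       # countries, from the original post
p_ocean = 6  # assumed number of oceanographic predictors kept in the model
p_extra = 1  # one additional predictor tested against the residuals

# First analysis: catch ~ oceanographic predictors
df_stage1 = error_df(n, p_ocean)            # 52 - 6 - 1 = 45

# Second analysis run naively: residuals ~ extra predictor
df_stage2_naive = error_df(n, p_extra)      # 52 - 1 - 1 = 50

# Single multiple regression: catch ~ oceanographic + extra predictor
df_joint = error_df(n, p_ocean + p_extra)   # 52 - 7 - 1 = 44

# The df spent in the first analysis reappear as error df in the second.
inflation = df_stage2_naive - df_joint      # 6
```

With these assumed counts the two-step route tests the extra predictor against
50 error df rather than 44, and the gap grows with every predictor used in the
first analysis.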

As others have suggested, why not put productivity and oceanographic data
together in a single multiple regression model?

Hope this helps.

Steve

J. Stephen Brewer
Professor
Department of Biology
PO Box 1848
University of Mississippi
University, Mississippi 38677-1848
Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077

On 6/21/12 12:06 PM, "Chris Mcowen" <chrismcowen at gmail.com> wrote:

>Dear List,
>
> 
>
>I am wondering if the methodological approach I am taking is correct 
>and would be very grateful if people could comment and make 
>suggestions, much appreciated.
>
> 
>
>I have developed the best model (AIC model selection) using
>oceanographic data (i.e. SST, chlorophyll, NPP... x6) to predict
>reported fisheries catch for 52 countries.
>
> 
>
>I then extract the residuals from the model. A country with a positive
>residual has a higher catch than would be predicted given its level of
>productivity, and the opposite is true for a negative residual.
>
> 
>
>What I want to do is:
>
> 
>
>1. Regress the residuals from the oceanographic model on a suite of
>ecological and socioeconomic variables to determine which factors
>cause some countries to be above and some below the prediction, e.g. as
>trophic level increases the residuals become increasingly negative.
>
>2. Ideally (and I am unsure how or if it is possible) work out for
>each country which variable(s) cause the poor fit of that country to
>the oceanographic model.
>
> 
>
>Thanks in advance for any suggestions / possible methods.
>
> 
>
>Chris
>
> 
>
>P.S. Below is the type of conclusion I am drawing.
>
> 
>
>There are a number of reasons why some countries have higher / lower 
>catch than you would expect.
>
> 
>
>For example, if the target fishery is a high trophic level species then
>the link between primary productivity and catch will be weaker than if
>the species were at a lower trophic level (transfer efficiency etc.),
>resulting in a negative residual. Alternatively, it may be that the area
>is being overfished, e.g. the North Sea, meaning more fish are being
>caught in that region than it can sustain, resulting in a high
>positive residual (as predicted by the model).
>
> 
>
>In reality it is likely a combination of these plus other factors; however,
>some factors will be relevant only to particular countries, e.g. Somalia has
>a really low catch compared to its productivity, likely due to piracy and
>poor reporting of statistics.
>
>_______________________________________________
>R-sig-ecology mailing list
>R-sig-ecology at r-project.org
>https://stat.ethz.ch/mailman/listinfo/r-sig-ecology