[R-sig-eco] Factors in partial RDA part 2
Andrew Halford
andrew.halford at gmail.com
Wed Feb 8 09:14:42 CET 2017
Hi Listers,
Further to my last post, I am seeking more insight into how to interpret the
effects of factors in an RDA.
I think of a factor as a single variable whose influence on observed fish
distribution patterns I would like to quantify, alongside a number of
numerical variables.
To do the analyses, this factor (called 'geom') is turned into a set of
dummy variables (seven, actually).
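As a minimal sketch of that expansion, assuming R's default treatment contrasts: the `geom` vector below is made-up stand-in data (hypothetical level names, not the real survey), used only to show how many dummy columns a 7-level factor produces and why a full set of per-level dummies is collinear with an intercept.

```r
# Made-up stand-in for 'geom': a factor with 7 levels, 3 sites per level.
geom <- factor(rep(paste0("g", 1:7), each = 3))

# Default treatment contrasts: the first level becomes the reference,
# so 7 levels yield an intercept plus 6 dummy columns.
X <- model.matrix(~ geom)
colnames(X)

# One dummy per level (no reference level dropped) gives 7 columns ...
full <- model.matrix(~ geom - 1)
ncol(full)

# ... but together with an intercept they are linearly dependent:
# 8 columns, rank 7, because the 7 dummies sum to the column of ones.
qr(cbind(1, full))$rank
```

This is why a formula interface that receives the factor itself reports one fewer dummy column than the factor has levels.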
My conceptual problem is this: when I call vif.cca to check for collinearity,
the output suggests I should remove some of the dummy variables that make up
the levels of my factor. Doing so would leave me with only 3 of the original 8
dummy variables to put into the RDA. I then no longer seem to be testing the
factor 'geom' itself, but rather a few individual variables representing
some of its levels. How should I proceed?
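To make the contrast concrete, here is a hedged sketch using vegan's bundled `dune` example data rather than the fish data (an assumption made purely for illustration; `Management` is a 4-level factor in `dune.env`). It shows that `vif.cca` reports one value per dummy column, whereas a permutation test by terms treats the factor as a single block.

```r
library(vegan)   # assumes the vegan package is installed

data(dune)
data(dune.env)   # vegan example data; Management is a 4-level factor

mod <- rda(dune ~ A1 + Management, data = dune.env)

# VIFs are reported per model-matrix column, so the factor appears as
# one entry per non-reference level.
v <- vif.cca(mod)
names(v)

# The permutation test by terms treats the factor as one block with
# (number of levels - 1) degrees of freedom.
set.seed(1)
tab <- anova(mod, by = "terms", permutations = 199)
tab
```

Whether such a term-level test suits the fish analysis is a judgment call; the sketch only illustrates that the two functions work at different levels (columns versus terms).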
I face the same conundrum when I run forward.sel to find the most
parsimonious set of explanatory variables for the final model. The procedure
selects only some of the dummy variables. Again, I struggle to see how I am
testing or including the full effect of the factor 'geom' if only a few of
its dummy variables end up in the model.
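For comparison, a sketch of term-wise forward selection with vegan's `ordiR2step`, which is a different function from forward.sel and works on formula terms, so a factor enters or stays out as a whole. The `dune` example data and the two-term scope are assumptions chosen for illustration, not the fish model.

```r
library(vegan)   # assumes the vegan package is installed

data(dune)
data(dune.env)   # vegan example data; Management is a 4-level factor

# Null model and the full model that defines the upper scope.
null_mod <- rda(dune ~ 1, data = dune.env)
full_mod <- rda(dune ~ A1 + Management, data = dune.env)

# Forward selection on whole terms; permutation p-values are random,
# so a seed is set only to make the run repeatable.
set.seed(1)
sel <- ordiR2step(null_mod, scope = formula(full_mod),
                  permutations = how(nperm = 199), trace = FALSE)

# Selected terms are named by term, never by individual dummy column.
attr(terms(sel), "term.labels")
sel$anova
```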
# Here is the model run with all the potential explanatory variables
# ('geom' is the factor with 7 levels)
> fish.env <- rda(fish.h ~ coral_cover + macroalgae + turf_algae_sqrt +
      ccc_4thrt + rubble_sqrt + reef_slope_sqrt + rugosity + exposure +
      min_d_sqrt + d_range + chl_a_log + popn_density_4throot + fp +
      protection + geom, data = env.factor)
# Collinearity assessment: the results favour dropping 3 of the 'geo'
# dummy variables, leaving only 2 for the model.
> vif.cca(fish.env)
         coral_cover           macroalgae      turf_algae_sqrt
            2.688656             3.099972             2.849219
           ccc_4thrt          rubble_sqrt      reef_slope_sqrt
            1.771637             2.411291             2.418953
            rugosity             exposure           min_d_sqrt
            2.967752             3.433961             2.587696
             d_range            chl_a_log popn_density_4throot
            2.643991             3.626107             4.571781
                  fp           protection           geomgeo_bl
            4.059624             3.195210             4.329852
        geomgeo_cbrc        geomgeo_isefr        geomgeo_isprc
           12.657270             9.570015            12.052385
        geomgeo_lefr         geomgeo_oefr
            7.347812            17.090731
# Here I have kept all the 'geom' dummy variables and submitted them to
# forward selection, which selects only 3 of them. It therefore doesn't feel
# as though I am including the factor 'geom' in the model, but rather just
# a few individual dummy variables.
> forward.sel(fish.h,env.dummy3,adjR2thresh=R2a.all_fish_env)
Testing variable 1
Testing variable 2
Testing variable 3
Testing variable 4
Testing variable 5
Testing variable 6
Testing variable 7
Testing variable 8
Procedure stopped (alpha criteria): pvalue for variable 8 is 0.092000 (superior to 0.050000)
        variables order         R2     R2Cum   AdjR2Cum        F  pval
1        exposure     8 0.07086240 0.0708624 0.04925455 3.279475 0.001
2              fp    13 0.04756799 0.1184304 0.07645089 2.266248 0.002
3       geo_isefr    16 0.04571706 0.1641474 0.10298750 2.242500 0.003
4       chl_a_log    11 0.03812686 0.2022743 0.12250174 1.911778 0.009
5          geo_bl    17 0.03423972 0.2365140 0.13863121 1.749016 0.008
6       geo_isprc    21 0.03384005 0.2703541 0.15514682 1.762391 0.012
7 reef_slope_sqrt     6 0.03466752 0.3050216 0.17353919 1.845666 0.007
Any advice appreciated
cheers
Andy
--
Andrew Halford Ph.D
Research Scientist (Kimberley Marine Parks)
Dept. Parks and Wildlife
Western Australia
Ph: +61 8 9219 9795
Mobile: +61 (0) 468 419 473