[R] randomForest Species Distribution Modelling

Fionn fionn.farrell at gmail.com
Wed Jun 6 14:18:52 CEST 2012


Hi,
I apologise if this is a rudimentary and long-winded question, but I just
wanted to let ye know where I'm coming from. I'm new to R and I'm trying to
use the 'randomForest' package to classify and predict. The error message
that is troubling me is:

> pr<-predict(predictors,rf1, ext=ext)
Error in x[...] <- m : NAs are not allowed in subscripted assignments
In addition: Warning message:
'newdata' had 153595 rows but variable(s) found have 109 rows 


My steps are outlined below, which will hopefully give you insight into
where I'm going horribly wrong.

  
Step 1
I've sampled the environmental raster layers in ArcGIS  giving me a csv file
as follows.

> Samples <- read.csv("F:/R/Rst_points_10.csv", header=TRUE, sep=",")
> head(Samples)
  POINTID GRID_CODE        X       Y    Slope   Aspect Curvature Rugosity Plan_Curv Prof_Curv BS_BPI BS_BPI_S FS_BPI
1       1        74 420420.1 5572854 6.379370 116.5650         5 1.014847      2.80     -2.20      3      118      2
2       2        96 420460.1 5572834 5.051153 135.0000         0 1.007454      0.25      0.25     -1      -68      0
3       3        75 420510.1 5572834 0.000000  -1.0000         0 1.000000      0.00      0.00     -1      -68      0
4       4        76 420610.1 5572804 5.885129 194.0362        -4 1.012384     -2.00      2.00      3      118      0
5       5        97 429970.1 5572024 1.432096 270.0000        -3 1.004987     -2.00      1.00     -1      -68      0
6       6        98 429960.1 5571904 1.012750 315.0000         0 1.001247      0.00      0.00      0      -21      0
  FS_BPI_S Bathy GROUP G1 G2 G3 G4 G5 G6 G7 G8 G9
1      441   -19     8  0  0  0  0  0  0  0  1  0
2      -27   -24     9  0  0  0  0  0  0  0  0  1
3      -27   -24     8  0  0  0  0  0  0  0  1  0
4      -27   -19     8  0  0  0  0  0  0  0  1  0
5      -27   -18     9  0  0  0  0  0  0  0  0  1
6      -27   -18     9  0  0  0  0  0  0  0  0  1
> attach(Samples)

Step 2
I then loaded the environmental raster layers and stacked them.

> files <- list.files("C:/Users/GIS-Modeller/Documents/10m/ASCII", pattern='asc', full.names=TRUE)
> predictors <- stack(files)
> predictors
class       : RasterStack 
dimensions  : 1745, 3909, 6821205, 10  (nrow, ncol, ncell, nlayers)
resolution  : 10, 10  (x, y)
extent      : 417085.1, 456175.1, 5556329, 5573779  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
min values  :       NA -2.1e+09 -2.1e+09 -2.1e+09 -2.1e+09 -2.1e+09 -2.1e+09       NA       NA       NA 
max values  :      NA 2.1e+09 2.1e+09 2.1e+09 2.1e+09 2.1e+09 2.1e+09      NA      NA      NA 

Step 3
I then provided the projection. 
> projection(predictors) <- "+proj=utm +zone=30 +ellps=WGS84 +datum=WGS84 +units=m +no_defs"

Step 4
I've tried numerous ways to get rid of/replace the NA values:
#na.action<-
#predictors<-predictors[na.rm=FALSE]
#99999->predictors[predictors==NA, ]
#predictors<-predictors[predictors, na.action=na.omit ]
#na.exclude->predictors=NA
#na.omit(predictors)

and multiple combinations of these.

The attempt (#99999->predictors[predictors==NA, ]) returned the min and max
values I would expect for 'predictors' if the NA values were left out of
account (although 99999 itself appeared as neither a max nor a min value):
> predictors
class       : RasterBrick 
dimensions  : 1745, 3909, 6821205, 10  (nrow, ncol, ncell, nlayers)
resolution  : 10, 10  (x, y)
extent      : 417085.1, 456175.1, 5556329, 5573779  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=utm +zone=30 +ellps=WGS84 +datum=WGS84 +units=m +no_defs +towgs84=0,0,0 
values      : in memory
min values  :    -1   -59   -10  -487   -26    -5 -1199   -14   -16     0 
max values  :  358    0   19  863   32   11 2551   16   14   34 
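
A note on the NA test used in Step 4: in R, comparing anything with NA using == gives NA rather than TRUE or FALSE, so a mask built from `predictors==NA` never selects the missing cells. The usual test is is.na(). The sketch below uses a small made-up vector (`cells`, hypothetical data standing in for raster cell values) and base R only. Whether the same mask idiom applies unchanged to a RasterStack is an assumption that has not been checked here.

```r
# Toy vector standing in for raster cell values (hypothetical data).
cells <- c(-1, NA, 358, NA, 12)

# Comparing with == never matches NA: every comparison itself returns NA.
eq_test <- cells == NA

# is.na() returns a proper TRUE/FALSE mask of the missing entries.
na_mask <- is.na(cells)

# Replace the missing entries using the mask (99999 as in Step 4).
filled <- cells
filled[na_mask] <- 99999

print(eq_test)
print(na_mask)
print(filled)
```

Filling NA cells with a sentinel such as 99999 also means the model would later be asked to predict from that value, so leaving the cells as NA may be the safer choice.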

Step 5

Create the model formula and the random forest (rf).
> model <- factor(G1) ~ Slope + Aspect + Curvature + Rugosity + Plan_Curv + Prof_Curv + BS_BPI + BS_BPI_S + FS_BPI
> rf <- randomForest(model)
> rf

Call:
 randomForest(formula = model) 
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 3

        OOB estimate of  error rate: 8.26%
Confusion matrix:
   0  1 class.error
0 88  6  0.06382979
1  3 12  0.20000000

Step 6
Begin prediction

> ext = extent(417085.1, 456175.1, 5556329, 5573779)
> pr <- predict(predictors, rf1, ext=ext)
Error in x[...] <- m : NAs are not allowed in subscripted assignments
In addition: Warning message:
'newdata' had 153595 rows but variable(s) found have 109 rows 
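
One observation on the warning: the confusion matrix in Step 5 sums to 109 cases (88 + 6 + 3 + 12), the same as the 109 rows mentioned in the warning. That suggests (an assumption, not verified against this data) that predict() did not find the predictors among the layers of the stack and picked up the attached Samples columns instead. A possible check is to compare the predictor names in the formula with the layer names. The sketch below is base R only; `layer_names` is a hypothetical set of lower-case names standing in for whatever names(predictors) returns in the real session.

```r
# Formula from Step 5 (creating a formula does not evaluate the variables).
model <- factor(G1) ~ Slope + Aspect + Curvature + Rugosity + Plan_Curv +
  Prof_Curv + BS_BPI + BS_BPI_S + FS_BPI

# Predictor names used on the right-hand side of the formula.
needed <- all.vars(model[[3]])

# Hypothetical layer names; in the real session use names(predictors).
layer_names <- c("aspect", "bs_bpi", "bs_bpi_s", "curvature", "fs_bpi",
                 "fs_bpi_s", "plan_curv", "prof_curv", "rugosity", "slope")

# Predictors the model expects but the stack does not offer under that name.
# The comparison is case-sensitive, so "Slope" and "slope" do not match.
missing_vars <- setdiff(needed, layer_names)
print(missing_vars)
```

If any names turn out to be missing, renaming the layers to match the formula (or fitting the forest on a data frame passed through the data argument, with column names equal to the layer names) may be worth trying. Note also that Step 5 stores the forest in `rf`, while the call above passes `rf1`.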




I thank those that have read this. All help is extremely appreciated.
Cheers
Fionn



