[R-sig-Geo] Handling many large ASCII files and make predictions based on valuelocations

Tobias Reiners Tobias.Reiners at bio.uni-giessen.de
Wed Nov 10 19:28:35 CET 2010


Hi,

I'm working on Germany-wide species distribution models.
I'm quite familiar with R, but I'm now running out of ideas.
Since we mostly work with ArcGIS, large raster files have not been a problem for us so far.

But since we use Boosted Regression Trees, I cannot get my suitability
values with simple map algebra. We have to make predictions based on a
data.frame.
In the beginning we used the code from Elith et al.:
http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2656.2008.01390.x/full
Appendix 3 there explains how to handle larger maps.
The code is quite long. It is based mainly on scan(): it reads the map
values in tiles, assign()s each to an object, and builds a data.frame for
prediction.
After creating the header of the ASCII file, you repeat the procedure and
append the results loop by loop (script at the end of this mail).
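
A minimal sketch of the tile-wise reading described above, assuming the grids are ESRI ASCII files with a 6-line header. It keeps one open connection per grid, so each scan() call continues where the previous one stopped. By contrast, passing a file name with a growing skip= makes scan() read through all earlier rows again for every part. The names read_blocks(), handle_block and write_demo_grid() are hypothetical and not part of the code from Elith et al.; the two tiny demo grids are made up for illustration.

```r
# Read several ASCII grids block by block through open connections.
# read_blocks() is a hypothetical helper, not part of any package.
read_blocks <- function(files, var_names, rows_per_block, handle_block,
                        n_header = 6, na_value = "-9999", dec = ".") {
  cons <- lapply(files, file, open = "r")
  on.exit(lapply(cons, close))
  # Skip each header once; later reads continue from the current position.
  for (con in cons) readLines(con, n = n_header)
  part <- 0
  repeat {
    vals <- lapply(cons, function(con)
      scan(con, nlines = rows_per_block, na.strings = na_value,
           dec = dec, quiet = TRUE))
    if (length(vals[[1]]) == 0) break   # end of file reached
    part <- part + 1
    names(vals) <- var_names
    handle_block(as.data.frame(vals), part)   # e.g. predict and append
  }
  invisible(part)
}

# Demo with two made-up grids of 4 rows x 3 columns.
write_demo_grid <- function(path, values) {
  header <- c("ncols 3", "nrows 4", "xllcorner 0", "yllcorner 0",
              "cellsize 25", "NODATA_value -9999")
  rows <- apply(matrix(values, nrow = 4, byrow = TRUE), 1, paste,
                collapse = " ")
  writeLines(c(header, rows), path)
}
f1 <- tempfile(fileext = ".asc")
f2 <- tempfile(fileext = ".asc")
write_demo_grid(f1, 1:12)
write_demo_grid(f2, c(-9999, 2:12))

blocks <- list()
n_parts <- read_blocks(c(f1, f2), c("p1", "p3"), rows_per_block = 2,
                       handle_block = function(df, part) blocks[[part]] <<- df)
```

For files written with decimal commas, as in the script at the end of this mail, dec = "," would be passed instead. In real use, handle_block would build the prediction data.frame for the block and append the predictions to the output grid.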

Some useful information:
I have 22 predictors as maps saved as ASCII files.
Each has a resolution of 25 m x 25 m for the whole of Germany.
This makes 22 files with 840,618,425 values each.
It will take weeks to process that.
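
As a rough back-of-the-envelope check of these numbers, the following sketch estimates the memory needed if the values were held in R as 8-byte doubles. That storage type is an assumption here. The block size of 90 rows by 25175 columns is taken from the script at the end of this mail.

```r
n_values   <- 840618425        # values per grid, as stated above
n_grids    <- 22
bytes_each <- 8                # R stores numeric values as 8-byte doubles

gib_per_grid <- n_values * bytes_each / 1024^3
gib_all      <- gib_per_grid * n_grids

# One block of 90 rows x 25175 columns across all 22 grids
block_cells <- 90 * 25175
block_mib   <- block_cells * n_grids * bytes_each / 1024^2

cat(sprintf("one grid: %.1f GiB, all grids: %.1f GiB, one 90-row block: %.0f MiB\n",
            gib_per_grid, gib_all, block_mib))
```

Under these assumptions a single grid needs roughly 6.3 GiB and all 22 grids together about 138 GiB, which is why the data have to be processed in blocks.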

What is the best way to handle my grids and get fast access to all values?

I hope somebody had similar experiences and I can hopefully tell
my boss next week that I have a solution.

Thanks in advance

N <- 371
rep.rows <- rep(90, N)   # N parts of 90 rows each

# Part 1: read the first rep.rows[1] rows of every grid into its own object
for (i in seq_along(grid.names)) {
    cat("starting loop with Dataset", i, grid.names[i], "\n")
    assign(variable.names[i],
           scan(file = grid.names[i], dec = ",", skip = 6,
                na.strings = "-9999", nlines = rep.rows[1]),
           pos = 1)
}

preddat1 <- data.frame(land_de, iji, p1, p3, p4, p5, p6, p7, p8, p9, p10, p11,
                       p21, p22, dist_wa, dist_ge, shdi, dgm_d, prec_a1,
                       tmit_a1, X)

# Use the factor levels of the model data; keep only cells inside the mask
preddat1$land_de <- factor(preddat1$land_de, levels = levels(model.data1$land_de))
preddat <- preddat1[!is.na(mask), ]

gbm.predict.grids(hamster.tc5.lr005.simp, preddat, want.grids = TRUE,
                  sp.name = "Vogel_pred",
                  pred.vec = rep(-9999, 25175 * rep.rows[1]),
                  filepath = "F:/out/", num.col = 25175, num.row = 33390,
                  xll = 5234506, yll = 3277362, cell.size = 25,
                  no.data = -9999, plot = TRUE, full.grid = FALSE,
                  part.number = 1, part.row = rep.rows[1], preds2R = FALSE)

N <- 371
rep.rows <- rep(90, N)   # N parts of 90 rows each

# Part 1: read the first rep.rows[1] rows of every grid into its own object
for (i in seq_along(grid.names)) {
    cat("starting loop with Dataset", i, grid.names[i], "\n")
    assign(variable.names[i],
           scan(file = paste("F:/Bird_Climate/", grid.names[i], sep = ""),
                dec = ",", skip = 6, na.strings = "-9999",
                nlines = rep.rows[1]),
           pos = 1)
}

preddat1 <- data.frame(land_de, iji, p1, p3, p4, p5, p6, p7, p8, p9, p10, p11,
                       p21, p22, dist_wa, dist_ge, shdi, dgm_d, prec_a1,
                       tmit_a1, X)

# Use the factor levels of the model data; keep only cells inside the mask
preddat1$land_de <- factor(preddat1$land_de, levels = levels(model.data1$land_de))
preddat <- preddat1[!is.na(mask), ]

# The first part also writes the header of the output grid
gbm.predict.grids(hamster.tc5.lr005.simp, preddat, want.grids = TRUE,
                  sp.name = "Vogel_pred2",
                  pred.vec = rep(-9999, 25175 * rep.rows[1]),
                  filepath = "F:/out/", num.col = 25175, num.row = 33390,
                  xll = 5234506, yll = 3277362, cell.size = 25,
                  no.data = -9999, plot = TRUE, full.grid = FALSE,
                  part.number = 1, part.row = rep.rows[1], preds2R = FALSE)

# Parts 2..N: skip the header and all rows already processed, read the next
# rep.rows[i] rows of every grid, predict, and append to the output grid
for (i in 2:N) {
    cat("starting loop with part", i, "Processed: ", round((i / N) * 100), "%  \n")
    for (j in seq_along(grid.names)) {
        cat("Processed: ",
            round((i / N) * 100) + round(j / length(grid.names) / 10, 2),
            grid.names[j], "%  \n")
        assign(variable.names[j],
               scan(file = paste("F:/Bird_Climate/", grid.names[j], sep = ""),
                    dec = ",", skip = 6 + sum(rep.rows[1:(i - 1)]),
                    quiet = TRUE, na.strings = "-9999", nlines = rep.rows[i]),
               pos = 1)
    }
    preddat1 <- data.frame(land_de, iji, p1, p3, p4, p5, p6, p7, p8, p9, p10,
                           p11, p21, p22, dist_wa, dist_ge, shdi, dgm_d,
                           prec_a1, tmit_a1, X)

    preddat1$land_de <- factor(preddat1$land_de, levels = levels(model.data1$land_de))
    preddat <- preddat1[!is.na(mask), ]
    gbm.predict.grids(hamster.tc5.lr005.simp, preddat, want.grids = TRUE,
                      sp.name = "Vogel_pred2",
                      pred.vec = rep(-9999, 25175 * rep.rows[i]),
                      filepath = "F:/out/", num.col = 25175,
                      full.grid = FALSE, part.number = i,
                      part.row = rep.rows[i], header = FALSE, preds2R = FALSE)
}

-- 
Tobias Erik Reiners
Mammalian Ecology Group

Justus-Liebig-University
IFZ - Department of Animal Ecology
Heinrich-Buff-Ring 26
D-35392 Giessen
Germany

Phone:        +49 (0) 641 / 99 - 35761
Fax.:         +49 (0) 641 / 99 - 35709

www.uni-giessen.de/cms/mammalian-ecology


