[R] reading long matrix

Liaw, Andy andy_liaw at merck.com
Thu Dec 22 20:13:18 CET 2005


Here's one possibility, if you know the number of species and the numbers of
rows and columns beforehand, and the dimensions for all species are the
same.

readSpeciesMap <- function(fname, nspecies, nr, nc) {
    spcnames <- character(nspecies)
    spcdata <- array(0, c(nc, nr, nspecies))
    ## open the file for reading, and close it upon exit.
    f <- file(fname, open="r")
    on.exit(close(f))
    for (i in seq(along=spcnames)) {
        ## read the name
        spcnames[i] <- readLines(f, 1)[[1]]
        ## read the grid
        spcdata[, , i] <- as.numeric(unlist(strsplit(readLines(f, nr), "")))
        ## pick up the empty line
        readLines(f, 1)
    }
    ## replace the 9s with NAs
    spcdata[spcdata == 9] <- NA
    ## set the names as a full list so this also works when nspecies is 1
    dimnames(spcdata) <- list(NULL, NULL, spcnames)
    ## "transpose" the array in each species
    aperm(spcdata, c(2, 1, 3))
}
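
The array is allocated as c(nc, nr, nspecies) and only transposed at the end
because R fills arrays column by column, while the file is laid out row by
row. A minimal sketch of that step, using a made-up 2 x 3 grid (the `lines`
vector below is illustrative and not taken from the data):

```r
## two made-up grid rows, each with three cells
lines <- c("901", "110")
nr <- length(lines)
nc <- nchar(lines[1])

## split each row into single characters; unlist() keeps the file's
## row-by-row order
cells <- as.numeric(unlist(strsplit(lines, "")))

## filling an nc x nr matrix column by column puts each file row in a column
m <- matrix(cells, nrow = nc, ncol = nr)

## transposing recovers the layout of the file: nr rows, nc columns
grid <- t(m)
grid[grid == 9] <- NA
print(grid)
```

In the function, aperm(spcdata, c(2, 1, 3)) does the same transposition for
every species at once.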

Using the example you supplied (saved in the file "species.txt"):

> readSpeciesMap("species.txt", 3, 6, 9)
, ,   SPECIES1

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA    0    0    1    0   NA   NA
[2,]   NA    0    0    1    1    0    1    0   NA
[3,]    0    1    1    1    0    1    0    0    0
[4,]   NA    0    1    1    0    0    1    0    1
[5,]    1    1    0    1    0    0    0    1   NA
[6,]   NA    0    1    1    1    0    0    1   NA

, ,   SPECIES2

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA    0    0    0    0   NA   NA
[2,]   NA    0    0    1    1    0    1    1   NA
[3,]    0    1    1    1    0    1    1    0    0
[4,]   NA    0    1    0    1    0    1    0    1
[5,]    1    1    0    0    0    0    0    1   NA
[6,]   NA    0    0    0    0    0    0    1   NA

, ,   SPECIES3

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA    0    0    1    0   NA   NA
[2,]   NA    0    0    1    0    0    1    0   NA
[3,]    0    1    1    1    0    0    0    1    0
[4,]   NA    0    1    1    0    0    1    0    0
[5,]    1    1    0    1    0    0    0    1   NA
[6,]   NA    0    1    1    1    0    0    1   NA
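
For a much larger file, it may be quicker to read all the lines in one call
and split them in a single step rather than looping over species. The sketch
below is an untimed variant, not a tested replacement: readSpeciesMapFast is
a hypothetical name, it assumes the same layout (a name line, nr grid lines
of nc digits, then a blank line), and unlike the function above it strips the
leading spaces from the species names.

```r
## hypothetical alternative to readSpeciesMap(); same file layout assumed
readSpeciesMapFast <- function(fname, nspecies, nr, nc) {
    lines <- readLines(fname)
    ## drop blank separator lines, leaving (1 + nr) lines per species
    lines <- lines[nzchar(trimws(lines))]
    stopifnot(length(lines) == nspecies * (nr + 1))
    ## name lines sit at positions 1, nr + 2, 2 * nr + 3, ...
    isname <- rep(c(TRUE, rep(FALSE, nr)), nspecies)
    grid <- lines[!isname]
    stopifnot(all(nchar(grid) == nc))
    ## split every grid line at once; values arrive in file (row-major) order
    vals <- as.numeric(unlist(strsplit(grid, "")))
    ## replace the 9s with NAs
    vals[vals == 9] <- NA
    spcdata <- array(vals, c(nc, nr, nspecies))
    ## strip the leading spaces from the names
    dimnames(spcdata) <- list(NULL, NULL, trimws(lines[isname]))
    ## "transpose" the array in each species
    aperm(spcdata, c(2, 1, 3))
}

## try it on the first two species of the example, written to a temp file
tmp <- tempfile(fileext = ".txt")
writeLines(c("  SPECIES1",
             "999001099", "900110109", "011101000",
             "901100101", "110100019", "901110019",
             "",
             "  SPECIES2",
             "999000099", "900110119", "011101100",
             "901010101", "110000019", "900000019"), tmp)
res <- readSpeciesMapFast(tmp, nspecies = 2, nr = 6, nc = 9)
dim(res)
res[3, , "SPECIES1"]
```

The two stopifnot() checks are there so that a file that does not match the
stated dimensions fails early instead of being silently misaligned.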

Andy


From: Colin Beale
> 
> Hi,
> 
> I need some help finding a function to read a large text 
> file into an array in R. The data are essentially presence / 
> absence / na data for many species and come as a grid with 
> each species name (after two spaces) at the beginning of the 
> matrix defining the map for that species. An excerpt could 
> therefore be:
> 
>   SPECIES1
> 999001099
> 900110109
> 011101000
> 901100101
> 110100019
> 901110019
> 
>   SPECIES2
> 999000099
> 900110119
> 011101100
> 901010101
> 110000019
> 900000019
> 
>   SPECIES3
> 999001099
> 900100109
> 011100010
> 901100100
> 110100019
> 901110019
> 
> where 9 is actually NA, 0 is absence and 1 is presence. The 
> final array I want to create should have dimensions that are 
> the x and y coordinates and the number of species (known in 
> advance). (In this example dim = c(9,6,3)). It would be sort 
> of neat if the code could also read the species name into the 
> appropriate names attribute, but this is a refinement that I 
> could probably do if someone can help me read the data into R 
> and into an array in the first place. I'm currently thinking 
> a line by line approach using readLines might be the best 
> option, but I've got a very long file - well over 100 
> species, each a matrix of 70 x 100 data points, making this 
> option rather time consuming, I expect - especially as the 
> next dataset has 1300 species and a much larger grid...
> 
> Any hints would be gratefully received.
> 
> Colin Beale
> Macaulay Land Use Research Institute
> 



