[R] Reading an upper triangular matrix
Thomas W Blackwell
tblackw at umich.edu
Tue Nov 11 05:11:04 CET 2003
Kjetil -
Frankly, your file would be much, much easier to read
if it didn't have a row name at the beginning of each
line. Any chance you can edit it to remove those?
Then, I think you could read in the numeric data with
just one call to scan:
mat <- matrix(0, 21, 21)
mat[row(mat) >= col(mat)] <- scan("filename", skip=1)
Paises <- t(mat)
(Note that the upper-tri matrix gets transposed, in
effect, when I read it in row-wise. So I transpose
it back to upper-tri form in assigning it to 'Paises'.)
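To make the transposition concrete, here is a minimal sketch on a hypothetical 3 x 3 upper triangle. The values are made up for illustration, and they are supplied through scan's text argument rather than a file so the example is self-contained.

```r
# Hypothetical 3 x 3 upper triangle, written row-wise as in the data file:
#   1 2 3
#     4 5
#       6
vals <- scan(text = "1 2 3
4 5
6", quiet = TRUE)

mat <- matrix(0, 3, 3)
# Logical indexing fills the selected cells in column-major order, so the
# first file row (1 2 3) lands in the first column of the lower triangle.
mat[row(mat) >= col(mat)] <- vals

# Transposing puts the values back into upper-triangular form.
upper <- t(mat)
print(upper)
```

After the assignment mat is lower triangular; only the transpose has the layout of the file.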
Last, you will need to read in the column names and
assign these to both dimnames of Paises, but it's
clear that you already know how to do that.
Seems to me that you're going through a great deal of
unnecessary gyrations by trying to use a "connection"
rather than just pass the literal filename to scan().
I've never understood why people do that.
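As for the strings in the quoted output below: as far as I can tell, when 'what' is a list, scan() treats each line as a sequence of records with length(what) fields and uses only the type of each component, not its length. So list("a", rep(0, 22 - i)) describes two alternating fields, character then numeric, which matches the "Read 11 records" message for a line of 22 items. A small sketch on a made-up, shortened line (the numbers are hypothetical):

```r
# One line shaped like a data row: a name followed by numbers.
line <- "Bolivia 0 1 2 2 3"

# The list still describes only two fields per record (character, numeric);
# the length of rep(0, 5) is ignored.
temp <- scan(text = line, what = list("a", rep(0, 5)), quiet = TRUE)
# temp[[1]] collects items 1, 3, 5 as strings,
# temp[[2]] collects items 2, 4, 6 as numbers.

# One alternative: read the whole line as strings, then split it.
fields <- scan(text = line, what = "", quiet = TRUE)
rowname <- fields[1]
values <- as.numeric(fields[-1])
```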
- tom blackwell - u michigan medical school - ann arbor -
On Mon, 10 Nov 2003 kjetil at entelnet.bo wrote:
> Hola!
>
> I have data in the form of a symmetric distance matrix; in the file I
> have recorded only the upper triangular part, with diagonal. The
> matrix is 21x21, and the file has row and col names, and some other
> information. I am trying to read it with the following code (I tried
> many variations on it, but all give the same error). The items in the
> data file are delimited by white space.
>
> (Part of) script to read:
>
> myfile <- file("Paises.dat", open="r")
> # opens a connection which stays open until closed by close(myfile)
> name <- readLines(con=myfile, n=1)
> varnames <- scan( myfile, what=character(0), nlines=1 )
>
> stopifnot( length(varnames) == 21 )
> Paises <- matrix(0, 21, 21)
> colnames(Paises) <- varnames
> rownames(Paises) <- varnames
> for (i in 1:21) {
> temp <- scan(myfile, what=list("a", rep(0,22-i) ), nlines=1,
> sep="")
> Paises[i, i:21] <- temp[[2]]
> }
>
> I get the following result:
>
> > source("Paises.R", echo=TRUE)
>
> > myfile <- file("Paises.dat", open = "r")
>
> > name <- readLines(con = myfile, n = 1)
>
> > varnames <- scan(myfile, what = character(0), nlines = 1)
> Read 21 items
>
> > stopifnot(length(varnames) == 21)
>
> > Paises <- matrix(0, 21, 21)
>
> > colnames(Paises) <- varnames
>
> > rownames(Paises) <- varnames
>
> > for (i in 1:21) {
> temp <- scan(myfile, what = list("a", rep(0, 22 - i)), nlines = 1,
> sep = "")
> Paises[i, i:21] <- temp[[2]]
> }
> Read 11 records
> Error in "[<-"(`*tmp*`, i, i:21, value = temp[[2]]) :
> number of items to replace is not a multiple of replacement length
> > i
> [1] 1
> > temp
> [[1]]
> [1] "Bolivia" "1" "2" "3" "2" "4" "4"
>
> [8] "6" "6" "8" "8"
>
> [[2]]
> [1] 0 2 3 2 3 5 5 6 6 7 8
>
> >
>
> While I am asking only for one character string, multiple items are
> read as strings! What is happening?
>
> Kjetil Halvorsen