[R] Data frame with 3 columns to matrix

David Winsemius dwinsemius at comcast.net
Tue Apr 19 15:32:18 CEST 2011


On Apr 19, 2011, at 8:16 AM, Michael Bach wrote:

> David Winsemius <dwinsemius at comcast.net> writes:
>
>> Perhaps but only if the third row of your example was incorrectly  
>> constructed:
>>> dta <- rd.txt("   x y   z
>> 1 1.00 5 0.5
>> 2 1.02 5 0.7
>> 3 1.04 7 0.1
>> 4 1.06 9 0.4")
>> #rd.txt() is a combo fn of read.table and textConnection
>>
>>> mat <- matrix(NA, ncol=NROW(dta)+1, nrow=NROW(dta)+1)
>>> mat[2:NROW(mat),1] <- dta[["x"]]
>>> mat[1,2:NROW(mat)] <- dta[["y"]]
>>> diag(mat) <- c(NA, dta[["z"]])
>>> mat
>>     [,1] [,2] [,3] [,4] [,5]
>> [1,]   NA  5.0  5.0  7.0  9.0
>> [2,] 1.00  0.5   NA   NA   NA
>> [3,] 1.02   NA  0.7   NA   NA
>> [4,] 1.04   NA   NA  0.1   NA
>> [5,] 1.06   NA   NA   NA  0.4
>>
>>
>
> Thanks for your answer David,
>
> but this yields a diagonal matrix only.  I think I did not make myself
> clear enough.  In the original 3 column data frame, there could have
> been a pair of x and y with identical y's but different x's and z's.
> The way my data source is derived, there is a guarantee that there
> are no two rows with identical x and y in the original data frame.  In
> the end, x and y serve as a grid, with z values at each point in the
> grid or NA's if there is no z value for a x and y pair.  The number of
> rows in the data frame is then equal to the number of non-NA values in
> the resulting matrix.
>
> Another try, let's assume this original data frame:
>
>  x  y z
> 1 2  5 1
> 2 2  6 1
> 3 3  7 1
> 4 3  8 1
> 5 3  9 1
> 6 5 10 2
> 7 5 11 2
> 8 5 12 2
>
> Then I would like to get
>
>     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
> [1,]   NA    5    6    7    8    9   10   11   12
> [2,]    2    1    1
> [3,]    2
> [4,]    3              1    1    1
> [5,]    3
> [6,]    3
> [7,]    5                             2    2    2
> [8,]    5
> [9,]    5
>
> I left out all the NA's, except the first, where there is no z value,
> say e.g. x=5 and y=8.
>
> Do you see what I mean?

I do, ... now anyway. Your earlier data example had non-integer x and
y values, which made what I will now offer infeasible (or at the very
least ambiguous): indexing with decimal numbers does not provoke an
error; the index is silently truncated to an integer instead.  With
integer-valued x and y you can use a two-column matrix as an argument
to "[":

 > mat <- matrix(NA, nrow=max(dta[[1]])+1, ncol=max(dta[[2]])+1 )
 > mat[data.matrix(dta[,1:2])] <- dta[,3]
 > mat
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
[2,]   NA   NA   NA   NA    1    1   NA   NA   NA    NA    NA    NA
[3,]   NA   NA   NA   NA   NA   NA    1    1    1    NA    NA    NA
[4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
[5,]   NA   NA   NA   NA   NA   NA   NA   NA   NA     2     2     2
[6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
      [,13]
[1,]    NA
[2,]    NA
[3,]    NA
[4,]    NA
[5,]    NA
[6,]    NA
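
The two behaviours relied on here can be checked on a toy case. This is
only a sketch: the objects v, m, idx and picked are made up for
illustration and are not part of the data in this thread.

```r
# Toy objects, not taken from the thread
v <- c(10, 20, 30)
v[2.9]                          # no error: the index is truncated to 2

m <- matrix(1:6, nrow = 2)      # 2 x 3 matrix, filled column-wise
idx <- cbind(c(1, 2), c(3, 1))  # each row of idx is one (row, column) pair
picked <- m[idx]                # extracts m[1, 3] and m[2, 1]
m[idx] <- NA                    # assignment goes through the same index matrix
```

Each row of the index matrix addresses exactly one cell, which is why a
data frame holding one (x, y) pair per row maps onto it so directly.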

I leave the insertion of the first row and column, and the removal of
the extra rows and columns induced by the mismatch between the values
and the row and column numbers, to you, since .....
 > mat[, 4:12]
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
[2,]   NA    1    1   NA   NA   NA   NA   NA   NA
[3,]   NA   NA   NA    1    1    1   NA   NA   NA
[4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
[5,]   NA   NA   NA   NA   NA   NA    2    2    2
[6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
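
One possible way to avoid the padding altogether is to index by position
rather than by value, using match() against the sorted unique x and y
values. This is a different technique from the direct value indexing
above, offered only as a minimal sketch: the names xs, ys, grid and out
are hypothetical, and dta is retyped here from the second example in the
quoted message. It should also cope with non-integer x and y, since the
values themselves are never used as indices.

```r
# Data retyped from the second example in the quoted message
dta <- data.frame(x = c(2, 2, 3, 3, 3, 5, 5, 5),
                  y = 5:12,
                  z = c(1, 1, 1, 1, 1, 2, 2, 2))

xs <- sort(unique(dta$x))   # distinct x values, one per row of the grid
ys <- sort(unique(dta$y))   # distinct y values, one per column of the grid

# Empty grid labelled by the x and y values
grid <- matrix(NA_real_, nrow = length(xs), ncol = length(ys),
               dimnames = list(x = as.character(xs), y = as.character(ys)))

# Two-column index matrix of positions (not values), one row per data row
grid[cbind(match(dta$x, xs), match(dta$y, ys))] <- dta$z

# Optional: put y values in the first row and x values in the first column
out <- unname(rbind(c(NA, ys), cbind(xs, grid)))
```

Here grid has one row per distinct x and one column per distinct y, so
the number of non-NA cells equals the number of rows in dta, as long as
no (x, y) pair is repeated.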

-- 

David Winsemius, MD
West Hartford, CT


