[R] data manipulation

Marc Schwartz MSchwartz at MedAnalytics.com
Thu Apr 14 03:15:25 CEST 2005


On Wed, 2005-04-13 at 20:56 -0400, Yoko Nakajima wrote:
> Hello,
> my question is about the data handling.
> 
> I have a data set that is lined as:
> 
> 4 1 17 1 1
>  -5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678 -0.5081
> -0.2227
>   0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673 -0.1033
> -0.0796
>  -0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611
> 4 1 17 2 1
>  -5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678 -0.5081
> -0.2227
>   0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673 -0.1033
> -0.0796
>  -0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611
> 
> This means that 29 variables are together as a set. You saw two sets
> of them in example. I have about 1000 sets (of 29 variables) in my
> data. When I "scan" this data set, the result comes with 7 columns and
> it is not possible, so far, to read the table by column wise, and thus
> it is not possible to analyze the data. I would like to know whether
> there is a way to solve this problem, say, by arranging columns or
> increasing the number of columns of data matrix by R.
> 
> Also, I would like to know how you could name each column of the data
> so that you could use the individual column separately.

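You probably do not need to change any default settings in scan(). By
default it treats 'white space', including line breaks, as the field
delimiter, so it reads the whole file as one long vector of numbers no
matter how the lines are wrapped.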

Using your data above, which I saved in a file called 'test.dat' (without
the '>' quote marks). Note the byrow = TRUE: matrix() fills column by
column by default, which would scatter each set of 29 values across the
columns instead of putting one set in each row.

> mat <- matrix(scan("test.dat"), ncol = 29, byrow = TRUE)
Read 58 items

> dim(mat)
[1]  2 29

> colnames(mat) <- paste("Col", 1:29, sep = "")

> mat
     Col1 Col2 Col3 Col4 Col5    Col6    Col7    Col8    Col9  Col10
[1,]    4    1   17    1    1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621
[2,]    4    1   17    2    1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621
     Col11  Col12   Col13   Col14  Col15   Col16   Col17   Col18
[1,] 0.364 0.3678 -0.5081 -0.2227 0.8142 -0.0389 -0.0445 -0.0578
[2,] 0.364 0.3678 -0.5081 -0.2227 0.8142 -0.0389 -0.0445 -0.0578
       Col19   Col20  Col21   Col22   Col23   Col24   Col25   Col26
[1,] -0.1175 -0.1232 0.8673 -0.1033 -0.0796 -0.0341 -0.1716 -0.1801
[2,] -0.1175 -0.1232 0.8673 -0.1033 -0.0796 -0.0341 -0.1716 -0.1801
       Col27  Col28  Col29
[1,] -0.7014 0.6578 0.5611
[2,] -0.7014 0.6578 0.5611

In this case, 'mat' is a matrix with 2 rows and 29 columns.

You can restructure this differently as per your requirements.
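
As one possible restructuring, here is a minimal sketch that turns the
matrix into a data frame so that each column can be used by name. The
column names (Id1 to Id5 for the first five values of each set, V1 to V24
for the rest) are made up for illustration, as is the assumption that the
first five values are identifiers. The temporary file only stands in for
your real data file.

```r
# Recreate the two example sets in a temporary file (stand-in for the real file).
header <- function(i) sprintf("4 1 17 %d 1", i)
values <- c(
  " -5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678 -0.5081 -0.2227",
  "  0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673 -0.1033 -0.0796",
  " -0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611"
)
tmp <- tempfile(fileext = ".dat")
writeLines(c(header(1), values, header(2), values), tmp)

# scan() splits on any white space; byrow = TRUE puts one set of 29 in each row.
mat <- matrix(scan(tmp, quiet = TRUE), ncol = 29, byrow = TRUE)

# Hypothetical names: five identifier columns followed by 24 measurements.
dat <- as.data.frame(mat)
names(dat) <- c(paste("Id", 1:5, sep = ""), paste("V", 1:24, sep = ""))

dat$Id4                        # one column, addressed by name
mean(dat$V1)                   # use a single column in a calculation
summary(dat[, c("V1", "V2")])  # or several columns at once

unlink(tmp)                    # remove the temporary file
```

With the real file, replace 'tmp' with the file name. The same code works
for 1000 sets, since the number of rows follows from the number of values
read.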

HTH,

Marc Schwartz



