[R] Help with parsing a data file

Peter Alspach PAlspach at hortresearch.co.nz
Thu Mar 6 20:41:19 CET 2008


Sean

There are many ways of doing this.  I assume you have read the data into
R as a data.frame (oldDF below) with 24 columns and
2+1+13+(1+13)*n = 16+14*n rows, where n is the number of years, and that
you want a data.frame with 25 columns (one extra for year) and 13*n rows
(although I am not sure why there are 13 months), with the columns named
appropriately.
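
For reference, a minimal sketch of one way oldDF might be read in so that
it matches the assumption above.  The file here is a small synthetic
stand-in written to a temporary file (made-up values and placeholder
column names, not the real data); replace tmp with the path to the real
file.  read.table() with fill=TRUE pads the short header and year lines
with blank fields, and colClasses="character" keeps the flag columns
(K5, I6, ...) and the numbers as text for now.

```r
# Synthetic file with the same layout as the example (values are made up).
cols  <- c("MO", paste0("X", 2:24))            # placeholder column names
block <- function() sapply(1:13, function(m)
  paste(c(m, rep(0, 23)), collapse = " "))     # 13 lines of 24 fields each
lines <- c("725280 STATION HEADER", " 1991-2005",
           paste(cols, collapse = " "),
           block(),                            # the 13 lines to skip
           " 1991", block(), " 1992", block())
tmp <- tempfile()
writeLines(lines, tmp)

# Force 24 columns; short lines are padded with blank fields.
oldDF <- read.table(tmp, header = FALSE, fill = TRUE,
                    col.names = paste0("V", 1:24),
                    colClasses = "character")
dim(oldDF)
```

With n = 2 years this should give 16 + 14*2 rows, with the year lines in
rows 17 and 31.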

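# Rows 1-16 of oldDF are the 2 header lines, the column-name line and the
# 13 lines to skip; each year then takes 14 rows (1 year line + 13 data lines).
nYears <- (nrow(oldDF) - 16) / 14
# First create your new data.frame (13 rows per year, 25 columns)
newDF <- as.data.frame(matrix(NA, 13 * nYears, 25,
                       dimnames=list(NULL, c('year', as.character(unlist(oldDF[3,]))))))
# Now fill the year column (column 1); the year lines are rows 17, 31, ...
newDF[,1] <- rep(oldDF[seq(17, nrow(oldDF), 14), 1], each=13)
# And finally deal with the data, dropping the year line from each block of 14
newDF[,-1] <- oldDF[-(1:16),][c(FALSE, rep(TRUE, 13)),]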

The above is untested and not guaranteed to work, but hopefully is
enough to get you going.  If not, you could get back to me privately.
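
If a list indexed by year is preferred, as asked for in the original
message, an alternative sketch works from the raw lines rather than from
oldDF.  It assumes the same layout (16 leading lines, then a year line
followed by 13 data lines); txt is a synthetic stand-in for readLines()
on the real file, with made-up values and placeholder column names.

```r
# Synthetic stand-in for readLines("yourfile.txt"); values are made up.
cols  <- c("MO", paste0("X", 2:24))            # placeholder column names
block <- function(offset) sapply(1:13, function(m)
  paste(c(m, offset + 1:23), collapse = " "))  # 13 lines of 24 fields each
txt <- c("725280 STATION HEADER", " 1991-2005",
         paste(cols, collapse = " "),
         block(0),                             # the 13 lines to skip
         " 1991", block(100), " 1992", block(200))

colNames <- scan(text = txt[3], what = "", quiet = TRUE)
body     <- txt[-(1:16)]                       # drop the 16 leading lines
# Within each group of 14 lines, the first is the year line.
isYear   <- rep(c(TRUE, rep(FALSE, 13)), length.out = length(body))
years    <- trimws(body[isYear])
dat      <- read.table(text = body[!isYear], col.names = colNames,
                       check.names = FALSE, stringsAsFactors = FALSE)
# One data.frame of 13 rows per year, named by year.
byYear   <- split(dat, rep(years, each = 13))
names(byYear)
```

Each element, e.g. byYear[["1992"]], is then a 13-row data.frame carrying
the column names from the third line of the file.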

Peter Alspach

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of sean
> Sent: Friday, 7 March 2008 8:06 a.m.
> To: r-help at r-project.org
> Subject: [R] Help with parsing a data file
> 
> Hi All,
> 
> I need to parse data from a file; an example is shown below.  The
> first two lines can be skipped, and the third line contains the
> column names.  The next 13 lines can be skipped.  The next line,
> "1991", is a year value, followed by 13 lines of data for that year.
> The file then repeats this format (a year line, then 13 lines of
> data for that year).  I would ideally like to end up with an
> array/list/vector of the blocks of 13 lines, indexed by year, each
> block using the column names given on the third line.
> 
> If anyone has any good ideas on how to do this in R, please let me know.
> 
> Thanks,
> Sean
> 
> --------------------------------------------------------------
> 725280 BUFFALO NIAGARA INTL A NY  -5  N42 56  W078 44   215   988
>  1991-2005
>  MO AVGLO FL SDGLO AVDIR FL SDDIR AVDIF FL SDDIF AVETR AETRN  TOT  OPQ  H2O   TAU  MAX_T  MIN_T  AVG_T  AVGDT  RH  HTDD  CLDD AVWS
>   1  1336 K5   222  1534 K7   676   837 K5    72  3806 13256  8.4  8.1  0.83  0.09  -0.52  -7.40  -3.86  -3.36  75   691     0  5.5
>   2  2261 K5   400  2691 K7  1026  1129 K5    74  5330 14714  7.6  7.1  0.79  0.10   0.97  -6.67  -2.76  -1.85  73   599     0  5.1
>   3  3249 K5   413  3207 K7   852  1578 K5   118  7428 16443  7.2  6.7  0.98  0.12   4.96  -3.21   0.93   2.04  71   541     0  4.9
>   4  4460 K5   570  4051 K6  1045  1951 K5   130  9509 18140  6.6  6.1  1.33  0.13  12.18   2.68   7.39   8.54  67   328     1  4.7
>   5  5484 K5   518  4529 K6   801  2408 K5   142 10999 19523  6.1  5.4  1.87  0.15  18.77   8.68  13.83  15.07  69   154    12  4.5
>   6  6046 K5   383  5011 K6   671  2567 K5   166 11616 20177  5.7  4.9  2.64  0.16  24.05  14.52  19.47  20.66  71    34    63  4.1
>   7  5793 K5   529  4734 K6   884  2537 K5   127 11250 19734  5.6  4.9  2.97  0.16  26.10  16.92  21.70  22.90  71     5   104  4.1
>   8  5057 K5   417  4390 K6   693  2245 K5    94  9974 18430  5.6  4.9  2.92  0.15  25.61  16.37  21.10  22.56  73     9    91  3.6
>   9  4001 K5   458  3864 K6   826  1797 K5   105  8078 16803  5.6  5.0  2.36  0.13  21.73  12.20  17.06  18.68  73    71    30  3.9
>  10  2502 K5   254  2564 K7   584  1306 K5    88  5948 15098  6.3  5.8  1.67  0.11  14.89   6.40  10.71  12.16  72   241     3  4.3
>  11  1395 K5   198  1394 K7   492   887 K5    47  4170 13545  7.9  7.5  1.30  0.10   8.37   1.42   4.91   5.82  73   403     0  5.0
>  12  1120 K5   173  1391 K7   475   701 K5    52  3351 12733  7.9  7.7  0.94  0.09   2.41  -3.92  -0.70  -0.03  75   592     0  5.0
>  13  3559 K5   201  3280 K7   383  1662 K5    50  7622 16550  6.7  6.2  1.72  0.12  13.29   4.83   9.15  10.27  72  3668   304  4.5
>  1991
>   1  1313 I5   637  1374 I6  1636   832 I5   169  3800 13249  8.2  7.8  0.75  0.07  -0.09  -6.67  -3.46  -2.94  73   673     0  5.9
>   2  1875 I5   887  1767 I6  2080  1137 I5   263  5310 14694  8.3  7.6  0.85  0.08   2.44  -3.84  -0.61   0.15  73   533     0  5.9
>   3  3205 I5  1520  3371 I6  3133  1458 I5   392  7395 16417  6.7  6.1  1.12  0.10   7.23  -1.17   2.75   3.75  70   474     0  5.3
>   4  3999 I5  1911  3451 I6  3501  1918 I5   521  9482 18116  6.9  5.9  1.60  0.12  14.46   5.65   9.91  11.04  68   250     2  5.4
>   5  5968 I5  1854  5369 I6  2936  2296 I5   437 10983 19506  6.1  4.5  2.46  0.14  23.15  12.56  17.85  19.12  68    81    66  4.8
>   6  6988 I5  1577  6761 I6  2983  2288 I5   604 11614 20176  4.8  3.0  2.42  0.15  26.09  14.95  20.80  22.28  64    14    80  4.3
>   7  6364 I5  1538  5779 I6  2799  2404 I5   568 11262 19749  5.0  3.7  2.89  0.16  27.17  16.96  22.43  23.77  66     1   116  4.4
>   8  5407 I5  1478  5114 I6  2693  2106 I5   527  9999 18451  4.8  4.0  2.91  0.18  26.64  16.49  21.44  23.08  73     2   102  4.2
>   9  4482 I5  1010  4126 I6  1953  2033 I5   415  8109 16830  5.8  4.6  2.24  0.19  22.05  10.98  16.67  18.53  66    97    42  4.3
>  10  2534 I5   864  2419 I6  1859  1396 I5   289  5978 15123  6.1  5.3  1.83  0.20  16.44   6.92  11.72  13.21  72   213     7  4.4
>  11  1264 I5   716  1059 I6  1733   851 I5   206  4190 13565  8.3  8.0  1.33  0.21   7.63   0.44   3.94   4.82  77   429     0  5.1
>  12   976 I5   423   826 I6  1172   714 I5   156  3354 12738  7.6  7.2  0.98  0.22   3.40  -4.20  -0.34   0.21  78   581     0  5.6
>  13  3698 I5  2146  3451 I6  2002  1619 I5   629  7623 16551  6.6  5.6  1.78  0.15  14.72   5.76  10.26  11.42  71  3347   415  5.0
>  1992
>   1  1149 I5   496   701 I6   919   896 I5   231  3791 13236  8.5  8.1  0.84  0.24   0.68  -6.20  -2.60  -1.69  79   654     0  5.4
>   2  1580 I5   708   898 I6  1469  1198 I5   255  5328 14708  8.2  7.7  0.86  0.27   1.26  -6.19  -2.40  -1.56  78   603     0  4.7
>   3  2968 I5  1429  2145 I6  2037  1760 I5   452  7449 16457  7.3  6.7  0.97  0.29   3.82  -4.35  -0.11   1.01  70   577     0  4.8
>   4  4050 I5  1812  2937 I6  2634  2146 I5   404  9527 18154  7.3  6.4  1.41  0.29  10.64   2.33   6.40   7.50  71   356     1  4.1
>   5  5654 I5  1935  4311 I6  2843  2695 I5   557 11009 19528  5.4  4.3  1.74  0.29  19.79   8.17  14.13  15.66  66   148    13  3.8
>   6  6170 I5  2120  4608 I6  3029  2877 I5   695 11617 20176  5.3  4.1  2.17  0.28  22.76  11.91  17.63  19.06  65    56    26  4.0
>   7  4879 I5  1816  2795 I6  1915  2835 I5   595 11242 19729  7.2  6.3  2.99  0.27  23.05  15.33  19.23  20.19  75    18    44  4.4
>   8  5168 I5  1720  4473 I6  2922  2256 I5   444  9959 18415  5.8  4.9  2.64  0.24  23.28  14.62  19.05  20.37  72    26    46  4.4
>   9  4094 I5  1361  3741 I6  2382  1893 I5   375  8058 16789  5.8  4.5  2.50  0.21  21.25  11.59  16.56  18.13  72    84    27  4.5
>  10  2499 I5  1177  2228 I6  1904  1393 I5   311  5928 15081  6.1  5.6  1.45  0.18  13.37   4.21   8.81  10.50  70   296     0  4.7
>  11  1134 I5   680   731 I6  1287   849 I5   249  4156 13533  8.5  8.1  1.39  0.15   7.58   1.22   4.38   5.30  77   418     0  4.8
>  12  1048 I5   508  1136 I6  1428   687 I5   146  3348 12729  7.8  7.3  0.96  0.14   3.30  -3.43  -0.06   0.83  69   570     0  5.1
>  13  3366 I5  1883  2559 I6  1489  1790 I5   790  7618 16545  6.9  6.1  1.66  0.24  12.56   4.10   8.42   9.61  72  3806   157  4.6
> ...
> 
