[R] Restructuring Hadley CET data
(Ted Harding)
ted.harding at nessie.mcc.ac.uk
Fri Apr 27 23:25:00 CEST 2007
Hi Folks,
I have a nasty data restructuring problem!
I can think of one or two really clumsy ways of doing it
with 'for' loops and the like, but I can't think of a
*neat* way of doing it in R.
The data are the Hadley Centre "Central England Temperature"
series, daily from 01/01/1772 to 31/03/2007, and can be
viewed/downloaded at
http://hadobs.metoffice.com/hadcet/cetdl1772on.dat
and the structure is as follows:
Year DoM  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
--------------------------------------------------------------------
1772   1   32  -15   18   25   87  128  187  177  105  111   78  112
1772   2   20    7   28   38   77  138  154  158  143  150   85   62
1772   3   27   15   36   33   84  170  139  153  113  124   83   60
1772   4   27  -25   61   58   96   90  151  160  173  114   60   47
1772   5   15   -5   68   69  133  146  179  170  173  116   83   50
1772   6   22  -45   51   77  113  105  175  198  160  134  134   42
.................................
.................................
1772  27    0   46   66   74   77  198  156  144   76  104   45    5
1772  28   15   77   86   64  116  167  151  155   66   84   60   10
1772  29  -33   56   83   50  113  131  170  182  135  140   63   12
1772  30  -10 -999   66   77  121  122  179  163  143  143   55   15
1772  31   -8 -999   46 -999  108 -999  168  144 -999  145 -999   22
1773   1   20    0   79   13   93  174  104  151  171  131   68   55
1773   2   10   17   71   25   65  109  128  184  164   91   34   75
1773   3    5  -28   94   70   41   79  135  192  149  101   78   85
1773   4    5  -23   99  107   49  107  144  173  144   98   86   83
1773   5  -28  -30   76   65   83  128  144  182  116   98   66   38
.................................
"DoM" is Day of Month, 1-31 for each month ("short" months
get entries -999 on missing days).
So each year is a block of 31 lines and 14 columns, of
which the last 12 are Temperature (in 10ths of a degree C),
each column a month, running down each column for the
31 days of the month in that year.
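
To make the layout concrete, here is a minimal sketch of reading a few
such lines into R. It uses lines from the listing above pasted inline
rather than the real file; the column names are labels chosen for
illustration, and it assumes the data can be read as plain
whitespace-separated integers.

```r
# Three lines from the listing above, pasted inline (not the full file).
txt <- "
1772 30 -10 -999 66 77 121 122 179 163 143 143 55 15
1772 31 -8 -999 46 -999 108 -999 168 144 -999 145 -999 22
1773 1 20 0 79 13 93 174 104 151 171 131 68 55
"

# month.abb is R's built-in vector "Jan", "Feb", ..., "Dec".
cet <- read.table(text = txt, col.names = c("Year", "DoM", month.abb))

str(cet)        # 14 integer columns: Year, DoM, then one per month
cet$Jan / 10    # January temperatures converted to degrees C
```
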
What I want to do is convert this into a 4-column format:
Year, Month, DoM, Temp
with a separate row for each consecutive day from 01/01/1772
to 31/03/2007, and omitting days which have a "-999" entry
(OK, I still have to check that "-999" is only used for DoMs
which don't exist, and doesn't also indicate that a Temperature
may be missing for some other reason; but I believe the series
to be complete).
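
The check mentioned in the parenthesis could be sketched as follows,
on three lines from the listing above. The helper days_in_month() is
hypothetical (written here for illustration, not taken from any
package) and assumes Gregorian leap-year rules throughout; the column
names are likewise just labels.

```r
# Toy rows from the listing above; 1772 was a leap year.
txt <- "
1772 29 -33 56 83 50 113 131 170 182 135 140 63 12
1772 30 -10 -999 66 77 121 122 179 163 143 143 55 15
1772 31 -8 -999 46 -999 108 -999 168 144 -999 145 -999 22
"
cet <- read.table(text = txt, col.names = c("Year", "DoM", month.abb))

# Hypothetical helper: number of days in a month, Gregorian rules.
days_in_month <- function(year, month) {
  leap <- (year %% 4 == 0 & year %% 100 != 0) | (year %% 400 == 0)
  base <- c(31L, 28L, 31L, 30L, 31L, 30L, 31L, 31L, 30L, 31L, 30L, 31L)
  ifelse(month == 2 & leap, 29L, base[month])
}

temps <- as.matrix(cet[, month.abb])   # rows = file lines, columns = months

# One entry per cell of 'temps', all in the same column-major order.
year    <- cet$Year[row(temps)]
dom     <- cet$DoM[row(temps)]
month   <- as.vector(col(temps))
flagged <- as.vector(temps) == -999
valid   <- dom <= days_in_month(year, month)

# TRUE if -999 marks exactly the non-existent days and nothing else.
all(flagged == !valid)
```
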
What it boils down to is stacking the 12 31-day Temperature
columns on top of each other in each year, filling in the
Year, Month, DoM, and stacking the results for consecutive
years on top of each other (after which one can strike out
the "-999"s). Hence, really clunky for-loops!
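
For concreteness, the stacking just described might be sketched with
exactly that kind of loop, shown here on two lines from the listing
above. This is only an illustration of the clunky route under the
assumption that the data are already in a data frame with one column
per month (column names are labels chosen here); it is not offered as
the neat solution being asked for.

```r
# Toy input: two lines from the listing above.
txt <- "
1772 30 -10 -999 66 77 121 122 179 163 143 143 55 15
1772 31 -8 -999 46 -999 108 -999 168 144 -999 145 -999 22
"
cet <- read.table(text = txt, col.names = c("Year", "DoM", month.abb))

pieces <- list()
for (yr in unique(cet$Year)) {
  block <- cet[cet$Year == yr, ]          # the lines belonging to this year
  for (m in 1:12) {                       # take the month columns in turn
    pieces[[length(pieces) + 1]] <- data.frame(
      Year  = yr,
      Month = m,
      DoM   = block$DoM,
      Temp  = block[[month.abb[m]]]
    )
  }
}

long <- do.call(rbind, pieces)            # stack months, then years
long <- long[long$Temp != -999, ]         # strike out the -999 entries
rownames(long) <- NULL
long
```

Because the months are appended in order within each year, and the
years in order of appearance, the rows of 'long' come out in
consecutive-day order as long as the input lines are sorted by Year
and DoM.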
Any really *neat* ideas for this?
With thanks,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 27-Apr-07 Time: 22:24:51
------------------------------ XFMail ------------------------------