[R] Restructuring Hadley CET data

Gabor Grothendieck ggrothendieck at gmail.com
Sat Apr 28 01:31:25 CEST 2007


Here is a bit more.  It creates a zoo series.

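# Read the raw file; check.names = FALSE keeps the month columns named "1".."12"
URL <- "http://hadobs.metoffice.com/hadcet/cetdl1772on.dat"
DF <- read.table(URL, col.names = c("Year", "DoM", 1:12), check.names = FALSE)

# Stack the 12 month columns into long form and drop the -999 padding
library(reshape)
long <- melt(DF, c("Year", "DoM"))
long <- subset(long, value != -999)

# Build dates from Year-Month-DoM and create the zoo series
library(zoo)
dd <- as.Date(with(long, paste(Year, variable, DoM, sep = "-")))
z <- zoo(long$value, dd)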
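
If reshape is not installed, the same stacking can be sketched in base R. This is only a rough sketch on a small inline sample (rows 29-31 of 1772, Jan-Mar, copied from the table quoted below) rather than the full file. The names txt, months and the Month/Temp/Date columns are my own choices for illustration, not anything from the original code.

```r
# Inline sample in the CET layout: Year, DoM, then one column per month.
txt <- "1772 29 -33 56 83
1772 30 -10 -999 66
1772 31 -8 -999 46"
DF <- read.table(text = txt, col.names = c("Year", "DoM", 1:3),
                 check.names = FALSE)

# Month columns are whatever is left after the two id columns.
months <- setdiff(names(DF), c("Year", "DoM"))

# unlist() stacks the month columns one after another, so the id columns
# are repeated once per month and the month number once per row.
long <- data.frame(
  Year  = rep(DF$Year, times = length(months)),
  Month = rep(as.integer(months), each = nrow(DF)),
  DoM   = rep(DF$DoM, times = length(months)),
  Temp  = unlist(DF[months], use.names = FALSE)
)

# Drop the -999 padding and put the rows in calendar order.
long <- subset(long, Temp != -999)
long <- long[order(long$Year, long$Month, long$DoM), ]

# An explicit format makes as.Date() return NA (not an error) on bad input.
long$Date <- as.Date(sprintf("%d-%02d-%02d", long$Year, long$Month, long$DoM),
                     format = "%Y-%m-%d")
```

The Date and Temp columns can then be handed to zoo() in the same way as dd and long$value above.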

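On Ted's question (quoted below) of whether "-999" is used only for days that do not exist: one possible check is to compare the -999 cells against the calendar. The following is a sketch under my own assumptions, again on a small inline sample (rows 29-31 of 1772, Jan-Apr) rather than the real file. The names grid and mismatch are hypothetical.

```r
# Inline sample in the CET layout: Year, DoM, then one column per month.
txt <- "1772 29 -33 56 83 50
1772 30 -10 -999 66 77
1772 31 -8 -999 46 -999"
DF <- read.table(text = txt, col.names = c("Year", "DoM", 1:4),
                 check.names = FALSE)

# One row per (data row, month) cell of the table.
grid <- expand.grid(row = seq_len(nrow(DF)), Month = 1:4)
grid$Year <- DF$Year[grid$row]
grid$DoM  <- DF$DoM[grid$row]

# Look up each cell's temperature by (row, column) matrix indexing.
temps <- as.matrix(DF[, -(1:2)])
grid$Temp <- temps[cbind(grid$row, grid$Month)]

# With an explicit format, as.Date() gives NA for impossible dates
# such as 30 February.
grid$Date <- as.Date(sprintf("%d-%02d-%02d", grid$Year, grid$Month, grid$DoM),
                     format = "%Y-%m-%d")

# If -999 only pads short months, "impossible date" and "-999"
# should agree for every cell, leaving no mismatches.
mismatch <- grid[is.na(grid$Date) != (grid$Temp == -999), ]
nrow(mismatch)
```

Any rows left in mismatch would be either real dates with a -999 entry (a genuinely missing temperature) or impossible dates carrying a value. Run on the full DF with Month = 1:12, the same comparison would cover the whole series.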

On 4/27/07, hadley wickham <h.wickham at gmail.com> wrote:
> Ooops, add library(reshape) before that, and see
> http://had.co.nz/reshape for more info
>
> Hadley
>
> On 4/27/07, hadley wickham <h.wickham at gmail.com> wrote:
> > Hi Ted,
> >
> > melt(df, id=c("year","DoM"))
> > should get you started.
> >
> > Hadley
> >
> > On 4/27/07, Ted Harding <ted.harding at nessie.mcc.ac.uk> wrote:
> > > Hi Folks,
> > > I have a nasty data restructuring problem!
> > >
> > > I can think of one or two really clumsy ways of doing it
> > > with 'for' loops and the like, but I can't think of a
> > > *neat* way of doing it in R.
> > >
> > > The data are the Hadley Centre "Central England Temperature"
> > > series, daily from 01/01/1772 to 31/03/2007, and can be
> > > viewed/downloaded at
> > >
> > >   http://hadobs.metoffice.com/hadcet/cetdl1772on.dat
> > >
> > > and the structure is as follows:
> > >
> > > Year DoM Jan  Feb Mar  Apr May  Jun Jul Aug  Sep Oct  Nov Dec
> > > -------------------------------------------------------------
> > > 1772  1   32  -15  18   25  87  128 187 177  105 111   78 112
> > > 1772  2   20    7  28   38  77  138 154 158  143 150   85  62
> > > 1772  3   27   15  36   33  84  170 139 153  113 124   83  60
> > > 1772  4   27  -25  61   58  96   90 151 160  173 114   60  47
> > > 1772  5   15   -5  68   69 133  146 179 170  173 116   83  50
> > > 1772  6   22  -45  51   77 113  105 175 198  160 134  134  42
> > > .................................
> > > .................................
> > > 1772 27    0   46  66   74  77  198 156 144   76 104   45   5
> > > 1772 28   15   77  86   64 116  167 151 155   66  84   60  10
> > > 1772 29  -33   56  83   50 113  131 170 182  135 140   63  12
> > > 1772 30  -10 -999  66   77 121  122 179 163  143 143   55  15
> > > 1772 31   -8 -999  46 -999 108 -999 168 144 -999 145 -999  22
> > > 1773  1   20    0  79   13  93  174 104 151  171 131   68  55
> > > 1773  2   10   17  71   25  65  109 128 184  164  91   34  75
> > > 1773  3    5  -28  94   70  41   79 135 192  149 101   78  85
> > > 1773  4    5  -23  99  107  49  107 144 173  144  98   86  83
> > > 1773  5  -28  -30  76   65  83  128 144 182  116  98   66  38
> > > .................................
> > >
> > > "DoM" is Day of Month, 1-31 for each month ("short" months
> > > get entries -999 on missing days).
> > >
> > > So each year is a block of 31 lines and 14 columns, of
> > > which the last 12 are Temperature (in 10ths of a degree C),
> > > each column a month, running down each column for the
> > > 31 days of the month in that year.
> > >
> > > What I want to do is convert this into a 4-column format:
> > >
> > >   Year, Month, DoM, Temp
> > >
> > > with a separate row for each consecutive day from 01/01/1772
> > > to 31/03/2007, and omitting days which have a "-999" entry
> > > (OK I still have to check that "-999" is only used for DoMs
> > > which don't exist, and doesn't also indicate that a Temperature
> > > may be missing for some other reason; but I believe the series
> > > to be complete).
> > >
> > > What it boils down to is stacking the 12 31-day Temperature
> > > columns on top of each other in each year, filling in the
> > > Year, Month, DoM, and stacking the results for consecutive
> > > years on top of each other (after which one can strike out
> > > the "-999"s). Hence, really clunky for-loops!
> > >
> > > Any really *neat* ideas for this?
> > >
> > > With thanks,
> > > Ted.
> > >
> > > --------------------------------------------------------------------
> > > E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
> > > Fax-to-email: +44 (0)870 094 0861
> > > Date: 27-Apr-07                                       Time: 22:24:51
> > > ------------------------------ XFMail ------------------------------
> > >
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
>