[R] mean of each month in data
David L Carlson
dcarlson at tamu.edu
Mon Dec 17 19:45:30 CET 2012
A slight modification to Rui's answer would give you a list containing
a separate data frame (one row per month, one column per year) for each
station:
> aggwide <- reshape(agg, direction = "wide", idvar = c("st", "month"),
+ timevar = "year", v.names = "population")
> aggsplit <- split(aggwide[,colnames(aggwide)!="st"], aggwide$st)
> aggsplit
$Sa
   month population.1955 population.1956 population.1957 population.1958
1      1        2.400000              NA              NA              NA
2      2        2.400000              NA              NA              NA
3      3        2.266667              NA              NA              NA
4      4              NA             2.4              NA              NA
5      5              NA             2.4              NA              NA
6      6              NA             2.4              NA              NA
7      7              NA              NA        2.400000              NA
8      8              NA              NA        2.400000              NA
9      9              NA              NA        2.266667              NA
10    10              NA              NA        2.400000             2.4
12    11              NA              NA              NA             2.4
13    12              NA              NA              NA             2.4
   population.1966 population.1967 population.1968 population.1969
1               NA              NA              NA              NA
2               NA              NA              NA              NA
3               NA              NA              NA              NA
4               NA              NA              NA              NA
5               NA              NA              NA              NA
6               NA              NA              NA              NA
7               NA              NA              NA              NA
8               NA              NA              NA              NA
9               NA              NA              NA              NA
10              NA              NA              NA              NA
12              NA              NA              NA              NA
13              NA              NA              NA              NA

$Ta
   month population.1955 population.1956 population.1957 population.1958
14     1              NA              NA              NA              NA
15     2              NA              NA              NA              NA
16     3              NA              NA              NA              NA
   population.1966 population.1967 population.1968 population.1969
14        2.400000             2.4        2.400000             2.4
15        2.400000             2.4        2.400000             2.4
16        2.266667             2.4        2.266667             2.4
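
The elements of aggsplit above are data frames that still carry a month
column and all-NA columns for years belonging to other stations. If plain
numeric matrices (months as rows, only that station's years as columns)
are wanted, one possible alternative is to split the raw data first and
call tapply() on each piece. This is only a sketch: the small data frame
below is made-up toy data in the same layout as dat, and the name
monthly_means is chosen here for illustration.

```r
# Toy data in the same layout as the thread's `dat` (values are made up).
dat <- data.frame(
  st         = rep(c("Ta", "Sa"), each = 8),
  year       = rep(rep(c(1966L, 1967L), each = 4), times = 2),
  month      = rep(rep(1:2, each = 2), times = 4),
  day        = rep(1:2, times = 8),
  population = c(2.4, 2.4, 2.3, 2.2,  2.4, 2.4, 2.4, 2.4,
                 2.0, 2.2, 2.4, 2.4,  2.1, 2.3, 2.4, 2.6)
)

# Split by station, then cross-tabulate the mean population by
# month (rows) and year (columns) within each station.
monthly_means <- lapply(split(dat, dat$st), function(d) {
  tapply(d$population, list(month = d$month, year = d$year), mean)
})

# One matrix per station, accessed by station name.
monthly_means$Ta
```

Because tapply() is applied to each station separately, each matrix only
has columns for the years that station actually has; month/year
combinations missing within a station would still show up as NA.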
----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Rui Barradas
> Sent: Monday, December 17, 2012 12:06 PM
> To: eliza botto
> Cc: r-help at r-project.org
> Subject: Re: [R] mean of each month in data
>
> Hello,
>
> Something like this?
>
>
> dat <-
> structure(list(st = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
> ), .Label = c("Sa", "Ta"), class = "factor"), year = c(1966L,
> 1966L, 1966L, 1966L, 1966L, 1966L, 1966L, 1966L, 1966L, 1967L,
> 1967L, 1967L, 1967L, 1967L, 1967L, 1967L, 1967L, 1967L, 1968L,
> 1968L, 1968L, 1968L, 1968L, 1968L, 1968L, 1968L, 1968L, 1969L,
> 1969L, 1969L, 1969L, 1969L, 1969L, 1969L, 1969L, 1969L, 1955L,
> 1955L, 1955L, 1955L, 1955L, 1955L, 1955L, 1955L, 1955L, 1956L,
> 1956L, 1956L, 1956L, 1956L, 1956L, 1956L, 1956L, 1956L, 1957L,
> 1957L, 1957L, 1957L, 1957L, 1957L, 1957L, 1957L, 1957L, 1957L,
> 1958L, 1958L, 1958L, 1958L, 1958L, 1958L, 1958L, 1958L), month = c(1L,
> 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
> 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
> 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L,
> 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L,
> 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L), day = c(1L, 2L, 3L,
> 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
> 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
> 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
> 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
> 2L, 3L, 1L, 2L, 3L), population = c(2.4, 2.4, 2.4, 2.4, 2.4,
> 2.4, 2.3, 2.2, 2.3, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4,
> 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.3, 2.2, 2.3, 2.4, 2.4, 2.4, 2.4,
> 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.3, 2.2,
> 2.3, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4,
> 2.4, 2.4, 2.4, 2.3, 2.2, 2.3, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4,
> 2.4, 2.4)), .Names = c("st", "year", "month", "day", "population"
> ), class = "data.frame", row.names = c(NA, -72L))
>
>
> agg <- aggregate(population ~ st + year + month, data = dat, FUN = mean)
> reshape(agg, direction = "wide",
> idvar = c("st", "month"),
> timevar = "year",
> v.names = "population")
>
>
> Hope this helps,
>
> Rui Barradas
> On 17-12-2012 17:11, eliza botto wrote:
> > Dear R users,
> > [In case the format of the email is changed or you don't find it easy
> > to understand, I have attached a text file of my question.]
> > I have the data in the following format and I want to convert it to
> > the format given at the end.
> > Ta and Sa are the names of certain cities. There are 69 cities in my
> > data.
> > Column 1 represents the station name (I am writing the data of only
> > two cities for simplicity, but actually, as I wrote, I have 69 cities
> > and the actual table is very long).
> > Column 2 represents the year for which the data is given (the actual
> > data for each station is of different length, but at least 24 years).
> > Columns 3 and 4 represent the month and the day of the data. Obviously
> > each year has 12 months and each month has a different number of days,
> > but to make the table easy to understand only 3 months and 3 days of
> > each month are considered. February in leap years should also be
> > considered.
> > Column 5 represents the population of that city.
> >
> > st. year month day population in million
> > Ta 1966 1 1 2.4
> > Ta 1966 1 2 2.4
> > Ta 1966 1 3 2.4
> > Ta 1966 2 1 2.4
> > Ta 1966 2 2 2.4
> > Ta 1966 2 3 2.4
> > Ta 1966 3 1 2.3
> > Ta 1966 3 2 2.2
> > Ta 1966 3 3 2.3
> > Ta 1967 1 1 2.4
> > Ta 1967 1 2 2.4
> > Ta 1967 1 3 2.4
> > Ta 1967 2 1 2.4
> > Ta 1967 2 2 2.4
> > Ta 1967 2 3 2.4
> > Ta 1967 3 1 2.4
> > Ta 1967 3 2 2.4
> > Ta 1967 3 3 2.4
> > Ta 1968 1 1 2.4
> > Ta 1968 1 2 2.4
> > Ta 1968 1 3 2.4
> > Ta 1968 2 1 2.4
> > Ta 1968 2 2 2.4
> > Ta 1968 2 3 2.4
> > Ta 1968 3 1 2.3
> > Ta 1968 3 2 2.2
> > Ta 1968 3 3 2.3
> > Ta 1969 1 1 2.4
> > Ta 1969 1 2 2.4
> > Ta 1969 1 3 2.4
> > Ta 1969 2 1 2.4
> > Ta 1969 2 2 2.4
> > Ta 1969 2 3 2.4
> > Ta 1969 3 1 2.4
> > Ta 1969 3 2 2.4
> > Ta 1969 3 3 2.4
> > Sa 1955 1 1 2.4
> > Sa 1955 1 2 2.4
> > Sa 1955 1 3 2.4
> > Sa 1955 2 1 2.4
> > Sa 1955 2 2 2.4
> > Sa 1955 2 3 2.4
> > Sa 1955 3 1 2.3
> > Sa 1955 3 2 2.2
> > Sa 1955 3 3 2.3
> > Sa 1956 4 1 2.4
> > Sa 1956 4 2 2.4
> > Sa 1956 4 3 2.4
> > Sa 1956 5 1 2.4
> > Sa 1956 5 2 2.4
> > Sa 1956 5 3 2.4
> > Sa 1956 6 1 2.4
> > Sa 1956 6 2 2.4
> > Sa 1956 6 3 2.4
> > Sa 1957 7 1 2.4
> > Sa 1957 7 2 2.4
> > Sa 1957 7 3 2.4
> > Sa 1957 8 1 2.4
> > Sa 1957 8 2 2.4
> > Sa 1957 8 3 2.4
> > Sa 1957 9 1 2.3
> > Sa 1957 9 2 2.2
> > Sa 1957 9 3 2.3
> > Sa 1957 10 1 2.4
> > Sa 1958 10 2 2.4
> > Sa 1958 10 3 2.4
> > Sa 1958 11 1 2.4
> > Sa 1958 11 2 2.4
> > Sa 1958 11 3 2.4
> > Sa 1958 12 1 2.4
> > Sa 1958 12 2 2.4
> > Sa 1958 12 3 2.4
> > ...
> > ...
> > up to the 69th station
> >
> > I want to convert the data to the following format:
> >> Ta ## matrix for station Ta
> > 1966                 1967                 1968                 1969
> > AVERAGE OF MONTH 1   AVERAGE OF MONTH 1   AVERAGE OF MONTH 1   AVERAGE OF MONTH 1
> > AVERAGE OF MONTH 2   AVERAGE OF MONTH 2   AVERAGE OF MONTH 2   AVERAGE OF MONTH 2
> > AVERAGE OF MONTH 3   AVERAGE OF MONTH 3   AVERAGE OF MONTH 3   AVERAGE OF MONTH 3
> > ........
> > ........
> > AVERAGE OF MONTH 12  AVERAGE OF MONTH 12  AVERAGE OF MONTH 12  AVERAGE OF MONTH 12
> > Similar operations are to be done for "Sa" and the remaining 67
> > stations, which means I want to have 69 matrices, in which each
> > column (the number of columns should equal the number of years of
> > data) contains the 12 mean monthly values of population for that year.
> >
> > thanks in advance
> >
> > eliza
> >
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>