[R] About data manipulation

P Tennant philipt900 at iinet.net.au
Sun Nov 27 00:42:17 CET 2016


Hi,

The following may help:

aggregate(DF$total, list(DF$note, DF$id, DF$month), mean)

This should give you the mean of total for each combination of time slice
(note), id and month. You can then subset the means for GA or GB from the
aggregated data frame.
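
As a rough sketch of the same idea, here is the formula interface to
aggregate(), which keeps the original column names (with an unnamed list
the result columns come out as Group.1, Group.2, Group.3 and x). The toy
data frame below only mimics the layout in the question; its values are
partly made up for illustration, and the names means and means_GA are
hypothetical.

```r
# Toy data modelled on the layout in the question (illustrative values only).
DF <- data.frame(
  year  = c(2000, 2001, 2002, 2002, 2007, 2008),
  month = c(1, 1, 2, 2, 1, 1),
  total = c(98, 100, 99, 80, 78, 82),
  id    = c("GA", "GA", "GA", "GB", "GA", "GA"),
  note  = c(1, 1, 1, 1, 2, 2)
)

# One mean of total per combination of time slice (note), id and month.
# The year column is simply left out of the formula, so it is averaged over.
means <- aggregate(total ~ note + id + month, data = DF, FUN = mean)

# Keep only the rows for group GA; note the quotes around the label.
means_GA <- subset(means, id == "GA")
print(means_GA)
```

Note that aggregate() needs a FUN argument and that a group label has to
be quoted when subsetting (id == "GA"); both were missing in the attempt
quoted below.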

Philip

On 27/11/2016 3:11 AM, lily li wrote:
> Hi R users,
>
> I'm trying to manipulate a dataframe and have some difficulties.
>
> The original dataset is like this:
>
> DF
> year   month   total   id     note
> 2000     1         98    GA   1
> 2001     1        100   GA   1
> 2002     2         99    GA   1
> 2002     2         80    GB   1
> ...
> 2012     1         78    GA   2
> ...
>
> The structure is like this: when year is between 2000-2005, note is 1; when
> year is between 2006-2010, note is 2; GA, GB, etc represent different
> groups, but they all have years 2000-2005, 2006-2010, 2011-2015.
> I want to calculate one average value for each month in each time slice.
> For example, between 2000-2005, when note is 1, for GA, there is one value
> in month 1, one value in month 2, etc; for GB, there is one value in month
> 1, one value in month 2, between this time period. So later, there is no
> 'year' column, but other columns.
> I tried the script: DF_GA = aggregate(total~year+month,data=subset(DF,
> id==GA&note==1)), but it did not give me the ideal dataframe. How to do
> then?
> Thanks for your help.


