[R] summing and combining rows
arun
smartpink111 at yahoo.com
Wed Aug 8 20:20:55 CEST 2012
Hi,
From ?aggregate:
formula: a formula, such as ‘y ~ x’ or ‘cbind(y1, y2) ~ x1 + x2’,
where the ‘y’ variables are numeric data to be split into
groups according to the grouping ‘x’ variables (usually
factors).
So I converted the grouping variables in your data to factors; the results are the same.
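# Coerce each column of obj to the type named in the matching element of types.
convert.type1 <- function(obj, types){
  for (i in seq_along(obj)){
    FUN <- switch(types[i],
                  character = as.character,
                  numeric = as.numeric,
                  factor = as.factor)
    obj[,i] <- FUN(obj[,i])
  }
  obj
}
# Every column becomes a factor except Stems (column 9), which stays numeric.
dat2 <- convert.type1(dat1, c("factor","factor","factor","factor","factor","factor","factor","factor","numeric","factor","factor"))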
str(dat2)
'data.frame': 8 obs. of 11 variables:
$ Data : Factor w/ 1 level "VTM": 1 1 1 1 1 1 1 1
$ Plot : Factor w/ 4 levels "39C16","39F11",..: 1 1 2 2 3 3 4 4
$ Lat : Factor w/ 4 levels "39.54522","39.56214",..: 4 4 3 3 2 2 1 1
$ LatCat : Factor w/ 1 level "Lat6": 1 1 1 1 1 1 1 1
$ Elevation: Factor w/ 3 levels "500","900","1500": 3 3 1 1 3 3 2 2
$ ElevCat : Factor w/ 1 level "Elev1": 1 1 1 1 1 1 1 1
$ Type : Factor w/ 1 level "Conifer": 1 1 1 1 1 1 1 1
$ SizeClass: Factor w/ 2 levels "Class3","Class4": 1 2 1 2 1 2 1 2
$ Stems : num 0 1 0 0 3 1 1 2
$ Area : Factor w/ 3 levels "694.0784","751.5347",..: 2 2 2 2 1 1 3 3
$ Density : Factor w/ 3 levels "0","13.08926",..: 1 3 1 1 1 1 2 1
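The same conversion can also be sketched without a helper function, by assigning the result of lapply() back into the chosen columns. The snippet below uses a reduced copy of the data from this thread (four of the eleven columns) so that it runs on its own; the name group_cols is only illustrative.

```r
# Reduced copy of the thread's data: four of the eleven columns, first four rows.
dat1 <- read.table(text = "
Plot Elevation SizeClass Stems
39C16 1500 Class3 0
39C16 1500 Class4 1
39F11 500 Class3 0
39F11 500 Class4 0
", header = TRUE, stringsAsFactors = FALSE)

# Every column except Stems is treated as a grouping variable.
group_cols <- setdiff(names(dat1), "Stems")

dat2 <- dat1
dat2[group_cols] <- lapply(dat2[group_cols], as.factor)  # convert in one step
dat2$Stems <- as.numeric(dat2$Stems)                     # keep the response numeric
str(dat2)
```

Note that the formula method of aggregate() groups by character or numeric columns as well, so the conversion is optional; that is why the results do not change.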
# Dropping Density groups the rows by the combinations of the remaining factors
aggregate(Stems~Plot+Data+Lat+LatCat+Elevation+Type+Area,data=dat2,sum)
   Plot Data      Lat LatCat Elevation    Type     Area Stems
1 39F13  VTM 39.56214   Lat6      1500 Conifer 694.0784     4
2 39F11  VTM 39.57721   Lat6       500 Conifer 751.5347     0
3 39C16  VTM 39.76282   Lat6      1500 Conifer 751.5347     1
4 39F14  VTM 39.54522   Lat6       900 Conifer  763.985     3
# It won't collapse further than this, because Plot and Lat each have four levels, unless you drop them too
aggregate(Stems~Data+LatCat+Elevation+Type,data=dat2,sum)
  Data LatCat Elevation    Type Stems
1  VTM   Lat6       500 Conifer     0
2  VTM   Lat6       900 Conifer     3
3  VTM   Lat6      1500 Conifer     5
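If the goal is still to merge only Class3 and Class4 and leave the other size classes alone, one possible approach is to relabel those two classes first and then keep SizeClass in the formula. This is a minimal sketch, not tested on the full data set: the Class3 and Class4 rows are the toy data from the original question, while the Class5 rows and the name dat34 are hypothetical, added only for illustration.

```r
# Toy data. Class3/Class4 rows are from the original question;
# the Class5 rows are HYPOTHETICAL, added to show that other classes are untouched.
dat1 <- read.table(text = "
Plot SizeClass Stems
12 Class3 1
12 Class4 3
12 Class5 2
17 Class3 5
17 Class4 2
17 Class5 0
", header = TRUE, stringsAsFactors = FALSE)

# Relabel Class3 and Class4 as Class34; every other class keeps its name.
dat1$SizeClass[dat1$SizeClass %in% c("Class3", "Class4")] <- "Class34"

# Sum Stems within each Plot/SizeClass combination.
dat34 <- aggregate(Stems ~ Plot + SizeClass, data = dat1, sum)
dat34 <- dat34[order(dat34$Plot, dat34$SizeClass), ]
dat34
```

Any column that differs between the Class3 and Class4 rows of a plot (Density in the full data) would still have to be left out of the formula, for the reason shown above.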
A.K.
----- Original Message -----
From: Christopher R. Dolanc <crdolanc at ucdavis.edu>
To: arun <smartpink111 at yahoo.com>
Cc:
Sent: Wednesday, August 8, 2012 2:00 PM
Subject: Re: [R] summing and combining rows
OK. I can make this work. Thank you for helping me figure this out.
On 8/8/2012 10:49 AM, arun wrote:
> Hello,
>
> I tried with ddply from the plyr package:
>
> library(plyr)
> ddply(dat1,.(Data,Plot,Lat,LatCat,Elevation,Type,Area,Density),summarize,sum(Stems))
>   Data  Plot      Lat LatCat Elevation    Type     Area  Density ..1
> 1  VTM 39C16 39.76282   Lat6      1500 Conifer 751.5347  0.00000   0
> 2  VTM 39C16 39.76282   Lat6      1500 Conifer 751.5347 13.30611   1
> 3  VTM 39F11 39.57721   Lat6       500 Conifer 751.5347  0.00000   0
> 4  VTM 39F13 39.56214   Lat6      1500 Conifer 694.0784  0.00000   4
> 5  VTM 39F14 39.54522   Lat6       900 Conifer 763.9850  0.00000   2
> 6  VTM 39F14 39.54522   Lat6       900 Conifer 763.9850 13.08926   1
>
>
> The results look the same as with aggregate.
> If you take out Density,
>
> ddply(dat1,.(Data,Plot,Lat,LatCat,Elevation,Type,Area),summarize,sum(Stems))
>   Data  Plot      Lat LatCat Elevation    Type     Area ..1
> 1  VTM 39C16 39.76282   Lat6      1500 Conifer 751.5347   1
> 2  VTM 39F11 39.57721   Lat6       500 Conifer 751.5347   0
> 3  VTM 39F13 39.56214   Lat6      1500 Conifer 694.0784   4
> 4  VTM 39F14 39.54522   Lat6       900 Conifer 763.9850   3
>
> Now the stems are summed within each plot.
>
>
>
> A.K.
>
> ----- Original Message -----
> From: Christopher R. Dolanc <crdolanc at ucdavis.edu>
> To: arun <smartpink111 at yahoo.com>
> Cc:
> Sent: Wednesday, August 8, 2012 1:19 PM
> Subject: Re: [R] summing and combining rows
>
> ok, so it looks like aggregate lists them separately unless everything
> in the 2 rows matches. Below, we have 2 plots where the density is
> different in Class3 than Class4, and these are not summed. Is that your
> understanding?
>
> Thanks for your help.
>
> Chris
>
> On 8/7/2012 4:18 PM, arun wrote:
>> Hi,
>>
>> I tried aggregate two ways; the results are the same.
>> dat1<-read.table(text="
>> Data Plot Lat LatCat Elevation ElevCat Type SizeClass Stems Area Density
>> VTM 39C16 39.76282 Lat6 1500 Elev1 Conifer Class3 0 751.5347 0.00000
>> VTM 39C16 39.76282 Lat6 1500 Elev1 Conifer Class4 1 751.5347 13.30611
>> VTM 39F11 39.57721 Lat6 500 Elev1 Conifer Class3 0 751.5347 0.00000
>> VTM 39F11 39.57721 Lat6 500 Elev1 Conifer Class4 0 751.5347 0.00000
>> VTM 39F13 39.56214 Lat6 1500 Elev1 Conifer Class3 3 694.0784 0.00000
>> VTM 39F13 39.56214 Lat6 1500 Elev1 Conifer Class4 1 694.0784 0.00000
>> VTM 39F14 39.54522 Lat6 900 Elev1 Conifer Class3 1 763.9850 13.08926
>> VTM 39F14 39.54522 Lat6 900 Elev1 Conifer Class4 2 763.9850 0.00000
>> ",sep="",header=TRUE, stringsAsFactors=FALSE)
>>
>>
>> with(dat1,aggregate(Stems,list(Plot,Data,Lat,LatCat,Elevation,Type,Area,Density),sum))
>>   Group.1 Group.2  Group.3 Group.4 Group.5 Group.6  Group.7  Group.8 x
>> 1   39F13     VTM 39.56214    Lat6    1500 Conifer 694.0784  0.00000 4
>> 2   39F11     VTM 39.57721    Lat6     500 Conifer 751.5347  0.00000 0
>> 3   39C16     VTM 39.76282    Lat6    1500 Conifer 751.5347  0.00000 0
>> 4   39F14     VTM 39.54522    Lat6     900 Conifer 763.9850  0.00000 2
>> 5   39F14     VTM 39.54522    Lat6     900 Conifer 763.9850 13.08926 1
>> 6   39C16     VTM 39.76282    Lat6    1500 Conifer 751.5347 13.30611 1
>> aggregate(Stems~Plot+Data+Lat+LatCat+Elevation+Type+Area+Density,data=dat1,sum)
>>    Plot Data      Lat LatCat Elevation    Type     Area  Density Stems
>> 1 39F13  VTM 39.56214   Lat6      1500 Conifer 694.0784  0.00000     4
>> 2 39F11  VTM 39.57721   Lat6       500 Conifer 751.5347  0.00000     0
>> 3 39C16  VTM 39.76282   Lat6      1500 Conifer 751.5347  0.00000     0
>> 4 39F14  VTM 39.54522   Lat6       900 Conifer 763.9850  0.00000     2
>> 5 39F14  VTM 39.54522   Lat6       900 Conifer 763.9850 13.08926     1
>> 6 39C16  VTM 39.76282   Lat6      1500 Conifer 751.5347 13.30611     1
>>
>>
>>
>> The two rows for Lat 39.57721 and the two for Lat 39.56214 match on every grouping variable (only SizeClass and Stems differ), so they were summed. The other two plots have different Density values for Class3 and Class4, so those rows stay separate.
>>
>> A.K.
>>
>> ----- Original Message -----
>> From: Christopher R. Dolanc <crdolanc at ucdavis.edu>
>> To: arun <smartpink111 at yahoo.com>
>> Cc:
>> Sent: Tuesday, August 7, 2012 6:38 PM
>> Subject: Re: [R] summing and combining rows
>>
>> Hmmm. It looks like it's only giving me the values for Class3, instead
>> of summing, which is why I thought the "+" method might not be the
>> appropriate coding.
>>
>> Here's the code I used:
>>
>> CH_Con_Elev1SC34a <- aggregate(Stems~Plot+Data+Lat+LatCat+Elevation+Type+Area+Density,
>>                                data=CH_Con_Elev1SC34, sum)
>> CH_Con_Elev1SC34b <- data.frame(CH_Con_Elev1SC34a,
>>                                 SizeClass=rep("Class34",))
>>
>> If it helps, attached is a txt file with the data structure.
>>
>> On 8/7/2012 3:00 PM, arun wrote:
>>> Hi,
>>> I'm not sure why you say that "+" doesn't work.
>>> dat1<-read.table(text="
>>> Plot Elevation Area SizeClass Stems
>>> 12 1200 132.4 Class3 0
>>> 12 1200 132.4 Class4 1
>>> 17 2320 209.1 Class3 3
>>> 17 2320 209.1 Class4 5
>>> ",sep="",header=TRUE,stringsAsFactors=FALSE)
>>>
>>> dat2<-aggregate(Stems~Plot+Elevation+Area, data=dat1,sum)
>>> dat3<-data.frame(dat2,SizeClass=rep("Class34",2))
>>> dat3<-dat3[,c(1:3,5,4)]
>>> dat3
>>> # Plot Elevation Area SizeClass Stems
>>> #1 12 1200 132.4 Class34 1
>>> #2 17 2320 209.1 Class34 8
>>>
>>> A.K.
>>>
>>> ----- Original Message -----
>>> From: Christopher R. Dolanc <crdolanc at ucdavis.edu>
>>> To: arun <smartpink111 at yahoo.com>
>>> Cc:
>>> Sent: Tuesday, August 7, 2012 5:47 PM
>>> Subject: Re: [R] summing and combining rows
>>>
>>> Thanks for your response. The aggregate method mostly works for me, but
>>> I have numerous other columns that I'd like to keep in the result. So,
>>> if I have something like this:
>>>
>>>
>>> Plot Elevation Area SizeClass Stems
>>> 12 1200 132.4 Class3 0
>>> 12 1200 132.4 Class4 1
>>> 17 2320 209.1 Class3 3
>>> 17 2320 209.1 Class4 5
>>>
>>> How can I make it look like this?
>>>
>>> Plot Elevation Area SizeClass Stems
>>> 12 1200 132.4 Class34 1
>>> 17 2320 209.1 Class34 8
>>>
>>> I see something in ?aggregate about adding columns with a +, but this
>>> doesn't quite work for me.
>>>
>>>
>>> On 8/7/2012 2:32 PM, arun wrote:
>>>> Hi,
>>>>
>>>> Try this:
>>>> dat1<-read.table(text="
>>>> Plot SizeClass Stems
>>>> 12 Class3 1
>>>> 12 Class4 3
>>>> 17 Class3 5
>>>> 17 Class4 2
>>>> ",sep="",header=TRUE, stringsAsFactors=FALSE)
>>>>
>>>>
>>>>
>>>> library(plyr)  # ddply() is in the plyr package
>>>> ddply(dat1,.(Plot), summarize, sum(Stems))
>>>>
>>>> #or
>>>>
>>>>
>>>> dat2<-aggregate(Stems~Plot,data=dat1,sum)
>>>> dat3<-data.frame(dat2,SizeClass=rep("Class34",2))
>>>> dat3
>>>> # Plot Stems SizeClass
>>>> #1 12 4 Class34
>>>> #2 17 7 Class34
>>>>
>>>>
>>>> A.K.
>>>>
>>>> ----- Original Message -----
>>>> From: Christopher R. Dolanc <crdolanc at ucdavis.edu>
>>>> To: r-help at r-project.org
>>>> Cc:
>>>> Sent: Tuesday, August 7, 2012 1:47 PM
>>>> Subject: [R] summing and combining rows
>>>>
>>>> Hello,
>>>>
>>>> I have a data set that needs to be combined so that rows are summed by a group based on a certain variable. I'm pretty sure rowsum() or rowSums() can do this, but it's difficult for me to figure out how it will work for my data based on the examples I've read.
>>>>
>>>> My data are structured like this:
>>>>
>>>> Plot SizeClass Stems
>>>> 12 Class3 1
>>>> 12 Class4 3
>>>> 17 Class3 5
>>>> 17 Class4 2
>>>>
>>>> I simply want to sum the size classes by plot and create a new data frame with a size class called "Class34" or with the SizeClass variable removed. I actually do have other size classes that I want to leave alone, but combine 3 and 4, so if I could figure out how to do this by creating a new class, that would be preferable.
>>>>
>>>> I've also attached a more detailed sample of data.
>>>>
>>>> Thanks,
>>>> Chris Dolanc
>>>>
>>>> -- Christopher R. Dolanc
>>>> Post-doctoral Researcher
>>>> University of Montana and UC-Davis
--
Christopher R. Dolanc
Post-doctoral Researcher
University of Montana and UC-Davis