[R] dataframe calculations based on certain values of a column

Wed Mar 26 17:38:50 CET 2014

dplyr's group_by and mutate can create those columns for you, and the result does not depend on the row order, so no sorting is needed:

var1 <- c("a","b","c","a","b","c","a","b","c")
var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
var3 <- c(1,2,2,5,2,6,7,4,4)
df <- data.frame(var1,var2,var3)
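
library(dplyr)

# tbl_df() wraps the data frame in dplyr's tbl class, mainly for more
# compact printing; the calculation does not depend on it.
dt <- tbl_df(df)

# Within each var2 group, var3[var1 == "c"] picks that group's base value.
# It has length 1 (one "c" per group), so mutate() recycles it over the
# whole group. (%.% is the chaining operator of dplyr 0.1; later
# releases spell it %>%.)
dt %.%
   group_by(var2) %.%
   mutate(
     div = var3[var1 == "c"],
     result_calc = var3/div
   )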
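
If you would rather not depend on dplyr, the same per-group lookup can probably be done in base R with match(). What follows is only a sketch, assuming every group has exactly one "c" row; the names `shuffled` and `base` and the particular row order are made up for illustration, with all the "c" rows moved to the end.

```r
# Example data from the original post.
var1 <- c("a","b","c","a","b","c","a","b","c")
var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
var3 <- c(1,2,2,5,2,6,7,4,4)
df <- data.frame(var1, var2, var3)

# Hypothetical reordering: the "c" rows end up clumped in the last rows.
shuffled <- df[c(4, 1, 8, 2, 7, 5, 9, 3, 6), ]

# Lookup table with one base value per group (the rows where var1 is "c").
base <- shuffled[shuffled$var1 == "c", c("var2", "var3")]

# match() gives, for every row, the position of its group in the lookup
# table, so each row receives the base value of its own group.
shuffled$div <- base$var3[match(shuffled$var2, base$var2)]
shuffled$result_calc <- shuffled$var3 / shuffled$div
```

Because the divisor is looked up by group label rather than by position, the rows can be in any order.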


On 2014-03-26 12:09, Johannes Radinger wrote:
> Hi,
> 
> I have data in a data frame with the following structure:
> var1 <- c("a","b","c","a","b","c","a","b","c")
> var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
> var3 <- c(1,2,2,5,2,6,7,4,4)
> df <- data.frame(var1,var2,var3)
> 
> Now I'd like to calculate relative values of var3. These values
> should be relative to the base value (where var1 == "c"), which is
> given for each group (var2).
> 
> To illustrate what my result column should look like, I divide
> the column var3 by the vector c(2,2,2,6,6,6,4,4,4) (= the value of c
> for each group of var2).
> 
> Of course this can also be done like this:
> df$div <- rep(df$var3[df$var1=="c"],each=length(unique(df$var1)))
> df$result_calc <- df$var3/df$div
> 
> 
> However, what if the data frame is not as simple and not as well ordered
> as in the example here? For example, there is always a value c for each
> group, but all the "c"s are clumped in the last rows of the data frame or
> scattered in a random manner. Is there a simple way to still calculate
> such relative values? Probably with an approach using apply, but maybe
> someone can give me a hint. Or do I need to sort my data frame in order
> to do such calculations?
> 
> best,
> 
> /Johannes