[R] Aggregation across two variables in data.table
Michael Haenlein
haenlein at escpeurope.eu
Thu Dec 14 08:48:29 CET 2017
Dear all,
I have a data.table that contains several demographic variables for a
set of respondents, plus a dependent variable (Theta). For example:
   Age  Education                                 Marital        Familysize  Income  Housing                 Theta
1: 50   Associate degree                          Divorced       4           70K+    Owned with mortgage     9.147777
2: 65   Bachelor degree                           Married        1           10-15K  Owned without mortgage  7.345036
3: 33   Bachelor degree                           Married        2           30-40K  Owned with mortgage     7.974937
4: 69   Bachelor degree                           Never married  1           70K+    Owned with mortgage     7.733053
5: 54   Some college, less than college graduate  Never married  3           30-40K  Rented                  7.648642
6: 35   Associate degree                          Separated      2           10-15K  Rented                  7.496411
My objective is to calculate the mean of Theta within each combination of
levels, for every pair of demographic variables.
For a single demographic variable this is straightforward:
Demo_names <- c("Age", "Education", "Marital", "Familysize", "Income",
                "Housing")
# One result table per demographic variable
means1 <- vector("list", length(Demo_names))
for (i in seq_along(Demo_names)) {
  Demo_tmp <- Demo_names[i]
  # by= accepts a column name stored in a character variable
  means1[[i]] <- data_tmp[, list(mean(Theta)), by = Demo_tmp]
}
Is there an easy way to extend this logic to more than one variable? I know
how to do this manually, e.g.,
data_tmp[,list(mean(Theta)),by=list(Marital, Education)]
But I don't know how to integrate this into a loop.
Thanks,
Michael
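
One possible way to loop over pairs, offered as a sketch rather than a definitive answer: since by= already accepts a character variable in the single-variable loop, it should also accept a character vector of two column names, and combn() can enumerate the pairs. The toy data_tmp below is hypothetical (made-up values, only three of the demographics), and the names pairs, means2 and Mean_Theta are illustrative choices, not part of the original code.

```r
library(data.table)

# Hypothetical toy data standing in for data_tmp (values invented for illustration)
data_tmp <- data.table(
  Marital   = c("Married", "Married", "Divorced", "Divorced"),
  Education = c("Bachelor degree", "Associate degree",
                "Bachelor degree", "Bachelor degree"),
  Housing   = c("Rented", "Rented", "Rented", "Owned with mortgage"),
  Theta     = c(1, 2, 3, 5)
)

Demo_names <- c("Marital", "Education", "Housing")

# All unordered pairs of column names, one character vector per list element
pairs <- combn(Demo_names, 2, simplify = FALSE)

# Pass each pair directly to by= as a character vector of column names
means2 <- lapply(pairs, function(p) {
  data_tmp[, list(Mean_Theta = mean(Theta)), by = p]
})

# Label each result with the pair it belongs to, e.g. "Marital_Education"
names(means2) <- vapply(pairs, paste, character(1), collapse = "_")
```

The same pattern should extend to triples by replacing the 2 in combn() with 3. A list of tables is kept here, mirroring means1, because each pair produces different grouping columns.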