[R] FW: averaging two tables (rows with columns)
Rui Barradas
ruipbarradas at sapo.pt
Thu May 10 20:53:06 CEST 2012
Hello,
After all the trouble, workable datasets.
I'm not sure I understand what you want.
Looking at the two result data.frames, I don't believe they match the
problem description.
The average columns are wrong. Look at line 1 in table3a. It has speciesXX
with a value of 0.14,
but speciesXX does NOT occur in table2.
First Error ---> You are using the value of SpeciesXY in table2.
And even if speciesXX and SpeciesXY were the same, the value in table1,
Plot1 is zero.
Second Error ---> You are counting values that are not in the plots.
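Both points can be checked directly. A minimal sketch, assuming the two
tables are rebuilt by hand from the dput() output in the quoted message
(plain character columns instead of factors is my simplification, and the
names sp1, sp2, only_in_table1, only_in_table2 and plot1_xx are only for
illustration):

```r
# Data rebuilt from the dput() output quoted in this message
table1 <- data.frame(
  X = c("Plot1", "Plot2", "plot3", "plot4"),
  speciesX  = c(1L, 0L, 1L, 0L),
  speciesY  = c(0L, 1L, 0L, 0L),
  speciesZ  = c(1L, 1L, 0L, 1L),
  speciesXX = c(0L, 0L, 1L, 0L)
)
table2 <- data.frame(
  X = c("SpeciesX", "SpeciesY", "SpeciesXY"),
  EnviA = c(0.21, 0.10, 0.14),
  EnviB = c(0.40, 0.15, 0.16),
  EnviC = c(0.17, 0.18, 0.19)
)

# Compare species names case-insensitively
sp1 <- tolower(names(table1)[-1])
sp2 <- tolower(as.character(table2$X))

only_in_table1 <- setdiff(sp1, sp2)  # "speciesz" "speciesxx"
only_in_table2 <- setdiff(sp2, sp1)  # "speciesxy"

# Occurrence of speciesXX in Plot1
plot1_xx <- table1$speciesXX[table1$X == "Plot1"]  # 0
```

So speciesXX has no tolerance values at all, and it is not recorded in
Plot1 anyway.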
If I'm right, and only in that case, try the following
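fun <- function(df1, df2){
    # tolerances: environmental variables in rows, species in columns
    y <- t(df2[, -1])
    colnames(y) <- tolower(df2[, 1])
    names(df1) <- tolower(names(df1))
    # keep only the species present in both tables
    cnames <- intersect(names(df1), colnames(y))
    # drop = FALSE keeps matrices even if only one species matches
    x <- as.matrix(df1[, cnames, drop = FALSE])
    y <- y[, cnames, drop = FALSE]
    # number of matched species recorded in each plot
    count <- rowSums(x, na.rm = TRUE)
    # sum of tolerances over the species present, divided by their number
    res <- (x %*% t(y))/count
    rownames(res) <- df1[, 1]
    res
}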
fun(table1, table2)
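To see what the matrix product in fun() does, here is a small sketch with
made-up numbers (the plots p1 to p3, the species spa and spb, and the
tolerance values are all hypothetical, not taken from the data in this
thread):

```r
# 0/1 occurrence matrix: plots in rows, species in columns
x <- matrix(c(1, 0,
              0, 1,
              1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("p1", "p2", "p3"), c("spa", "spb")))

# Tolerance matrix: environmental variables in rows, species in columns
y <- matrix(c(0.2, 0.4,
              0.6, 0.8),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("EnvA", "EnvB"), c("spa", "spb")))

# Each cell is the sum of the tolerances of the species present in a plot
sums <- x %*% t(y)

# Number of species present in each plot
count <- rowSums(x)

# Dividing recycles count down the columns, so row i is divided by count[i]
res <- sums / count
```

For p3, where both species occur, the EnvA value is (0.2 + 0.4)/2 = 0.3.
A plot with no matching species gives 0/0, that is NaN.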
Now, correct the first error, assuming it's the species name in table1
that's wrong.
tb1 <- table1
colnames(tb1)[5] <- "speciesXY"
tb1
fun(tb1, table2)
And correct the second, recording speciesXY (or XX) as 1 in Plot1.
tb1$speciesXY[1] <- 1
fun(tb1, table2)
# output
      EnviA EnviB EnviC
Plot1 0.175  0.28  0.18
Plot2 0.100  0.15  0.18
plot3 0.175  0.28  0.18
plot4   NaN   NaN   NaN
Finally, as you can see, the output of fun() is in a different format from
your table3a and table3b. It can of course be changed, but that's not worth
the trouble if I'm wrong.
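In case it is useful, a rough sketch of what one table per environmental
variable might look like. It assumes the two corrections to table1
discussed in this message (column renamed to speciesXY and recorded as 1 in
Plot1), rebuilds the data by hand, and uses a hypothetical helper per_env()
that is not part of fun(). Species without tolerance data come out as NA
rather than "Nodata":

```r
# table1 with both corrections applied (an assumption, see the text)
tb1 <- data.frame(
  X = c("Plot1", "Plot2", "plot3", "plot4"),
  speciesX  = c(1L, 0L, 1L, 0L),
  speciesY  = c(0L, 1L, 0L, 0L),
  speciesZ  = c(1L, 1L, 0L, 1L),
  speciesXY = c(1L, 0L, 1L, 0L)
)
table2 <- data.frame(
  X = c("SpeciesX", "SpeciesY", "SpeciesXY"),
  EnviA = c(0.21, 0.10, 0.14),
  EnviB = c(0.40, 0.15, 0.16),
  EnviC = c(0.17, 0.18, 0.19)
)

# Hypothetical helper: one row per plot, one column per species, plus average
per_env <- function(occ, tol, env) {
  sp <- names(occ)[-1]
  # tolerance of each species for this variable; NA if the species is not in tol
  val <- tol[[env]][match(tolower(sp), tolower(as.character(tol[[1]])))]
  pres <- as.matrix(occ[, -1, drop = FALSE])
  # put each species' tolerance in its column, then blank out absences
  m <- pres * rep(val, each = nrow(pres))
  m[pres == 0] <- NA
  data.frame(X = occ[[1]], m, average = rowMeans(m, na.rm = TRUE))
}

table3a_like <- per_env(tb1, table2, "EnviA")
```

With the corrected data, table3a_like$average is 0.175, 0.1, 0.175 and NaN,
the same values as the EnviA column returned by fun().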
Give some feedback.
Hope this helps,
Rui Barradas
Kristi Glover wrote
>
> Oops, now I have used the 'dput' function and am sending again. I am so
> sorry for the inconvenience.
>
> Hi R users,
> I am sorry that my data was not in a readable format in the last email. I
> am trying to send it again and hope that this time the tables are
> readable. As I mentioned earlier, I am struggling to figure out how to
> calculate the average from the two tables in R. Can anyone help me? Your
> help would be much appreciated. I am spending so much time trying to
> figure it out. It should not be so hard, I think. I have very big data,
> but I have created hypothetical data for simplification. For example, I
> have: table 1
>
> Table 1: species occurrence data
> > dput(table1)
> structure(list(X = structure(1:4, .Label = c("Plot1", "Plot2", "plot3",
> "plot4"), class = "factor"), speciesX = c(1L, 0L, 1L, 0L), speciesY =
> c(0L, 1L, 0L, 0L), speciesZ = c(1L, 1L, 0L, 1L), speciesXX = c(0L, 0L, 1L,
> 0L)), .Names = c("X", "speciesX", "speciesY", "speciesZ", "speciesXX"),
> class = "data.frame", row.names = c(NA, -4L))
>
> Table 2: species tolerance data
> > dput(table2)
> structure(list(X = structure(c(1L, 3L, 2L), .Label = c("SpeciesX",
> "SpeciesXY", "SpeciesY"), class = "factor"), EnviA = c(0.21, 0.1, 0.14),
> EnviB = c(0.4, 0.15, 0.16), EnviC = c(0.17, 0.18, 0.19)), .Names = c("X",
> "EnviA", "EnviB", "EnviC"), class = "data.frame", row.names = c(NA, -3L))
>
> You may have noticed that table 2 does not have species Z, which was in
> table
>
> Table 3: Now I want to get the average value of species tolerance in each
> plot based on each environmental value (EnviA or EnviB etc.). Here is an
> example of the outcome (the final table I am looking for).
>
> Result table 3a: average species tolerance in each plot based on EnviA,
> such as:
> > dput(table3a)
> structure(list(X = structure(1:4, .Label = c("plot1", "plot2", "plot3",
> "plot4"), class = "factor"), speciesX = c(0.21, NA, NA, 0.21), speciesY =
> c(NA, 0.1, NA, NA), speciesZ = structure(c(1L, 1L, 1L, 1L), .Label =
> "Nodata", class = "factor"), speciesXX = c(0.14, NA, 0.14, NA), average =
> c(0.175, 0.1, 0.14, 0.21)), .Names = c("X", "speciesX", "speciesY",
> "speciesZ", "speciesXX", "average"), class = "data.frame", row.names =
> c(NA, -4L))
>
> Result table 3b: average species tolerance in each plot based on EnviB
> > dput(table3b)
> structure(list(X = structure(1:4, .Label = c("plot1", "plot2", "plot3",
> "plot4"), class = "factor"), speciesX = c(0.4, NA, NA, 0.4), speciesY =
> c(NA, 0.15, NA, NA), speciesZ = structure(c(1L, 1L, 1L, 1L), .Label =
> "Nodata", class = "factor"), speciesXX = c(0.16, NA, 0.16, NA), average =
> c(0.28, 0.15, 0.16, 0.4)), .Names = c("X", "speciesX", "speciesY",
> "speciesZ", "speciesXX", "average"), class = "data.frame", row.names =
> c(NA, -4L))
>
> I hope this time the data is in a readable format. Would anyone help me
> with how I can calculate these?
> Thanks,
> Kristi Glover