[R] Omitting NA's using dcast (reshape2 package)

Michael.Laviolette at dhhs.state.nh.us
Mon Jun 22 15:46:42 CEST 2015


I'm using the "dcast" function from Hadley's "reshape2" package to do some
tabulations. I can't get it to exclude NA's in the variables being
tabulated. Here's a simple example.

v1 <- c(rep("A", 5), rep("B", 5), NA)
v2 <- c("X", "Y", "Y", "Z", "Z", "X", "Y", "Y", "Z", NA, "Z")
v3 <- c(rep("a", 4), "c", "a", "b", NA, "c", "b", "c")
df <- data.frame(v1, v2, v3)
rm(v1, v2, v3)

library(reshape2)
dcast(df, v1 ~ v2, length, margins = TRUE)

#      v1 X Y Z NA (all)
# 1     A 1 2 2  0     5
# 2     B 1 2 1  1     5
# 3  <NA> 0 0 1  0     1
# 4 (all) 2 4 4  1    11
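# Setting the "drop" argument has no effect on the NA row and column.
# na.omit(df) is too aggressive: it drops every record with any missing
# value, including the one where only v3 is NA.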

What I want is this:

#      v1 X Y Z (all)
# 1     A 1 2 2     5
# 2     B 1 2 1     4
# 3 (all) 2 4 3     9
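
A workaround that appears to give this table is to drop only the rows where
v1 or v2 is missing before calling dcast, so that a record with NA only in v3
is still counted. This is just a sketch under that assumption: the "keep"
vector and the explicit value.var = "v3" are my own additions, and it does
not settle whether dcast can exclude the NA's by itself.

```r
library(reshape2)

# Same example data as above
df <- data.frame(
  v1 = c(rep("A", 5), rep("B", 5), NA),
  v2 = c("X", "Y", "Y", "Z", "Z", "X", "Y", "Y", "Z", NA, "Z"),
  v3 = c(rep("a", 4), "c", "a", "b", NA, "c", "b", "c")
)

# TRUE for rows where both tabulated variables are present;
# NA in v3 is tolerated because v3 is only being counted
keep <- complete.cases(df[, c("v1", "v2")])

# Tabulate the filtered rows; value.var is named explicitly so dcast
# does not have to guess the value column
tab <- dcast(df[keep, ], v1 ~ v2, value.var = "v3",
             fun.aggregate = length, margins = TRUE)
tab
```

Here sum(keep) is 9, while nrow(na.omit(df)) is 8, which is why na.omit
undercounts the B/Y cell.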

Does anyone have any ideas?
Thanks,
Mike L.


