[R] Question on Counting Factors

S Ellison S.Ellison at LGCGroup.com
Wed Apr 11 16:11:08 CEST 2012


> I would like to get the frequency
> (counts) of all the variables in a single column (that is 
> easy), but I would also like to return the value 0 for the 
> absence of variables defined in another column.

If you use factor() on your columns and include all the animals in the factor levels, you should get what you want: table() reports every level of a factor, including levels with a count of 0.

For example

animal.names <- sort(c("fish", "dog", "tiger", "cat"))
V1 <- sample(c('cat', 'tiger'), 10, replace=TRUE)
V1 <- factor(V1, levels=animal.names)
table(V1)
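Because sample() gives different counts on each run, here is a small sketch with a fixed vector so the effect of the levels is easy to see. The vector below is made up purely for illustration.

```r
# Fixed input (no sampling) so the counts are reproducible
animal.names <- sort(c("fish", "dog", "tiger", "cat"))
V1 <- c("cat", "tiger", "cat", "cat", "tiger")

# Without explicit levels, absent animals are simply missing from the table
table(V1)

# With all animals as levels, absent ones are reported with a count of 0
counts <- table(factor(V1, levels = animal.names))
counts
```

Here dog and fish appear in the second table with a count of 0, while the first table lists only cat and tiger.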

For your data frame, you can get animal.names from your existing data set directly rather than specifying them in advance. If the columns are all already factors (as they will be if you have used as.data.frame on a character matrix with stringsAsFactors=TRUE, which was the default before R 4.0.0) you can get all the levels using rapply. Re-using factor will again get you what you're after:

animals <- matrix(c('cat','tiger','cat','tiger','fish','fish','dog','dog'), ncol=2, byrow=FALSE)
# stringsAsFactors=TRUE is needed from R 4.0.0 onwards to get factor columns
animals <- as.data.frame(animals, stringsAsFactors=TRUE)

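# unique() guards against an animal that appears in more than one column,
# since factor() rejects duplicated levels
animal.names <- sort(unique(rapply(animals, levels)))
animals2 <- as.data.frame(lapply(animals, factor, levels=animal.names))
table(animals2$V1)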

For extra safety, you might want to wrap the second factor() around as.character(), so that each column is converted to plain strings before it is re-levelled:

animal.names <- sort(unique(rapply(animals, levels)))
animals2 <- as.data.frame(lapply(animals, function(x, l) factor(as.character(x), levels=l), l=animal.names))
table(animals2$V1)
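If the columns might be character rather than factor, levels() returns NULL and the rapply step finds nothing. A possible variant, offered as a sketch rather than a tested recipe for your data, pools the values themselves instead of the levels. The data frame and its contents below are invented for illustration, with cat deliberately appearing in both columns.

```r
# Hypothetical example data: plain character columns, 'cat' in both
animals <- data.frame(V1 = c("cat", "tiger", "cat", "tiger"),
                      V2 = c("fish", "fish", "dog", "cat"),
                      stringsAsFactors = FALSE)

# Pool the values of every column as plain strings, then de-duplicate
animal.names <- sort(unique(unlist(lapply(animals, as.character))))

# Re-level every column against the pooled names
animals2 <- as.data.frame(lapply(animals, function(x)
  factor(as.character(x), levels = animal.names)))

# Counts for every column at once: one row per animal, zeros included
counts <- sapply(animals2, table)
counts
```

This works the same way whether the columns start out as factors or as character vectors, because as.character() is applied before anything else.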

S Ellison