[R] Working with tables with missing levels
Andre Nathan
andre at digirati.com.br
Mon Jul 27 21:21:53 CEST 2009
Hello,
I'm trying to write a function to calculate the relative entropy between
two distributions. The data I have is in table format, for example:
> t1 <- prop.table(table(c(0,0,2,4,4)))
> t2 <- prop.table(table(c(0,2,2,2,3)))
> t1
  0   2   4
0.4 0.2 0.4
> t2
  0   2   3
0.2 0.6 0.2
The relative entropy is given by
H[P||Q] = sum(p * log2(p/q))
with the conventions that 0*log2(0/q) = 0 and, for p > 0, p*log2(p/0) = Inf.
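To make the formula and its two conventions concrete, here is a minimal sketch in R. It assumes the two distributions have already been aligned into plain numeric vectors over the same set of levels, which is not yet the case for t1 and t2. The function name rel_entropy and the vectors p and q are only illustrative.

```r
# Relative entropy H[P||Q] in bits for two probability vectors of equal
# length, aligned so that p[i] and q[i] refer to the same level.
# rel_entropy is a hypothetical name used for illustration.
rel_entropy <- function(p, q) {
  stopifnot(length(p) == length(q))
  keep <- p > 0  # terms with p == 0 contribute 0 by convention, so drop them
  # For p > 0 and q == 0, p/q is Inf and log2(Inf) is Inf, so the sum is Inf
  sum(p[keep] * log2(p[keep] / q[keep]))
}

# Assumed inputs: t1 and t2 written out by hand over levels 0..4
p <- c(0.4, 0.0, 0.2, 0.0, 0.4)
q <- c(0.2, 0.0, 0.6, 0.2, 0.0)
rel_entropy(p, q)  # Inf, because p > 0 at level 4 where q == 0
```

Dropping the entries with p == 0 before taking the logarithm avoids the NaN that 0 * log2(0/q) would otherwise produce in R.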
I'm not sure what the best way to achieve that is. Is there a way to
test whether a table has a value for a given level, so that I can detect
that, for example, t1 is missing levels 1 and 3 and t2 is missing levels
1 and 4? (Is "level" the correct terminology here?) Simply trying to
access t1[["1"]], for example, gives a "subscript out of bounds" error.
Another option would be to "expand" the tables, so that, for example, t1
becomes
  0   1   2   3   4
0.4 0.0 0.2 0.0 0.4
Is there a way to do that?
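One possibility, sketched here only as a starting point, is to check the table's names and to copy its values into a zero-filled named vector. This assumes the full set of levels (0 to 4 here) is known in advance. expand_table and lev are hypothetical names, not existing R functions.

```r
t1 <- prop.table(table(c(0, 0, 2, 4, 4)))
t2 <- prop.table(table(c(0, 2, 2, 2, 3)))

# Test whether a one-dimensional table has an entry for a given level
"1" %in% names(t1)  # FALSE
"2" %in% names(t1)  # TRUE

# Expand a one-dimensional table to a full set of levels, filling the
# missing ones with 0. expand_table is a hypothetical helper.
expand_table <- function(tab, lev) {
  out <- setNames(numeric(length(lev)), lev)  # all zeros, named by level
  out[names(tab)] <- as.vector(tab)           # copy the values that exist
  out
}

lev <- as.character(0:4)  # assumed full set of levels
e1 <- expand_table(t1, lev)
e2 <- expand_table(t2, lev)
```

If the raw data are still available, building the table from a factor with explicit levels, as in table(factor(x, levels = 0:4)), gives the zero counts directly.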
Thanks,
Andre