[R] Working with tables with missing levels
    Andre Nathan 
    andre at digirati.com.br
       
    Mon Jul 27 21:21:53 CEST 2009
    
    
  
Hello,
I'm trying to write a function to calculate the relative entropy between
two distributions. The data I have is in table format, for example:
> t1 <- prop.table(table(c(0,0,2,4,4)))
> t2 <- prop.table(table(c(0,2,2,2,3)))
> t1
  0   2   4 
0.4 0.2 0.4 
> t2
  0   2   3 
0.2 0.6 0.2
The relative entropy is given by
  H[P||Q] = sum(p * log2(p/q))
with the conventions that 0*log2(0/q) = 0 and p*log2(p/0) = Inf.
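
To make the conventions concrete, here is a minimal sketch of the formula. It assumes p and q are numeric vectors of equal length, already aligned on the same levels. The name rel_entropy is made up for illustration.

```r
# Hypothetical helper: relative entropy H[P||Q] in bits.
# Assumes p and q are probability vectors aligned on the same levels.
rel_entropy <- function(p, q) {
  stopifnot(length(p) == length(q))
  nz <- p > 0                      # terms with p == 0 contribute 0 by convention
  sum(p[nz] * log2(p[nz] / q[nz])) # p > 0 with q == 0 gives p * log2(Inf) = Inf
}

# t1 and t2 written out by hand over the levels 0..4
p <- c(0.4, 0.0, 0.2, 0.0, 0.4)
q <- c(0.2, 0.0, 0.6, 0.2, 0.0)
rel_entropy(p, q)  # Inf, because level 4 has p > 0 and q == 0
```

Dropping the p == 0 terms before taking the logarithm avoids the NaN that 0 * log2(0) would otherwise produce.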
I'm not sure what the best way to achieve that is. Is there a way
to test whether a table has a value for a given level, so that I can
detect that, for example, t1 is missing levels 1 and 3 and t2 is missing
levels 1 and 4? (Is "level" the correct terminology here?) Simply trying
to access t1[["1"]], for example, gives a "subscript out of bounds" error.
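
One possible way to test for a level without triggering that error is to look at the names of the table. This is only a sketch: has_level is a hypothetical helper, and the range 0:4 is assumed to be the full set of levels of interest.

```r
t1 <- prop.table(table(c(0, 0, 2, 4, 4)))

# Hypothetical helper: TRUE if the one-dimensional table has an entry for `level`
has_level <- function(tab, level) as.character(level) %in% names(tab)

has_level(t1, 1)  # FALSE
has_level(t1, 2)  # TRUE

# Levels from the assumed range 0:4 that are absent from t1
missing_levels <- setdiff(as.character(0:4), names(t1))
missing_levels    # "1" "3"
```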
Another option would be to "expand" the tables, so that, for example, t1
becomes
  0   1   2   3   4 
0.4 0.0 0.2 0.0 0.4
Is there a way to do that?
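
A sketch of two ways this expansion might be done, assuming the full set of levels (here 0:4) is known in advance. The helper name expand_table is made up for illustration, and it returns a named numeric vector rather than a table.

```r
# Option 1: build the table from a factor that declares every level up front
x1 <- c(0, 0, 2, 4, 4)
t1_full <- prop.table(table(factor(x1, levels = 0:4)))

# Option 2 (hypothetical helper): pad an existing one-dimensional table with zeros
expand_table <- function(tab, levels) {
  out <- setNames(numeric(length(levels)), levels)  # all zeros, named by level
  out[names(tab)] <- as.vector(tab)                 # copy over the known values
  out
}

t1 <- prop.table(table(x1))
t1_padded <- expand_table(t1, 0:4)
```

Option 1 needs the raw data; option 2 works when only the table is available.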
Thanks,
Andre
    
    