[R] how to correlate nominal variables?

Daniel Malter daniel at umd.edu
Mon Jul 27 20:49:05 CEST 2009


Benoit Vaillant made me aware of an indexing mistake in my computation of
Cramer's V: col.sum summed over rows, table(x)[j,], instead of over columns,
table(x)[,j]. This is a correction of the code:

cramers.v=function(x){
    x=as.data.frame(x)
    tab=table(x)               # contingency table of the two columns
    n=sum(tab)                 # number of observations in the table
    row.sum=rowSums(tab)
    col.sum=colSums(tab)
    chisq=0
    for(k in 1:dim(tab)[1]){
      for(l in 1:dim(tab)[2]){
        expected=(row.sum[k]*col.sum[l])/n   # expected count under independence
        chisq=chisq+((tab[k,l]-expected)^2)/expected
      }
    }
    # return V explicitly; a function ending on a for loop returns NULL
    unname(sqrt(chisq/(n*(min(dim(tab))-1))))
  }
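
As a sanity check, the same quantity can be obtained from the chi-squared
statistic that R's built-in chisq.test() reports. The following is only a
minimal sketch and not part of the original code: the function name
cramers.v.check and the test objects "perfect" and "mixed" are made up for
illustration, and it assumes the data has exactly two columns with at least
two levels each.

```r
# Hypothetical helper (not from the thread): Cramer's V via chisq.test().
cramers.v.check=function(x){
    tab=table(as.data.frame(x))
    # correct=FALSE switches off the Yates continuity correction for 2x2 tables;
    # suppressWarnings() hides the small-expected-count warning
    chisq=suppressWarnings(chisq.test(tab,correct=FALSE)$statistic)
    unname(sqrt(chisq/(sum(tab)*(min(dim(tab))-1))))
}

# two identical columns: perfect association, so V should be 1
perfect=cbind(rep(c(0,1),each=50),rep(c(0,1),each=50))
cramers.v.check(perfect)

# independent random columns: V should be somewhere between 0 and 1
set.seed(1)
mixed=cbind(rbinom(100,2,0.4),rbinom(100,1,0.6))
cramers.v.check(mixed)
```

If a hand-written version disagrees with this on the same data, the row and
column sums are the first thing I would check.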


Daniel Malter wrote:
> 
> You can copy the code below to your R-code editor. For Yule's Q, the data
> is expected in two vectors. For Cramer's V (also called Cramer's phi), the
> data is expected in separate columns of a matrix or dataframe.
> 
> ##Run this code
> yule.Q=function(x,y){
>     tab=table(x,y)
>     (tab[1,1]*tab[2,2]-tab[1,2]*tab[2,1])/(tab[1,1]*tab[2,2]+tab[1,2]*tab[2,1])
> }
> 
> ##create test data
> vector.one=rbinom(100,1,0.4)
> vector.two=rbinom(100,1,0.8)
> table(vector.one,vector.two)
> 
> ##compute yule's Q
> yule.Q(vector.one,vector.two)  
> ##just put your two vector names there
> 
> 
> 
> 
> ##Cramer's V
> 
> ##Run this code
> cramers.v=function(x){
>     x=as.data.frame(x)
>     chisq=0
>     row.sum=NULL
>     col.sum=NULL
>     for(i in 1:dim(table(x))[1])
>       row.sum[i]=sum(table(x)[i,])
>     for(j in 1:dim(table(x))[2])
>       col.sum[j]=sum(table(x)[j,])
>     for(k in 1:dim(table(x))[1]){
>       for(l in 1:dim(table(x))[2]){
>          chisq=chisq+((table(x)[k,l]-(row.sum[k]*col.sum[l])/(dim(x)[1]))^2)/((row.sum[k]*col.sum[l])/(dim(x)[1]))
>           cramers.v=sqrt(chisq/(dim(x)[1]*(min(dim(table(x)))-1)))
>       }
>     }
>   }
> 
> ##create test data
> toanalyze=cbind(rbinom(100,2,0.4),rbinom(100,1,0.6))
> toanalyze2=cbind(rep(c(0,1),each=50),rep(c(0,1),each=50))
> 
> ##compute cramer's v for the test data 
> v1=cramers.v(toanalyze) ## just put your dataframe or matrix name
> v2=cramers.v(toanalyze2)
> 
> v1 ##cramer's v
> v2 ##cramer's v
> 
> 
> 
> Timo Stolz wrote:
>> 
>> Dear R-Users,
>> 
>> I need functions to calculate Yule's Y or Cramér's index, in order to
>> correlate variables that are nominally scaled.
>> 
>> Am I wrong? Do such functions exist?
>> 
>> Sincerely,
>> Timo
>> 
>> 
>> 
> 
> 
