[R] crosstabulation and unlist function

eugen pircalabelu eugen_pircalabelu at yahoo.com
Tue Oct 13 09:07:53 CEST 2009


Thank you! This is what I needed!
I did not realize I could replicate my factor and then tabulate, but it makes sense!
Thank you once again!

----- Original Message ----
From: William Dunlap <wdunlap at tibco.com>
To: eugen pircalabelu <eugen_pircalabelu at yahoo.com>; r-help at r-project.org
Sent: Mon, October 12, 2009 10:48:36 PM
Subject: RE: [R] crosstabulation and unlist function

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of eugen pircalabelu
> Sent: Monday, October 12, 2009 1:06 PM
> To: David Winsemius
> Cc: R-help
> Subject: Re: [R] crosstabulation and unlist function
> 
> Hello,
> First of all, thank you David for your reply, but sadly this 
> is not what I wanted (I am sorry for not being more specific 
> about my problem!)
>    
>  aa<-c(1:5)
>  bb<-c(NA,2,NA,4,5)
>  cc<-c(1,2,NA,4,NA)
>  dd<-c("A","B","B","A","C")

You forgot to say how you made 'df', which I assume is
   df <- data.frame(aa,bb,cc,dd)
Having a self-contained way to reproduce your problem
makes it much easier to solve!

>  table(unlist(df[,1:3]))
> 
> > df
>   aa bb cc dd
> 1  1 NA  1  A
> 2  2  2  2  B
> 3  3 NA NA  B
> 4  4  4  4  A
> 5  5  5 NA  C
> 
> I do not want to get this:
> > tapply(apply(df[,1:3],1,sum, na.rm=TRUE), df$dd, sum)
> A  B  C
> 14  9 10
> 
> but a crosstabulation between  table(unlist(df[,1:3])) and 
> df$dd, which should look something like this:
> 
>     1   2   3   4  5
> A  2   0   0   3  0
> B  0   3   1   0  0
> C  0   0   0   0  2

Try

> with(df, table(rep(dd,3), c(aa,bb,cc)))

    1 2 3 4 5
  A 2 0 0 3 0
  B 0 3 1 0 0
  C 0 0 0 0 2
or
> table(rep(df$dd, 3), unlist(df[,1:3]))

    1 2 3 4 5
  A 2 0 0 3 0
  B 0 3 1 0 0
  C 0 0 0 0 2

You need the rep() to show how the 5 elements of dd
should correspond to the 15 elements of aa, bb, and cc.
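
To see why the rep() is needed, it may help to check the alignment
directly. The following is only a sketch, assuming df was built as
data.frame(aa,bb,cc,dd) from the vectors above; the names 'values',
'groups' and 'tab' are made up for illustration.

```r
# Rebuild the example data (assumed construction of 'df').
aa <- c(1:5)
bb <- c(NA, 2, NA, 4, 5)
cc <- c(1, 2, NA, 4, NA)
dd <- c("A", "B", "B", "A", "C")
df <- data.frame(aa, bb, cc, dd)

# unlist() stacks the columns one after another: all of aa, then bb, then cc.
values <- unlist(df[, 1:3], use.names = FALSE)

# rep(dd, 3) repeats the whole vector three times, so element i of 'groups'
# is the dd value of the row that element i of 'values' came from.
groups <- rep(df$dd, 3)

# Both vectors must have the same length (15) for table() to pair them.
stopifnot(length(values) == length(groups))

# table() drops NA by default, so the missing values are not counted.
tab <- table(groups, values)
print(tab)
```

Note that rep(dd, each = 3) would give the wrong pairing here, since it
repeats each label three times in a row instead of cycling through the
rows once per column.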

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> 
> meaning that when dd is A, 1 appears 2 times, 2 doesn't 
> appear, 3 doesn't appear, 4 appears 3 times, and 5 doesn't appear; 
> when dd is C, only 5 appears, 2 times (I am not really 
> interested in the NA occurrences).
> Hopefully, this time my question is a lot clearer.
> Thank you very much!
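
The reading described above can be checked one group at a time. This is
a minimal sketch, assuming the same df as in the toy example;
'count_for' is a hypothetical helper written for illustration, not an
existing R function.

```r
# Rebuild the example data (assumed construction of 'df').
aa <- c(1:5)
bb <- c(NA, 2, NA, 4, 5)
cc <- c(1, 2, NA, 4, NA)
dd <- c("A", "B", "B", "A", "C")
df <- data.frame(aa, bb, cc, dd)

# Hypothetical helper: counts of each value found in columns 1:3
# for the rows where dd equals 'level'. table() ignores NA by default.
count_for <- function(level) {
  table(unlist(df[df$dd == level, 1:3]))
}

count_for("A")   # counts for the rows where dd is A
count_for("C")   # counts for the rows where dd is C
```

Unlike the full cross-tabulation, this per-group table lists only the
values that actually occur in that group, so the zero cells are not shown.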
>
> ----- Original Message ----
> From: David Winsemius <dwinsemius at comcast.net>
> To: David Winsemius <dwinsemius at comcast.net>
> Cc: eugen pircalabelu <eugen_pircalabelu at yahoo.com>; R-help 
> <r-help at stat.math.ethz.ch>
> Sent: Mon, October 12, 2009 9:36:39 PM
> Subject: Re: [R] crosstabulation and unlist function
> 
> 
> On Oct 12, 2009, at 3:25 PM, David Winsemius wrote:
> 
> > 
> > On Oct 12, 2009, at 2:36 PM, eugen pircalabelu wrote:
> > 
> >> Hello R-users,
> >> 
> >> My toy example:
> >> aa<-c(1:5)
> >> bb<-c(NA,2,NA,4,5)
> >> cc<-c(1,2,NA,4,NA)
> >> dd<-c("A","B","B","A","C")
> >> df<-data.frame(aa,bb,cc,dd=as.factor(dd))
> >> table(unlist(df[,1:3]))
> >> 
> >> Can anyone point me to a function that lets me do a
> >> crosstabulation between table(unlist(df[,1:3])) and df$dd?
> >> I want to find out, when dd==A (or B, or C), how many times
> >> the values 1, 2, 3, ... appear in df[,1:3].
> >> Thank you very much!
> > 
> > One way would be to collect the row sums of those columns
> > first, and then sum by index:
> > 
> > tapply(apply(df[,1:3],1,sum, na.rm=TRUE), df$dd, sum)
> > A  B  C
> > 14  9 10
> 
> This method is safer than working on table(unlist(df[, 1:3])) 
> since it does not "break" when an entire row is empty.
> 
> > aa<-c(1,2,NA,4,5)
> > bb<-c(NA,2,NA,4,5)
> > cc<-c(1,2,NA,4,NA)
> > dd<-c("A","B","B","A","C")
> > df<-data.frame(aa,bb,cc,dd=as.factor(dd))
> > table(unlist(df[,1:3]))
> 
> 1 2 4 5
> 2 3 3 2     # missing row will no longer be aligned with "dd".
> > tapply(table(unlist(df[,1:3])), df$dd, sum)
> Error in tapply(table(unlist(df[, 1:3])), df$dd, sum) :
>   arguments must have same length
> 
> > tapply(apply(df[,1:3],1,sum, na.rm=TRUE), df$dd, sum)
> A  B  C
> 14  6 10
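
The same all-NA-row case can also be run through a cross-tabulation
that pairs rep(dd, 3) with the unlisted columns. This is a minimal
sketch, assuming the modified vectors above (aa now has NA in row 3);
'tab' is an illustrative name. Since table() drops NA by default, the
empty row should simply contribute nothing.

```r
# Modified example data: row 3 is NA in all three numeric columns.
aa <- c(1, 2, NA, 4, 5)
bb <- c(NA, 2, NA, 4, 5)
cc <- c(1, 2, NA, 4, NA)
dd <- c("A", "B", "B", "A", "C")
df <- data.frame(aa, bb, cc, dd = as.factor(dd))

# Pair every value in columns 1:3 with the dd label of its row,
# then cross-tabulate label against value.
tab <- table(rep(df$dd, 3), unlist(df[, 1:3]))
print(tab)
```

The value 3 no longer occurs, so the table has no "3" column, but each
level of dd still gets its own row.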
> 
> 
> > 
> > --
> > David Winsemius, MD
> > Heritage Laboratories
> > West Hartford, CT
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
> 