[R] How to build combinations of columns of a data frame

baptiste auguie ba208 at exeter.ac.uk
Thu Apr 16 18:41:54 CEST 2009


Perhaps,

apply(combn(letters[1:4],2), 2, paste,collapse="")
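
If it is useful, the same combn() call can drive both the products and the names, so the pairs, triples and the quadruple all come from one function. What follows is only a rough sketch, assuming the DF from the original post; combo_products() is a name made up for illustration, not something from a package.

```r
# Example data from the original question
Lines <- "a    b    c    d
  13     0    15   16
  23    24    25    0
  33    34     0   36
   0    44    45   46
  53    54     0   55"
DF <- read.table(textConnection(Lines), header = TRUE)

# combo_products() is a hypothetical helper (not from any package):
# it multiplies every combination of k columns of df and names each
# resulting column after the source columns it was built from.
combo_products <- function(df, k) {
  idx <- combn(colnames(df), k)          # k x choose(ncol, k) matrix of names
  out <- lapply(seq_len(ncol(idx)),
                function(j) Reduce(`*`, df[idx[, j]]))  # elementwise product
  names(out) <- apply(idx, 2, paste, collapse = "")     # e.g. "ab", "abc"
  as.data.frame(out)
}

DF2 <- combo_products(DF, 2)   # columns ab, ac, ad, bc, bd, cd
DF3 <- combo_products(DF, 3)   # columns abc, abd, acd, bcd
DF4 <- combo_products(DF, 4)   # single column abcd
```

Reduce() is used so that the function does not need to know k in advance; for k = 2 it gives the same values as DF[,x[1]]*DF[,x[2]].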

Hope this helps,

baptiste

On 16 Apr 2009, at 17:33, Juergen Rose wrote:

> On Thursday, 16.04.2009 at 10:59 -0400, David Winsemius wrote:
>
> Thanks David,
>
> is there also a shorter way to get the column names of the new data
> frames?
>
> Juergen
>
>> On Apr 16, 2009, at 10:14 AM, Juergen Rose wrote:
>>
>>> Hi,
>>>
>>> As an R newcomer, I would like to create some new data frames from a
>>> given data frame. The first new data frame should contain all pairs
>>> of the columns of the original data frame. The second should contain
>>> all triples of the columns of the original data frame, and the last
>>> the quadruple of columns. The values in the new data frames should
>>> be the products of two, three or four of the original field values.
>>> For pairs and triples I managed this with the following R script:
>>>
>>> Lines <- "a    b    c    d
>>>   13     0    15   16
>>>   23    24    25    0
>>>   33    34     0   36
>>>    0    44    45   46
>>>   53    54     0   55"
>>>
>>> DF <- read.table(textConnection(Lines), header = TRUE)
>>>
>>> nrow <-length(rownames(DF))
>>> cnames <- colnames(DF)
>>> nc <-length(DF)
>>>
>>> nc.pairs <- nc*(nc-1)/2
>>> #  initialize vector
>>> cnames.new <- c(rep("",nc.pairs))
>>> ind <- 1
>>> print(sprintf("nc=%d",nc))
>>> for (i in 1:(nc-1)) {
>>> if (i+1 <= nc ) {
>>>   for (j in (i+1):nc) {
>>>     cnames.new[ind] <- paste(cnames[i],cnames[j],sep="")
>>>     ind <- ind+1
>>>   }
>>> }
>>> }
>>>
>>> ind <- 1
>>> #  initialize data.frame
>>> pairs <- data.frame(matrix(c(rep(0,nc.pairs*nrow)),ncol=nc.pairs))
>>> for (i in 1:nc) {
>>> if (i+1 <= nc ) {
>>>   for (j in (i+1):nc) {
>>>     t <- DF[,i] * DF[,j]
>>>     pairs[[ind]] <- t
>>>     ind <- ind+1
>>>   }
>>> }
>>> }
>>> colnames(pairs) <- cnames.new
>>> print("pairs=");   print(pairs)
>>
>> apply(combn(colnames(DF),2), 2, function(x) DF[,x[1]]*DF[,x[2]] )
>>      [,1] [,2] [,3] [,4] [,5] [,6]
>> [1,]    0  195  208    0    0  240
>> [2,]  552  575    0  600    0    0
>> [3,] 1122    0 1188    0 1224    0
>> [4,]    0    0    0 1980 2024 2070
>> [5,] 2862    0 2915    0 2970    0
>>>
>>>
>>> nc.tripels <- nc*(nc-1)*(nc-2)/6
>>> #  initialize vector
>>> cnames.new <- c(rep("",nc.tripels))
>>> ind <- 1
>>> print(sprintf("nc=%d",nc))
>>> for (i in 1:nc) {
>>> if (i+1 <= nc ) {
>>>   for (j in (i+1):nc) {
>>>     if (j+1 <= nc ) {
>>>       for (k in (j+1):nc) {
>>>         cnames.new[ind] <- paste(cnames[i],cnames[j],cnames[k],sep="")
>>>         ind <- ind+1
>>>       }
>>>     }
>>>   }
>>> }
>>> }
>>>
>>> ind <- 1
>>> #  initialize data.frame
>>> tripels <- data.frame(matrix(c(rep(0,nc.tripels*nrow)),ncol=nc.tripels))
>>> for (i in 1:(nc-1)) {
>>> if (i+1 <= nc ) {
>>>   for (j in (i+1):nc) {
>>>     if (j+1 <= nc ) {
>>>       for (k in (j+1):nc) {
>>>         t <- DF[,i] * DF[,j] * DF[,k]
>>>         tripels[[ind]] <- t
>>>         ind <- ind+1
>>>       }
>>>     }
>>>   }
>>> }
>>> }
>>> colnames(tripels) <-  cnames.new
>>> print("tripels=");   print(tripels)
>>
>> apply(combn(colnames(DF),3), 2, function(x) DF[,x[1]]*DF[,x[2]]*DF[,x[3]])
>>       [,1]   [,2] [,3]  [,4]
>> [1,]     0      0 3120     0
>> [2,] 13800      0    0     0
>> [3,]     0  40392    0     0
>> [4,]     0      0    0 91080
>> [5,]     0 157410    0     0
>>
>>>
>>>
>>> I suppose that there is a much shorter way to get the same results.
>>> Any hint is very much appreciated.
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>

_____________________________

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag



