[R] How to read-in a transaction-table with single items per line via RODBC?

Elena Schulz elena.schulz at gmx.net
Wed Sep 3 23:02:09 CEST 2008


Hi,

I managed to read in a transaction table tr_dat, with a single item per
line, via RODBC. It has the form:

Transact-ID     ItemID
1        item1
1        item2
1        item3
2        item2
2        item3
...

How do I create a transactions object for the arules package from such a
table?

I tried read.transactions(tr_dat, format="single") from the arules
package, but it failed with the following error:
Error: invalid class "ngCMatrix" object: slot i is not *strictly*
increasing inside a column.

After reading the R source of read.transactions I wrote the function
below, transactionsFromSingleItems, which seems to work. The problem
seemed to be that the elements of the list tr_basket are not atomic.
That's why, newbie that I am, I used the following ugly construct:
tr_basket = as.list(sapply(tr_basket, as.character)).

But I think there must be an easier and more elegant way. Can anybody
provide a better version? Maybe I missed an arules method for this?
How can a list with a single item be made atomic efficiently?

Thanks a lot,
-- Elena

transactionsFromSingleItems <-
function(data, cols = NULL, rm.duplicates = FALSE, sep = NULL)
{
     ## have lines with TransactionIDs and ItemIDs in the
     ## columns specified by 'cols'.
     if (!(is(cols, "numeric") && (length(cols) == 2)))
         stop("'cols' must be a numeric vector of length 2 for 'single'.")
     cols <- as(cols, "integer")

     ## groups the cols[2] regarding cols[1]
     tr_basket = split(data[cols[2]], data[cols[1]])
     if (rm.duplicates)
         tr_basket <- .rm.duplicates(tr_basket)
     ## make the elements of tr_basket atomic by coercing them to character
     tr_basket = as.list(sapply(tr_basket, as.character))
     ## creates transactions from entries
     as(tr_basket, "transactions")
}
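
One possible simplification, offered as a sketch rather than a definitive answer: the list elements are probably non-atomic because single-bracket indexing (data[cols[2]]) returns a one-column data frame, so split() produces a list of data frames. Extracting the columns with double brackets gives plain vectors instead. The toy table and its column names (TransactID, ItemID) are assumptions standing in for the RODBC result, and the final coercion to "transactions" is left commented out because it needs arules to be loaded.

```r
## Toy stand-in for the table returned by RODBC (column names are assumed).
tr_dat <- data.frame(TransactID = c(1, 1, 1, 2, 2),
                     ItemID = c("item1", "item2", "item3", "item2", "item3"))

## Single brackets keep the data frame wrapper: each basket is a data frame.
baskets_df <- split(tr_dat["ItemID"], tr_dat["TransactID"])

## Double brackets extract the column itself: each basket is a character vector.
baskets <- split(as.character(tr_dat[["ItemID"]]), tr_dat[["TransactID"]])

## Drop repeated items within a basket; duplicates are a plausible cause of
## the "not strictly increasing" error (an assumption, not verified here).
baskets <- lapply(baskets, unique)

## With arules loaded, the list can then be coerced:
## library(arules)
## trans <- as(baskets, "transactions")
```

Since read.transactions is meant for reading from a file or connection, coercing a list built with split() avoids writing the table back to disk first.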


