[R] How to read-in a transaction-table with single items per line via RODBC?
Elena Schulz
elena.schulz at gmx.net
Wed Sep 3 23:02:09 CEST 2008
Hi,
I managed to read in a transaction table tr_dat, with a single item per
line, via RODBC. It has the form:
Transact-ID ItemID
1 item1
1 item2
1 item3
2 item2
2 item3
...
How do I create a transactions object of the arules package from such a
table?
I tried read.transactions(tr_dat, format="single") from the arules
package, but it failed with the following error:
Error: invalid class "ngCMatrix" object: slot i is not *strictly*
increasing inside a column.
After reading the R source of read.transactions I wrote the function
transactionsFromSingleItems below, which seems to work. The problem
seemed to be that the elements of the list tr_basket are not atomic.
That's why, newbie that I am, I used the following ugly construct:
tr_basket = as.list(sapply(tr_basket, as.character))
But I think there must be an easier and more elegant way. Can anybody
provide a better version? Maybe I missed a method of arules for this?
How can I efficiently make the elements of such a list atomic?
Thanks a lot,
-- Elena
transactionsFromSingleItems <-
function(data, cols = NULL, rm.duplicates = FALSE, sep = NULL)
{
    ## 'data' has lines with TransactionIDs and ItemIDs in the
    ## columns specified by 'cols'.
    if (!(is(cols, "numeric") && (length(cols) == 2)))
        stop("'cols' must be a numeric vector of length 2 for 'single'.")
    cols <- as(cols, "integer")

    ## groups the items in cols[2] by the transaction IDs in cols[1]
    tr_basket <- split(data[cols[2]], data[cols[1]])
    if (rm.duplicates)
        tr_basket <- .rm.duplicates(tr_basket)

    ## make the elements of tr_basket atomic by coercing each to character
    tr_basket <- as.list(sapply(tr_basket, as.character))

    ## creates transactions from entries
    as(tr_basket, "transactions")
}
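
A possible simplification, offered only as a sketch: single-bracket
indexing such as data[cols[2]] returns a one-column data frame, so split()
yields a list of data frames rather than atomic vectors. Extracting the
columns with [[ ]] instead gives plain vectors, and the sapply() step is
then no longer needed. The toy data frame and its column names (TransactID,
ItemID) are assumptions standing in for the table read via RODBC, and
unique() is used here as a stand-in for the duplicate removal step.

```r
# Toy stand-in for the table read via RODBC (column names are assumed)
tr_dat <- data.frame(
    TransactID = c(1, 1, 1, 2, 2),
    ItemID     = c("item1", "item2", "item3", "item2", "item3"),
    stringsAsFactors = FALSE
)

# [[ ]] extracts each column as a plain vector, so split() returns a
# list of character vectors, one per transaction ID
tr_basket <- split(as.character(tr_dat[["ItemID"]]), tr_dat[["TransactID"]])

# Drop repeated items within each transaction
tr_basket <- lapply(tr_basket, unique)

# Convert to a transactions object only if arules is installed
if (requireNamespace("arules", quietly = TRUE)) {
    library(arules)
    trans <- as(tr_basket, "transactions")
}
```

If the item column arrives as a factor, the as.character() call converts
it to its labels before splitting.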