[R] arules: rules are built on item ordering in the dataframe, rather than
Matt
mvgnyc at gmail.com
Sat Feb 28 15:05:07 CET 2009
Hi,
I'm trying out the package arules and I'm having a bit of trouble
getting my data to work properly. I have a set of transactions with
the purchased products but each product could appear in a different
column in the data frame. This causes the rules to be built based on
the ordering, which is not significant.
Here is an example:
## Code:
my.df <- data.frame(
  transaction=as.factor(1:4),
  item1=c("a", "b", "c", "d"),
  item2=c("e", "a", "f", "b"),
  item3=c("h", "i", "b", "a"),
  stringsAsFactors=TRUE)  # keep the item columns as factors on R >= 4.0
# Create transactions
library(arules)
my.trans <- as(my.df[,2:4], "transactions")
# Create Rules
rules <- apriori(my.trans, parameter=list(support=.01, confidence=0.6))
inspect(rules)
## End code
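A quick way to see what is going on, sketched under the assumption that the data frame coercion in arules turns every column/value pair into its own item (labels of the form item1=a), so the same product in two different columns is treated as two unrelated items. The names products and col.value are made up for this illustration, and the arules part only runs if the package is installed.

```r
my.df <- data.frame(
  transaction = as.factor(1:4),
  item1 = c("a", "b", "c", "d"),
  item2 = c("e", "a", "f", "b"),
  item3 = c("h", "i", "b", "a"),
  stringsAsFactors = TRUE)

# Distinct products, ignoring which column they sit in.
products <- unique(unlist(lapply(my.df[2:4], as.character), use.names = FALSE))
length(products)

# Distinct column=value pairs (hypothetical helper that mimics the
# assumed labelling; it is not part of arules).
col.value <- unlist(lapply(names(my.df)[2:4], function(col) {
  paste(col, as.character(my.df[[col]]), sep = "=")
}))
length(unique(col.value))

# Compare with the item labels arules actually creates.
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  my.trans <- as(my.df[, 2:4], "transactions")
  print(itemLabels(my.trans))
}
```

For this data there are 8 distinct products but 12 column=value pairs, which is why a rule involving "a" in item1 says nothing about "a" in item2 or item3.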
I'd like the confidence to be high for a -> b or b -> a (they appear
together in transactions 2 and 4) regardless of *where* they appear.
This example gives the expected results:
## Working example:
my.df2 <- data.frame(
  transaction=as.factor(1:4),
  a = rep("a", 4),
  b = rep("b", 4),
  c = c(NA, "c", NA, NA),
  d = c(NA, NA, "d", "d"),
  stringsAsFactors=TRUE)  # keep the item columns as factors on R >= 4.0
my.trans2 <- as(my.df2[,2:5], "transactions")
rules2 <- apriori(my.trans2, parameter=list(support=.01, confidence=0.6))
inspect(rules2)
## End code
I can't figure out how to coerce my data frame into this format (or if
this is the best way to accomplish my objective).
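One possible approach, sketched under the assumption that each row is one transaction and that the column an item sits in carries no meaning: collect each row's items into a character vector and coerce the resulting list to "transactions" (arules documents a coercion from a list of item vectors). The names item.cols, item.list, my.trans3 and rules3 are hypothetical and used only for this illustration.

```r
# Same data as in the first example; item columns kept as plain strings.
my.df <- data.frame(
  transaction = as.factor(1:4),
  item1 = c("a", "b", "c", "d"),
  item2 = c("e", "a", "f", "b"),
  item3 = c("h", "i", "b", "a"),
  stringsAsFactors = FALSE)

item.cols <- c("item1", "item2", "item3")

# One character vector of items per row. NAs and duplicates are dropped,
# so the position of an item within the row no longer matters.
item.list <- lapply(seq_len(nrow(my.df)), function(i) {
  items <- unlist(my.df[i, item.cols], use.names = FALSE)
  unique(items[!is.na(items)])
})
names(item.list) <- as.character(my.df$transaction)

# Coerce the list to transactions and mine rules (needs arules installed).
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  my.trans3 <- as(item.list, "transactions")
  rules3 <- apriori(my.trans3, parameter = list(support = .01, confidence = 0.6))
  inspect(rules3)
}
```

With this layout the items are plain product names such as "a" and "b", so a rule between them is counted over every transaction that contains both, whichever columns they came from.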
I appreciate your help.
Thanks,
Matt