[R] Help deciding on data format for sales data (newbie)
jonathanbriggs
jonathanbriggs at mac.com
Tue Jun 2 17:46:38 CEST 2009
Dear All
I am just beginning with data mining and need some help working out the best
way to represent my data. I have searched here and online and have not found
any real help. Imagine that I have a file of order (sales) data:
OrderNo  CustomerNo  ItemsInOrder
1        1           a,b,c
2        1           d
3        2           a,d
I can represent this as a data.frame, but then I need to parse ItemsInOrder,
which seems quite clumsy. Alternatively, I can try this sort of representation:
OrderNo  CustomerNo  a   b   c   d
1        1           1   1   1   NA
2        1           NA  NA  NA  1
3        2           1   NA  NA  1
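To make the two layouts concrete, here is a minimal base R sketch of getting from the first to the second. It assumes ItemsInOrder is read in as a comma-separated character string; the object names (orders, items, long, wide) are made up for illustration. Note that table() fills the empty cells with 0 rather than NA.

```r
# Toy data copied from the example above; ItemsInOrder is assumed to be
# a comma-separated character string
orders <- data.frame(
  OrderNo      = c(1, 2, 3),
  CustomerNo   = c(1, 1, 2),
  ItemsInOrder = c("a,b,c", "d", "a,d"),
  stringsAsFactors = FALSE
)

# Split each ItemsInOrder string into a character vector of items
items <- strsplit(orders$ItemsInOrder, ",", fixed = TRUE)

# "Long" layout: one row per (order, item) pair
long <- data.frame(
  OrderNo    = rep(orders$OrderNo, lengths(items)),
  CustomerNo = rep(orders$CustomerNo, lengths(items)),
  Item       = unlist(items),
  stringsAsFactors = FALSE
)

# "Wide" layout: orders in rows, items in columns, counts as cell values
wide <- table(long$OrderNo, long$Item)
print(wide)
```

The long layout is probably the easier one to carry around, since the wide table can be rebuilt from it whenever it is needed.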
Are these really the only two choices, and how well does the second
representation scale? (I have 50,000 SKUs.)
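On the scaling question, one possible answer is to store the wide layout as a sparse matrix, so that only the cells that are actually filled take up memory rather than every order by SKU combination. The following is only a sketch under assumptions: it uses sparseMatrix() from the Matrix package (a recommended package shipped with standard R installations), and the names skus and m are hypothetical.

```r
library(Matrix)  # recommended package, normally installed alongside R

# Same toy data as above; ItemsInOrder is assumed to be comma-separated
orders <- data.frame(
  OrderNo      = c(1, 2, 3),
  ItemsInOrder = c("a,b,c", "d", "a,d"),
  stringsAsFactors = FALSE
)

items <- strsplit(orders$ItemsInOrder, ",", fixed = TRUE)
skus  <- sort(unique(unlist(items)))  # column labels, one per distinct item

# Row index = position of the order, column index = position of the item in skus
m <- sparseMatrix(
  i        = rep(seq_along(items), lengths(items)),
  j        = match(unlist(items), skus),
  x        = 1,
  dims     = c(nrow(orders), length(skus)),
  dimnames = list(as.character(orders$OrderNo), skus)
)
print(m)
```

Only the six non-empty cells are stored here. With 50,000 SKUs and a handful of items per order, that should be far smaller than a dense table of the same shape.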
Can anyone point me to some worked examples that take such data and
manipulate it, looking for association rules and clusters?
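For association rules, the kind of thing I have in mind might look like the sketch below. It assumes the arules package from CRAN is installed; the support and confidence thresholds are arbitrary values chosen only so that the three-order toy data produces some rules, and baskets, trans and rules are made-up names.

```r
library(arules)  # assumed to be installed from CRAN

# Same toy data as above; ItemsInOrder is assumed to be comma-separated
orders <- data.frame(
  OrderNo      = c(1, 2, 3),
  ItemsInOrder = c("a,b,c", "d", "a,d"),
  stringsAsFactors = FALSE
)

# One character vector of items per order, named by order number
baskets <- strsplit(orders$ItemsInOrder, ",", fixed = TRUE)
names(baskets) <- orders$OrderNo

# Coerce the list of baskets to arules' sparse "transactions" class
trans <- as(baskets, "transactions")

# Mine rules with at least two items; thresholds are illustrative only
rules <- apriori(
  trans,
  parameter = list(support = 0.3, confidence = 0.5, minlen = 2)
)
inspect(rules)
```

The transactions class stores the order by item matrix sparsely, so it may also sidestep the question of which data.frame layout to use.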
Thanks
Jonathan