[R] Data transformation & cleaning
anopheles123 at gmail.com
Wed Sep 28 09:36:55 CEST 2011
Your question looks like a frequent-itemset (association rule) mining problem.
Have a look at the arules package.
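
A minimal sketch of what that could look like, assuming arules is installed and your long-format data frame is called `data` with columns `ID` and `Page` as in your post. The toy rows and the `supp`/`conf` thresholds below are made up for illustration, not recommendations:

```r
# Assumes the arules package is available: install.packages("arules")
library(arules)

# Toy long-format data in the shape described in the question (made-up values)
data <- data.frame(
  ID   = c(123, 123, 456, 456, 789, 789),
  Page = c("Choice A", "Choice B", "Choice A", "Choice B", "Choice A", "Choice D"),
  stringsAsFactors = FALSE
)

# One transaction per ID: the set of distinct pages that ID chose
baskets <- lapply(split(data$Page, data$ID), unique)
trans   <- as(baskets, "transactions")

# Mine rules with at least two items; thresholds are illustrative only
rules <- apriori(trans,
                 parameter = list(supp = 0.3, conf = 0.5, minlen = 2),
                 control   = list(verbose = FALSE))

# Show the rules, strongest lift first
inspect(sort(rules, by = "lift"))
```

The `supp` parameter is where a minimum sample threshold would go: item sets chosen by fewer than that share of IDs are dropped before any rules are formed.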
On Tue, Sep 27, 2011 at 11:13 PM, pip56789 <pde3p at virginia.edu> wrote:
> I have a few methodological and implementation questions for y'all. Thank
> you in advance for your help. I have a dataset that reflects people's
> preference choices. I want to see if there's any kind of clustering effect
> among certain preference choices (e.g. do people who pick choice A also pick
> choice D).
> I have a data set that has one record per user ID, per preference choice.
> It's a "long" form of a data set that looks like this:
> ID | Page
> 123 | Choice A
> 123 | Choice B
> 456 | Choice A
> 456 | Choice B
> I thought that I should do the following
> 1. Make the data set "wide", counting the observations so the data looks
> like this:
> ID | Count of Preference A | Count of Preference B
> 123 | 1 | 1
> table1 <- dcast(data, ID ~ Page, fun.aggregate = length, value_var = 'Page')
> 2. Create a correlation matrix of preferences
> How would I restrict my correlation to show preferences that met a minimum
> sample threshold? Can you confirm whether the following two commands do the
> same thing? What would I do from here (or am I taking the wrong approach)?
> table1 <- dcast(data, Page ~ Page, fun.aggregate = length, value_var = 'Page')
> table2 <- with(data, table(Page, Page))
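
If you would rather stay with the correlation idea, here is a rough base R sketch. It assumes the same `data` layout (columns `ID` and `Page`); the toy rows and the `min_n` cutoff are invented for illustration. Note that `table(Page, Page)` cross-tabulates a column with itself, so only the diagonal can be non-zero and it says nothing about which choices go together. For co-occurrence you want an ID x Page matrix first:

```r
# Toy long-format data (made-up values); ID 1 picks "Choice A" twice on purpose
data <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 5, 6),
  Page = c("Choice A", "Choice A", "Choice B", "Choice A", "Choice B",
           "Choice A", "Choice D", "Choice B", "Choice C",
           "Choice A", "Choice B", "Choice D", "Choice C"),
  stringsAsFactors = FALSE
)

# ID x Page counts, reduced to 0/1 (chosen or not)
m <- unclass(table(data$ID, data$Page))
m[m > 0] <- 1

# Keep only choices picked by at least min_n distinct IDs (illustrative cutoff)
min_n <- 3
keep  <- colSums(m) >= min_n
m     <- m[, keep, drop = FALSE]

# Page x Page co-occurrence: number of IDs that picked both choices
cooc <- crossprod(m)

# Correlation between the 0/1 columns (the phi coefficient for binary data)
phi <- cor(m)

# For comparison: a column tabulated against itself is purely diagonal
self <- with(data, table(Page, Page))

print(cooc)
print(round(phi, 2))
```

Filtering the columns before calling `cor()` is one way to enforce the minimum sample threshold. A choice that every remaining ID picked (or none did) has zero variance and will give `NA` correlations, so it may need to be dropped as well.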
> many thanks,
> View this message in context: http://r.789695.n4.nabble.com/Data-transformation-cleaning-tp3849889p3849889.html