[R] Help needed in data cleaning
Web Web
webweb8537 at gmail.com
Fri Dec 18 20:14:46 CET 2015
Hello,
I need some help with data cleaning in R. My CSV file looks as
follows.
"id","gender","age","category1","category2","category3","category4","category5","category6","category7","category8","category9","category10"
1,"Male",22,"movies","music","travel","cloths","grocery",,,,,
2,"Male",28,"travel","books","movies",,,,,,,
3,"Female",27,"rent","fuel","grocery","cloths",,,,,,
4,"Female",22,"rent","grocery","travel","movies","cloths",,,,,
5,"Female",22,"rent","online-shopping","utiliy",,,,,,,
I need to reformat it as follows.
id  gender  age  category         rank
1   Male    22   movies           1
1   Male    22   music            2
1   Male    22   travel           3
1   Male    22   cloths           4
1   Male    22   grocery          5
1   Male    22   books            NA
1   Male    22   rent             NA
1   Male    22   fuel             NA
1   Male    22   utility          NA
1   Male    22   online-shopping  NA
...................................
5   Female  22   movies           NA
5   Female  22   music            NA
5   Female  22   travel           NA
5   Female  22   cloths           NA
5   Female  22   grocery          NA
5   Female  22   books            NA
5   Female  22   rent             1
5   Female  22   fuel             NA
5   Female  22   utility          NA
5   Female  22   online-shopping  2
So far, my efforts are as follows.
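library(reshape2)  # provides melt()
library(sqldf)     # provides sqldf()
mini <- read.csv("~/MS/coding/mini.csv", header=FALSE)
mini_clean <- mini[-1,]  # drop the header row, which was read in as data
df_mini <- melt(mini_clean, id.vars=c("V1","V2","V3"))
sqldf('select * from df_mini order by "V1"')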
Now I want to know the best way to fill in all missing categories for
all users.
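To make the goal concrete, here is a minimal sketch in base R of one possible approach, not necessarily the best one. It assumes the rank of a category is simply the number of the categoryN column it appears in, and that the full set of categories is whatever occurs anywhere in the data. The data is inlined as posted (including the "utiliy" spelling) so the sketch runs without the file; the names csv_text, wide, long, all_pairs and filled are illustrative only.

```r
# Sample data copied from the CSV above, inlined so no file is needed.
csv_text <- '"id","gender","age","category1","category2","category3","category4","category5","category6","category7","category8","category9","category10"
1,"Male",22,"movies","music","travel","cloths","grocery",,,,,
2,"Male",28,"travel","books","movies",,,,,,,
3,"Female",27,"rent","fuel","grocery","cloths",,,,,,
4,"Female",22,"rent","grocery","travel","movies","cloths",,,,,
5,"Female",22,"rent","online-shopping","utiliy",,,,,,,'

# Read with the header row as column names; empty fields become NA.
wide <- read.csv(text = csv_text, header = TRUE, na.strings = "",
                 stringsAsFactors = FALSE)

# Stack the category columns into long form.
# Assumption: rank = position of the column (category1 -> 1, and so on).
cat_cols <- grep("^category", names(wide), value = TRUE)
long <- data.frame(
  id       = rep(wide$id, times = length(cat_cols)),
  category = unlist(wide[cat_cols], use.names = FALSE),
  rank     = rep(seq_along(cat_cols), each = nrow(wide)),
  stringsAsFactors = FALSE
)
long <- long[!is.na(long$category), ]  # keep only categories actually listed

# Build every (id, category) pair, then left-join the observed ranks;
# pairs that were never observed get rank NA.
all_pairs <- expand.grid(id = wide$id,
                         category = unique(long$category),
                         stringsAsFactors = FALSE)
filled <- merge(all_pairs, long, by = c("id", "category"), all.x = TRUE)

# Attach gender and age again and sort by user, ranked categories first.
filled <- merge(wide[c("id", "gender", "age")], filled, by = "id")
filled <- filled[order(filled$id, filled$rank), ]

print(head(filled, 10))
```

The key step is expand.grid() followed by merge(..., all.x = TRUE): the grid supplies one row per user and category, and the left join leaves rank as NA wherever a user did not list that category.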
Thanks
Nash