[R] turning comma separated string from multiple choices into flags
juneaftn at gmail.com
Mon Sep 29 17:12:34 CEST 2008
Thank you. The misspelling of Harvard wasn't intended. The data are
2008/9/30 Peter Dalgaard <P.Dalgaard at biostat.ku.dk>:
> June Kim wrote:
>> I use Google Docs Forms to conduct surveys online. Multiple-choice
>> questions are coded as comma-separated values.
>> For example,
>> if the question is like:
>> 1. What magazines do you currently subscribe to? (you can choose
>> multiple choices)
>> 1) Fast Company
>> 2) Havard Business Review
>> 3) Business Week
>> 4) The Economist
>> And if the subject chose 1) and 3), the data is coded as a cell in a
>> spreadsheet as,
>> "Fast Company, Business Week"
>> I read the data with read.csv into R. To analyze the data, I have to
>> change that string into something like flags (indicator variables?).
>> That is, there should be 4 variables, whose values are either 1 or
>> 0, indicating chosen or not chosen respectively.
>> Suppose the data is something like,
>>   age favorite_magazine
>> 1  29 Fast Company
>> 2  31 Fast Company, Business Week
>> 3  32 Havard Business Review, Business Week, The Economist
>> Then I have to chop the string in favorite_magazine column to turn
>> that data into something like,
>>   age Fast Company Havard Business Review Business Week The Economist
>> 1  29            1                      0             0             0
>> 2  31            1                      0             1             0
>> 3  32            0                      1             1             1
>> Actually, I have many more multiple-choice questions in the survey.
>> What is an easy, elegant, and natural way to do the job in R?
> I'd look into something like as.data.frame(lapply(strings, grep,
> x=favorite_magazine, fixed=TRUE)), where strings <- c("Fast Company",
> "Havard Business Review", ...).
> (I take it that the mechanism is such that you can rely on at least
> having everything misspelled in the same way? If it is alternatingly
> "Havard" and "Harvard", then things get a bit trickier.)
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
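
A minimal sketch of the approach suggested above, using the example data from the question. It assumes grepl() rather than grep(): grep() returns the indices of the matching rows, so the per-magazine results would generally differ in length and could not be combined into a data frame, whereas grepl() returns one TRUE/FALSE per row. The names survey, flags and result are illustrative only and do not come from the thread.

```r
# Example data from the question (spelling of "Havard" kept as in the data)
survey <- data.frame(
  age = c(29, 31, 32),
  favorite_magazine = c(
    "Fast Company",
    "Fast Company, Business Week",
    "Havard Business Review, Business Week, The Economist"
  ),
  stringsAsFactors = FALSE
)

# Choice labels, spelled exactly as they appear in the data
strings <- c("Fast Company", "Havard Business Review",
             "Business Week", "The Economist")

# One logical column per label: TRUE where the label occurs in the row's string.
# fixed = TRUE treats each label as a literal substring, not a regular expression.
flags <- sapply(strings, grepl, x = survey$favorite_magazine, fixed = TRUE)

# Convert TRUE/FALSE to 1/0 while keeping the matrix shape and column names
storage.mode(flags) <- "integer"

# Attach the indicator columns to the remaining survey columns
result <- cbind(survey["age"], flags)
print(result)
```

Because the match is a literal substring match, this relies on each label being spelled the same way in every row and on no label being contained in another label. The same call can be repeated for each multiple-choice column with its own vector of labels.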