[R] Having trouble converting a dataframe of character vectors to factors

William Dunlap wdunlap at tibco.com
Fri Feb 22 00:21:27 CET 2013


    #So I tried this, which had no effect
    keepcols<- grepl("Q1_",names(scs.c2))
    levels(scs.c2[,keepcols])<-list(NoResp="",NotImportant="not important",SomewhatImpt="somewhat important",Important="important",VeryImpt="very important")
    #then this which also failed. It coerced a bunch of NA's and turned the vectors back to character vectors
    scs.c2[,keepcols]<-sapply(scs.c2[,keepcols],function(x) factor(x,levels(x)[c(NoResp="",NotImportant="not important",SomewhatImpt="somewhat important",Important="important",VeryImpt="very important")])

First, to make a factor variable with a given set of levels, use
    factor(x, levels=yourLevels)
(and not levels(x) <- yourLevels).
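
As a minimal sketch of the difference (the vector x and the level set lv
are made up for illustration and are not taken from your survey data):

```r
# Hypothetical answers and level set, for illustration only.
x <- c("important", "very important", "not important")
lv <- c("not important", "somewhat important", "important", "very important")

# factor() builds the factor with the levels in the order you give,
# including levels that do not occur in the data.
f <- factor(x, levels = lv)

# Assigning levels() to a character vector only attaches a "levels"
# attribute; the vector stays character and does not become a factor.
y <- x
levels(y) <- lv
```

Here is.factor(f) should be TRUE while is.factor(y) is FALSE, which is
one reason the levels<- route does not give you factor columns.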

Also, to change a character vector to a factor with levels that are different than the
values in the character vector, I would use the levels and labels arguments to factor.  E.g.,
   > x <- c("i", "iii")
   > factor(x, levels=c("i","ii","iii"), labels=c("One","Two","Three"))
    [1] One   Three
    Levels: One Two Three
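
Applied to your Q1 answers, the same idea might look like the sketch
below.  The level strings and the short labels are the ones from your
message; the answers vector is invented, and ordering the levels from
"" up to "very important" is my assumption about what you want.

```r
# Invented sample of answers; "" stands for no response.
answers <- c("very important", "", "important", "not important")

# levels= lists the values as they occur in the data, in the desired order;
# labels= gives the names the levels should have in the resulting factor.
q1 <- factor(answers,
             levels = c("", "not important", "somewhat important",
                        "important", "very important"),
             labels = c("NoResp", "NotImportant", "SomewhatImpt",
                        "Important", "VeryImpt"))
```

All five labels become levels of q1, even though "somewhat important"
does not occur in this sample.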

I haven't tried your complete example, but I would not use sapply() when producing
something you will want to convert to the columns of a data.frame.  Use lapply() instead.
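
A small sketch of why, using a made-up two-column data.frame (df and
both result names are hypothetical): sapply() simplifies its result to
a character matrix, while lapply() returns a list of factors that can
be assigned back into the columns.

```r
# Made-up data.frame of character columns, for illustration only.
df <- data.frame(a = c("x", "y"), b = c("y", "x"), stringsAsFactors = FALSE)

# sapply() simplifies the list of factors into a character matrix,
# so the factor class is lost.
viaSapply <- sapply(df, factor)

# lapply() keeps one factor per column; assigning into df[] keeps
# the data.frame and its column names.
viaLapply <- df
viaLapply[] <- lapply(df, factor)
```

This matches the chr matrix you saw from str() after your first
sapply() attempt.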

I generally use 'for' loops to process the columns of a data.frame one at a time.
It is easy to understand, is quick enough, and may even reduce memory usage.  E.g.,
instead of
    keepCols <- grepl("Q1_", names(scs.c2))
    scs.c2[, keepCols] <- lapply(scs.c2[, keepCols], function(x)factor(x,levels=c(...)))
try
    for(iCol in grep("Q1_", names(scs.c2))) {
        scs.c2[, iCol] <- factor(scs.c2[, iCol], levels=c(...))
    }
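
A self-contained version of the loop, as a sketch only: the tiny
data.frame scs and the vector q1Levels are stand-ins I made up (the
level strings come from your message, their order is my assumption),
so substitute your own data and levels.

```r
# Stand-in for scs.c2: a few character columns plus an id column.
scs <- data.frame(svaID = 1:3,
                  Q1_1 = c("important", "", "very important"),
                  Q1_3 = c("not important", "important", "important"),
                  Q2 = c("Somewhat", "Very Much", ""),
                  stringsAsFactors = FALSE)

# One common level set for every Q1_ column, including levels that
# a given column happens not to use.
q1Levels <- c("", "not important", "somewhat important",
              "important", "very important")

# Convert only the Q1_ columns, one at a time; other columns are untouched.
for (iCol in grep("Q1_", names(scs))) {
    scs[[iCol]] <- factor(scs[[iCol]], levels = q1Levels)
}
```

Afterwards Q1_1 and Q1_3 share identical levels, even though Q1_3 has
no "" answers, and Q2 is still a character vector.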

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: Lopez, Dan [mailto:lopez235 at llnl.gov]
> Sent: Thursday, February 21, 2013 2:51 PM
> To: William Dunlap; Mark Lamias; R help (r-help at r-project.org)
> Subject: RE: [R] Having trouble converting a dataframe of character vectors to factors
> 
> Hi Bill,
> 
> Great info.
> 
> The problem is what was originally given to me looks like DPUT1 below (random sample
> of 25).
> 
> This is the only format they can give me this in and the data already looks molten. So I
> applied reshape2::dcast which resulted in a dataframe made of character vectors; except
> for the first column which is an integer vector.
> 
> So after dropping columns full of "" (blanks) and reordering columns I figured I needed
> factors to accomplish my goal (refer below) and converted everything to factors with:
> > x2[,-1]<-as.data.frame(lapply(x[,-1],as.factor))
> 
> and ended up with DPUT2 below (random sample of 25)
> 
> Now after reading your last email I figured I've done well since no attributes got dropped
> and no levels got dropped (just need to add some in because couldn't be derived from
> original dataframe) and column names seem fine.
> 
> Now I have a new problem which is how to reorder levels in a dataframe and possibly
> add some unused. After seeing contents using Hmisc::contents I figured the next logical
> step is to handle like vectors a chunk at a time.
> For example subsetting to grepl("Q1_",names(scs.c2)) gives these vectors which all have
> identical levels except for one:
> $Q1_1 thru $Q1_7 except $Q1_3
> [1] ""                   "important"          "not important"      "somewhat important" "very important"
> $Q1_3
> [1] "important"          "not important"      "somewhat important" "very important"
> 
> #So I tried this, which had no effect
> keepcols<- grepl("Q1_",names(scs.c2))
> levels(scs.c2[,keepcols])<-list(NoResp="",NotImportant="not important",
>     SomewhatImpt="somewhat important",Important="important",VeryImpt="very important")
> #then this which also failed. It coerced a bunch of NA's and turned the vectors back to
> #character vectors
> scs.c2[,keepcols]<-sapply(scs.c2[,keepcols],function(x)
>     factor(x,levels(x)[c(NoResp="",NotImportant="not important",
>     SomewhatImpt="somewhat important",Important="important",VeryImpt="very important")])
> 
> Mind you I can easily do this in MS Excel and is probably what I am going to break down
> and do fairly soon. But I wanted to give this a good solid shot in R because I want to
> learn to handle these situations in R. I've been using R for almost a year.
> __________________________
> ADDITIONAL BACKGROUND
> 
> MY GOAL
> I ultimately want to get started with some basic correlation analysis for some of the
> columns : taking your example (slightly modified) I hope to be able to do this
> xx <- data.frame(stringsAsFactors=FALSE, check.names=FALSE,"No/Yes" =
> factor(c("Yes","No","No","No"), levels=c("No","Yes")),
> "Size" = ordered(c("Small","Large","Medium","Medium"),
> levels=c("Small","Medium","Large")),"Name" = c("Adam","Bill","Chuck","Larry"))
> > cor(sapply(xx[,1:2],as.numeric))
>         No/Yes       Size
> No/Yes  1.0000000 -0.8164966
> Size   -0.8164966  1.0000000
> 
> DPUT1
> structure(list(svaID = c(771L, 771L, 775L, 775L, 774L, 776L,
> 774L, 771L, 771L, 771L, 771L, 774L, 774L, 775L, 765L, 775L, 765L,
> 775L, 771L, 777L, 775L, 771L, 774L, 776L, 776L), question = structure(c(19L,
> 12L, 23L, 3L, 10L, 36L, 25L, 1L, 30L, 7L, 21L, 13L, 16L, 32L,
> 6L, 5L, 18L, 19L, 14L, 2L, 2L, 9L, 37L, 28L, 24L), .Label = c("Q1",
> "Q1_1", "Q1_2", "Q1_3", "Q1_4", "Q1_5", "Q1_6", "Q1_7", "Q10",
> "Q11", "Q12", "Q13", "Q14", "Q15", "Q16", "Q17", "Q17_1", "Q17_2",
> "Q17_3", "Q17_4", "Q17_5", "Q18", "Q19", "Q2", "Q20", "Q3", "Q4",
> "Q5", "Q6", "Q6_A_1", "Q6_A_2", "Q6_A_3", "Q6_A_4", "Q6_A_5",
> "Q7", "Q8", "Q9"), class = "factor"), answer = structure(c(11L,
> 29L, 29L, 26L, 29L, 29L, 1L, 1L, 1L, 13L, 11L, 1L, 1L, 1L, 26L,
> 26L, 11L, 11L, 29L, 13L, 13L, 29L, 29L, 29L, 27L), .Label = c("",
> "1", "2", "3", "4", "5", "Change of College/University", "Change of Field of Study",
> "Confirmed Field of Study", "did not meet expectations", "exceeded expectations",
> "Family/Friend", "important", "Live Locally", "LLNL Contact",
> "LLNL Housing page", "Local Newspaper", "met expectations", "no",
> "None", "Not at All", "not important", "Pursue an Advanced Degree",
> "Somewhat", "somewhat important", "very important", "Very Much",
> "Web", "yes"), class = "factor")), .Names = c("svaID", "question",
> "answer"), row.names = c(68L, 62L, 147L, 113L, 97L, 168L, 111L,
> 45L, 51L, 43L, 70L, 100L, 108L, 127L, 5L, 115L, 30L, 142L, 64L,
> 186L, 112L, 59L, 95L, 160L, 157L), class = "data.frame")
> 
> DPUT2
> structure(list(svaID = c(765L, 771L, 774L, 775L, 776L, 777L,
> 778L, 779L, 782L, 783L, 786L, 788L, 789L, 790L, 791L, 793L, 794L,
> 795L, 797L, 801L, 803L, 804L, 805L, 807L, 808L), Q1_1 = structure(c(5L,
> 5L, 5L, 2L, 5L, 2L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 2L, 5L,
> 5L, 5L, 5L, 5L, 5L, 2L, 2L, 2L), .Label = c("", "important",
> "not important", "somewhat important", "very important"), class = "factor"),
>     Q1_2 = structure(c(2L, 5L, 2L, 5L, 2L, 4L, 3L, 5L, 4L, 2L,
>     2L, 5L, 2L, 3L, 5L, 2L, 2L, 5L, 5L, 5L, 5L, 2L, 1L, 2L, 3L
>     ), .Label = c("", "important", "not important", "somewhat important",
>     "very important"), class = "factor"), Q1_3 = structure(c(4L,
>     4L, 4L, 4L, 4L, 1L, 1L, 4L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 1L,
>     4L, 4L, 1L, 4L, 4L, 4L, 4L, 1L, 4L), .Label = c("important",
>     "not important", "somewhat important", "very important"), class = "factor"),
>     Q1_4 = structure(c(5L, 5L, 5L, 5L, 5L, 2L, 2L, 5L, 2L, 2L,
>     5L, 5L, 5L, 5L, 5L, 2L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L
>     ), .Label = c("", "important", "not important", "somewhat important",
>     "very important"), class = "factor"), Q1_5 = structure(c(5L,
>     3L, 5L, 5L, 3L, 2L, 2L, 3L, 2L, 3L, 5L, 5L, 5L, 4L, 4L, 5L,
>     5L, 5L, 3L, 3L, 3L, 5L, 2L, 2L, 4L), .Label = c("", "important",
>     "not important", "somewhat important", "very important"), class = "factor"),
>     Q1_6 = structure(c(5L, 2L, 2L, 2L, 5L, 2L, 4L, 5L, 4L, 5L,
>     5L, 5L, 5L, 5L, 5L, 2L, 5L, 2L, 4L, 2L, 4L, 5L, 2L, 4L, 4L
>     ), .Label = c("", "important", "not important", "somewhat important",
>     "very important"), class = "factor"), Q1_7 = structure(c(3L,
>     2L, 5L, 2L, 2L, 5L, 2L, 5L, 5L, 5L, 2L, 5L, 5L, 2L, 4L, 2L,
>     5L, 2L, 3L, 5L, 4L, 5L, 2L, 2L, 4L), .Label = c("", "important",
>     "not important", "somewhat important", "very important"), class = "factor"),
>     Q2 = structure(c(4L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 4L,
>     4L, 4L, 4L, 4L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 4L
>     ), .Label = c("", "Not at All", "Somewhat", "Very Much"), class = "factor"),
>     Q3 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
>     2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
>     ), .Label = c("", "yes"), class = "factor"), Q4 = structure(c(4L,
>     5L, 6L, 4L, 5L, 5L, 5L, 4L, 5L, 4L, 4L, 4L, 4L, 6L, 3L, 5L,
>     4L, 4L, 5L, 5L, 4L, 4L, 5L, 5L, 4L), .Label = c("", "Change of College/University",
>     "Change of Field of Study", "Confirmed Field of Study", "None",
>     "Pursue an Advanced Degree"), class = "factor"), Q5 = structure(c(3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("", "no",
>     "yes"), class = "factor"), Q6 = structure(c(3L, 5L, 2L, 2L,
>     7L, 5L, 7L, 5L, 4L, 5L, 3L, 4L, 4L, 5L, 7L, 5L, 5L, 3L, 4L,
>     5L, 2L, 3L, 5L, 5L, 4L), .Label = c("", "Family/Friend",
>     "Live Locally", "LLNL Contact", "LLNL Housing page", "Local Newspaper",
>     "Web"), class = "factor"), Q6_A_1 = structure(c(1L, 1L, 1L,
>     6L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L,
>     1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "1", "2", "3",
>     "4", "5"), class = "factor"), Q6_A_2 = structure(c(1L, 1L,
>     1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>     1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "4", "5"), class = "factor"),
>     Q6_A_3 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>     1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>     1L), .Label = c("", "5"), class = "factor"), Q6_A_4 = structure(c(1L,
>     1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>     1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "5"), class = "factor"),
>     Q6_A_5 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>     1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>     3L), .Label = c("", "2", "3", "4", "5"), class = "factor"),
>     Q8 = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>     3L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
>     ), .Label = c("", "no", "yes"), class = "factor"), Q9 = structure(c(3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 3L, 3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("", "no",
>     "yes"), class = "factor"), Q10 = structure(c(3L, 3L, 3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("", "no", "yes"), class = "factor"),
>     Q11 = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L
>     ), .Label = c("", "no", "yes"), class = "factor"), Q12 = structure(c(3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("", "no",
>     "yes"), class = "factor"), Q13 = structure(c(3L, 3L, 3L,
>     3L, 3L, 3L, 1L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>     1L, 3L, 1L, 1L, 3L, 1L, 3L), .Label = c("", "no", "yes"), class = "factor"),
>     Q14 = structure(c(3L, 1L, 1L, 3L, 2L, 3L, 1L, 3L, 3L, 1L,
>     1L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 2L
>     ), .Label = c("", "no", "yes"), class = "factor"), Q15 = structure(c(2L,
>     2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
>     2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("", "yes"
>     ), class = "factor"), Q16 = structure(c(4L, 4L, 4L, 3L, 3L,
>     3L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 3L, 4L, 3L,
>     3L, 3L, 4L, 3L, 4L), .Label = c("", "did not meet expectations",
>     "exceeded expectations", "met expectations"), class = "factor"),
>     Q17_1 = structure(c(3L, 4L, 4L, 3L, 3L, 3L, 4L, 4L, 4L, 3L,
>     3L, 3L, 3L, 4L, 2L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L
>     ), .Label = c("", "did not meet expectations", "exceeded expectations",
>     "met expectations"), class = "factor"), Q17_2 = structure(c(3L,
>     4L, 4L, 3L, 3L, 3L, 4L, 4L, 4L, 3L, 4L, 3L, 3L, 3L, 4L, 4L,
>     4L, 3L, 3L, 3L, 3L, 4L, 3L, 3L, 4L), .Label = c("", "did not meet expectations",
>     "exceeded expectations", "met expectations"), class = "factor"),
>     Q17_3 = structure(c(3L, 3L, 4L, 3L, 3L, 4L, 4L, 3L, 4L, 4L,
>     4L, 4L, 3L, 4L, 4L, 4L, 4L, 3L, 4L, 3L, 3L, 3L, 3L, 3L, 4L
>     ), .Label = c("", "did not meet expectations", "exceeded expectations",
>     "met expectations"), class = "factor"), Q17_4 = structure(c(4L,
>     4L, 4L, 3L, 2L, 3L, 4L, 3L, 3L, 3L, 4L, 4L, 3L, 4L, 3L, 4L,
>     4L, 3L, 2L, 4L, 3L, 3L, 3L, 3L, 4L), .Label = c("", "did not meet expectations",
>     "exceeded expectations", "met expectations"), class = "factor"),
>     Q17_5 = structure(c(3L, 3L, 4L, 3L, 4L, 4L, 4L, 3L, 4L, 4L,
>     4L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 4L, 4L, 3L, 4L, 4L, 3L
>     ), .Label = c("", "did not meet expectations", "exceeded expectations",
>     "met expectations"), class = "factor"), Q18 = structure(c(3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 3L,
>     3L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 3L), .Label = c("", "no",
>     "yes"), class = "factor"), Q19 = structure(c(3L, 3L, 3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>     3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("", "no", "yes"), class = "factor")), .Names =
> c("svaID",
> "Q1_1", "Q1_2", "Q1_3", "Q1_4", "Q1_5", "Q1_6", "Q1_7", "Q2",
> "Q3", "Q4", "Q5", "Q6", "Q6_A_1", "Q6_A_2", "Q6_A_3", "Q6_A_4",
> "Q6_A_5", "Q8", "Q9", "Q10", "Q11", "Q12", "Q13", "Q14", "Q15",
> "Q16", "Q17_1", "Q17_2", "Q17_3", "Q17_4", "Q17_5", "Q18", "Q19"
> ), row.names = c(NA, 25L), class = "data.frame")
> 
> Thanks.
> Dan
> 
> -----Original Message-----
> From: William Dunlap [mailto:wdunlap at tibco.com]
> Sent: Thursday, February 21, 2013 8:33 AM
> To: Mark Lamias; Lopez, Dan; R help (r-help at r-project.org)
> Subject: RE: [R] Having trouble converting a dataframe of character vectors to factors
> 
> > scs2<-data.frame(lapply(scs2, factor))
> 
> Calling data.frame() on the output of lapply() can result in changing column names and
> will drop attributes that the input data.frame may have had.  I prefer to modify the
> original data.frame instead of making a new one from scratch to avoid these problems.
> 
> Also, calling factor() on a factor will drop any unused levels, which you may not want to
> do.  Calling as.factor will not.
> 
> Compare the following three methods
> 
>   f1 <- function (dataFrame) {
>       dataFrame[] <- lapply(dataFrame, factor)
>       dataFrame
>   }
>   f2 <- function (dataFrame) {
>       dataFrame[] <- lapply(dataFrame, as.factor)
>       dataFrame
>   }
>   f3 <- function (dataFrame) {
>       data.frame(lapply(dataFrame, factor))
>   }
> 
> on the following data.frame
>   x <- data.frame(stringsAsFactors=FALSE, check.names=FALSE,
>                "No/Yes" = factor(c("Yes","Yes","Yes"), levels=c("No","Yes")),
>                "Size" = ordered(c("Small","Large","Medium"),
> levels=c("Small","Medium","Large")),
>                "Name" = c("Adam","Bill","Chuck"))
>   attr(x, "Date") <- as.POSIXlt("2013-02-21")
> 
> 
>   > str(x)
>   'data.frame':   3 obs. of  3 variables:
>    $ No/Yes: Factor w/ 2 levels "No","Yes": 2 2 2
>    $ Size  : Ord.factor w/ 3 levels "Small"<"Medium"<..: 1 3 2
>    $ Name  : chr  "Adam" "Bill" "Chuck"
>    - attr(*, "Date")= POSIXlt, format: "2013-02-21"
> 
>   > str(f1(x)) # drops unused levels
>   'data.frame':   3 obs. of  3 variables:
>    $ No/Yes: Factor w/ 1 level "Yes": 1 1 1
>    $ Size  : Ord.factor w/ 3 levels "Small"<"Medium"<..: 1 3 2
>    $ Name  : Factor w/ 3 levels "Adam","Bill",..: 1 2 3
>    - attr(*, "Date")= POSIXlt, format: "2013-02-21"
>   > str(f2(x))
>   'data.frame':   3 obs. of  3 variables:
>    $ No/Yes: Factor w/ 2 levels "No","Yes": 2 2 2
>    $ Size  : Ord.factor w/ 3 levels "Small"<"Medium"<..: 1 3 2
>    $ Name  : Factor w/ 3 levels "Adam","Bill",..: 1 2 3
>    - attr(*, "Date")= POSIXlt, format: "2013-02-21"
>   > str(f3(x)) # mangles column names, drops unused levels, drops Date attribute
>   'data.frame':   3 obs. of  3 variables:
>    $ No.Yes: Factor w/ 1 level "Yes": 1 1 1
>    $ Size  : Ord.factor w/ 3 levels "Small"<"Medium"<..: 1 3 2
>    $ Name  : Factor w/ 3 levels "Adam","Bill",..: 1 2 3
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org
> > [mailto:r-help-bounces at r-project.org] On Behalf Of Mark Lamias
> > Sent: Wednesday, February 20, 2013 6:51 PM
> > To: Daniel Lopez; R help (r-help at r-project.org)
> > Subject: Re: [R] Having trouble converting a dataframe of character
> > vectors to factors
> >
> > How about this?
> >
> > scs2<-data.frame(lapply(scs2, factor))
> >
> >
> >
> >
> > ________________________________
> >  From: "Lopez, Dan" <lopez235 at llnl.gov>
> > To: "R help (r-help at r-project.org)" <r-help at r-project.org>
> > Sent: Wednesday, February 20, 2013 7:09 PM
> > Subject: [R] Having trouble converting a dataframe of character
> > vectors to factors
> >
> > R Experts,
> >
> > I have a dataframe made up of character vectors--these are results
> > from survey questions. I need to convert them to factors.
> >
> > I tried the following which did not work:
> > scs2<-sapply(scs2,as.factor)
> > also this didn't work:
> > scs2<-sapply(scs2,function(x) as.factor(x))
> >
> > After doing either of above I end up with
> > >str(scs2)
> >
> > chr [1:10, 1:10] "very important" "very important" "very important" "very important" ...
> >
> > - attr(*, "dimnames")=List of 2
> >
> >   ..$ : NULL
> >
> >   ..$ : chr [1:10] "Q1_1" "Q1_2" "Q1_3" "Q1_4" ...
> >
> > >class(scs2)
> > "matrix"
> >
> > But when I do it one at a time it works:
> > scs2$Q1_1<-as.factor(scs2$Q1_1)
> > scs2$Q1_2<- as.factor(scs2$Q1_2)
> >
> > What am I doing wrong?  How do I accomplish this with sapply or similar function?
> >
> > Data for reproducibility:
> >
> >
> > scs2<-structure(list(Q1_1 = c("very important", "very important",
> > "very important", "very important", "very important", "very important",
> > "very important", "somewhat important", "important", "very important"),
> > Q1_2 = c("important", "somewhat important", "very important", "important",
> > "important", "very important", "somewhat important", "somewhat important",
> > "very important", "very important"), Q1_3 = c("very important",
> > "important", "very important", "very important", "important",
> > "very important", "very important", "somewhat important", "not important",
> > "important"), Q1_4 = c("very important", "important", "very important",
> > "very important", "important", "important", "important", "very important",
> > "somewhat important", "important"), Q1_5 = c("very important",
> > "not important", "important", "very important", "not important",
> > "important", "somewhat important", "important", "somewhat important",
> > "not important"), Q1_6 = c("very important", "not important",
> > "important", "very important", "somewhat important", "very important",
> > "very important", "very important", "important", "important"),
> >     Q1_7 = c("very important", "somewhat important", "important",
> >     "somewhat important", "important", "important", "very important",
> >     "very important", "somewhat important", "not important"),
> >     Q2 = c("Somewhat", "Very Much", "Somewhat", "Very Much",
> >     "Very Much", "Very Much", "Very Much", "Very Much", "Very Much",
> >     "Very Much"), Q3 = c("yes", "yes", "yes", "yes", "yes", "yes",
> >     "yes", "yes", "yes", "yes"), Q4 = c("None", "None", "None",
> >     "None", "Confirmed Field of Study", "Confirmed Field of Study",
> >     "Confirmed Field of Study", "None", "None", "None")), .Names = c("Q1_1",
> > "Q1_2", "Q1_3", "Q1_4", "Q1_5", "Q1_6", "Q1_7", "Q2", "Q3", "Q4"
> > ), row.names = c(78L, 46L, 80L, 196L, 188L, 197L, 39L, 195L,
> > 172L, 110L), class = "data.frame")
> >
> >
> 


