[R] How to convert a factor column into a numeric one?
Dennis Murphy
djmuser at gmail.com
Sun Jun 5 06:49:59 CEST 2011
Hi:
Try this:
> dd <- data.frame(a = factor(rep(1:5, each = 4)),
+                  b = factor(rep(rep(1:2, each = 2), 5)),
+                  y = rnorm(20))
> str(dd)
'data.frame': 20 obs. of 3 variables:
$ a: Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
$ b: Factor w/ 2 levels "1","2": 1 1 2 2 1 1 2 2 1 1 ...
$ y: num 0.6396 1.467 1.8403 -0.0915 0.2711 ...
> de <- within(dd, {
+     a <- as.numeric(as.character(a))
+     b <- as.numeric(as.character(b))
+ })
> str(de)
'data.frame': 20 obs. of 3 variables:
$ a: num 1 1 1 1 2 2 2 2 3 3 ...
$ b: num 1 1 2 2 1 1 2 2 1 1 ...
$ y: num 0.6396 1.467 1.8403 -0.0915 0.2711 ...
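
The reason as.numeric() alone fails is that a factor is stored as integer codes pointing into its levels, so as.numeric(f) returns those codes rather than the labels. Besides going through as.character(), the ?factor help page suggests as.numeric(levels(f))[f], which converts each level once and then indexes by the factor. Below is a small sketch of both routes. The objects f and df2 are made up for illustration, and only mimic the Time and Temp levels from the question:

```r
# Hypothetical factor using the Time levels from the question
f <- factor(c("0", "2", "7", "14", "0", "7"),
            levels = c("0", "2", "7", "14"))

# Pitfall: as.numeric() on a factor returns the internal integer codes
codes <- as.numeric(f)                    # 1 2 3 4 1 3

# Route 1: convert the labels to character, then to numeric
via_char <- as.numeric(as.character(f))   # 0 2 7 14 0 7

# Route 2: convert the levels once, then index them by the factor
via_levels <- as.numeric(levels(f))[f]    # 0 2 7 14 0 7

# Converting several factor columns of a (hypothetical) data frame at once
df2 <- data.frame(
  Time = f,
  Temp = factor(c("-20", "-20", "-20", "-20", "4", "4"),
                levels = c("-20", "4", "25", "45"))
)
cols <- c("Time", "Temp")
df2[cols] <- lapply(df2[cols], function(x) as.numeric(levels(x))[x])
str(df2)
```

Either route only makes sense when every level label parses as a number; a label such as "H" would become NA with a warning.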
HTH,
Dennis
On Sat, Jun 4, 2011 at 9:31 PM, Robert A. LaBudde <ral at lcfltd.com> wrote:
> I have a data frame:
>
>> head(df)
>   Time Temp Conc Repl    Log10
> 1    0  -20    H    1 6.406547
> 2    2  -20    H    1 5.738683
> 3    7  -20    H    1 5.796394
> 4   14  -20    H    1 4.413691
> 5    0    4    H    1 6.406547
> 7    7    4    H    1 5.705433
>> str(df)
> 'data.frame': 177 obs. of 5 variables:
> $ Time : Factor w/ 4 levels "0","2","7","14": 1 2 3 4 1 3 4 1 3 4 ...
> $ Temp : Factor w/ 4 levels "-20","4","25",..: 1 1 1 1 2 2 2 3 3 3 ...
> $ Conc : Factor w/ 3 levels "H","L","M": 1 1 1 1 1 1 1 1 1 1 ...
> $ Repl : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ Log10: num 6.41 5.74 5.8 4.41 6.41 ...
>> levels(df$Temp)
> [1] "-20" "4" "25" "45"
>> levels(df$Time)
> [1] "0" "2" "7" "14"
>
> As you can see, "Time" and "Temp" are currently factors, not numeric.
>
> I would like to change these columns into numerical ones.
>
> df$Time<- as.numeric(df$Time)
>
> doesn't work, as it changes to the factor level indices (1,2,3,4) instead of
> the values (0,2,7,14).
>
> There must be a direct way of doing this in R.
>
> I tried recode() in 'car':
>
>> df$Temp<- recode(df$Temp, '1=-20;2=25;3=4;4=45',as.factor.result=FALSE)
>> head(df)
>   Time Temp Conc Repl     Freq
> 1    0  -20    H    1 6.406547
> 2    2  -20    H    1 5.738683
> 3    7  -20    H    1 5.796394
> 4   14  -20    H    1 4.413691
> 5    0   45    H    1 6.406547
> 7    7   45    H    1 5.705433
>
> but note that the values for 'Temp' in rows 5 and 7 are 45 and not 4, as
> expected, although the result is numeric. The same happens if I use the
> order given by levels(df$Temp) instead of the sort order in the recode() 2nd
> argument.
>
> Any hints?
> ================================================================
> Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: ral at lcfltd.com
> Least Cost Formulations, Ltd. URL: http://lcfltd.com/
> 824 Timberlake Drive Tel: 757-467-0954
> Virginia Beach, VA 23464-3239 Fax: 757-467-2947
>
> "Vere scire est per causas scire"
>