[R] convert Factor as numeric

arnaud Gaboury arnaud.gaboury at gmail.com
Thu Apr 29 13:12:44 CEST 2010


Dear group,

I know this issue has already been covered, and before you reply I must say
that I have read the R-FAQ and searched the mailing list archive.
I still can't manage to convert my factor to numeric, as I couldn't find a
clear answer.

Here is my df :

Pose1 <-
structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 
8L), .Label = c(" SUGAR NO.11 May/10 ", "COTTON NO.2 May/10 ", 
"PLATINUM Jul/10 ", "ROBUSTA COFFEE (10) May/10 ", "WHEAT May/10 ", 
"PRIMARY NICKEL USD", "PRM HGH GD ALUMINIUM USD", 
"SPCL HIGH GRADE ZINC USD", 
"STANDARD LEAD USD"), class = "factor"), POSITION = c(5, 3, -1, 
15, 4, 2), SETTLEMENT = structure(c(3L, 5L, 2L, 1L, 4L, 8L), .Label =
c("1,353.0000", 
"1,739.4000", "16.5400", "467.7500", "78.1300", "25,760.8600", 
"2,415.9000", "2,421.0500", "2,357.1200"), class = "factor")), .Names =
c("DESCRIPTION", 
"POSITION", "SETTLEMENT"), row.names = c("1", "2", "3", "4", 
"5", "51"), class = "data.frame")

>S<-Pose1$SETTLEMENT  #select the last column
> S
[1] 16.5400    78.1300    1,739.4000 1,353.0000 467.7500   2,421.0500
Levels: 1,353.0000 1,739.4000 16.5400 467.7500 78.1300 25,760.8600
2,415.9000 2,421.0500 2,357.1200
> str(S)
 Factor w/ 9 levels "1,353.0000","1,739.4000",..: 3 5 2 1 4 8

Now I need to convert S to numeric:

> S1<-as.numeric(levels(S))[as.integer(S)]   #doesn't work, numbers are rounded or NA
Warning message:
NAs introduced by coercion

> S1<-as.numeric(levels(S))[S]  #doesn't work, numbers are rounded or NA
Warning message:
NAs introduced by coercion

> S1<-as.numeric(as.character(S))  #doesn't work, numbers are rounded or NA
Warning message:
NAs introduced by coercion
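
A minimal sketch of one likely cause, assuming the commas in the levels are thousands separators: as.numeric() cannot parse a string such as "1,739.4000", so every level containing a comma turns into NA. The factor below is rebuilt from the values printed above; the names naive and lev are only illustrative.

```r
# Same values as Pose1$SETTLEMENT above, rebuilt as a factor.
S <- factor(c("16.5400", "78.1300", "1,739.4000",
              "1,353.0000", "467.7500", "2,421.0500"))

# The direct conversion: entries containing a comma cannot be parsed.
naive <- suppressWarnings(as.numeric(as.character(S)))

# Assumption: "," is a thousands separator, so it can simply be dropped.
# Convert the levels once, then index them by the factor's integer codes.
lev <- as.numeric(gsub(",", "", levels(S), fixed = TRUE))
S1 <- lev[as.integer(S)]

print(naive)
print(S1)
```

If the converted values merely look rounded, that may be the default print precision (7 significant digits) rather than a loss of data; print(S1, digits = 10) shows more digits.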

In case it helps, my column S is part of a data frame that was obtained with
this line:

>pose=read.csv2("LSCPos1.csv",sep=",",dec=".",as.is=T,h=T,skip=1)[,c(4,8,14,15)]

pose <-
structure(list(DESCRIPTION = c("WHEAT May/10 ", "WHEAT May/10 ", 
"WHEAT May/10 ", "WHEAT May/10 ", "COTTON NO.2 May/10 ", 
"COTTON NO.2 May/10 ", 
"COTTON NO.2 May/10 ", "PLATINUM Jul/10 ", " SUGAR NO.11 May/10 ", 
" SUGAR NO.11 May/10 ", " SUGAR NO.11 May/10 ", " SUGAR NO.11 May/10 ", 
" SUGAR NO.11 May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"PRM HGH GD ALUMINIUM USD 09/07/10 ", "PRM HGH GD ALUMINIUM USD 09/07/10 ", 
"PRIMARY NICKEL USD 04/06/10 ", "PRIMARY NICKEL USD 04/06/10 ", 
"PRIMARY NICKEL USD 10/06/10 ", "PRIMARY NICKEL USD 10/06/10 ", 
"STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 01/07/10 ", 
"STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 01/07/10 ", 
"STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 01/07/10 ", 
"STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 06/07/10 ", 
"SPCL HIGH GRADE ZINC USD 08/07/10 ", "SPCL HIGH GRADE ZINC USD 08/07/10 ", 
"SPCL HIGH GRADE ZINC USD 08/07/10 ", "SPCL HIGH GRADE ZINC USD 09/07/10 ", 
"SPCL HIGH GRADE ZINC USD 09/07/10 ", "SPCL HIGH GRADE ZINC USD 09/07/10 ", 
"SPCL HIGH GRADE ZINC USD 09/07/10 ", "SPCL HIGH GRADE ZINC USD 09/07/10 ", 
"SPCL HIGH GRADE ZINC USD 13/04/10 ", "SPCL HIGH GRADE ZINC USD 13/04/10 "
), CREATED.DATE = structure(c(14705, 14707, 14707, 14711, 14700, 
14700, 14711, 14711, 14708, 14708, 14708, 14711, 14711, 14707, 
14707, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708, 
14708, 14708, 14708, 14708, 14672, 14673, 14678, 14678, 14700, 
14700, 14700, 14700, 14700, 14700, 14700, 14705, 14707, 14707, 
14707, 14708, 14708, 14708, 14708, 14708, 14622, 14634), class = "Date"), 
    QUANITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1, 
    1, 1, 2, 1, 1, 1, 1, 2, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 
    1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1), CLOSING.PRICE =
c("467.7500", 
    "467.7500", "467.7500", "467.7500", "78.1300", "78.1300", 
    "78.1300", "1,739.4000", "16.5400", "16.5400", "16.5400", 
    "16.5400", "16.5400", "1,353.0000", "1,353.0000", "1,353.0000", 
    "1,353.0000", "1,353.0000", "1,353.0000", "1,353.0000", "1,353.0000", 
    "1,353.0000", "1,353.0000", "1,353.0000", "1,353.0000", "2,415.9000", 
    "2,415.9000", "25,755.7100", "25,755.7100", "25,760.8600", 
    "25,760.8600", "2,355.9600", "2,355.9600", "2,355.9600", 
    "2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600", "2,357.1200", 
    "2,420.7300", "2,420.7300", "2,420.7300", "2,421.0500", "2,421.0500", 
    "2,421.0500", "2,421.0500", "2,421.0500", "2,388.4300", "2,388.4300"
    )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANITY", 
"SETTLEMENT"), row.names = c(NA, -49L), class = "data.frame")

> str(pose)
'data.frame':   49 obs. of  4 variables:
 $ DESCRIPTION : chr  "WHEAT May/10 " "WHEAT May/10 " "WHEAT May/10 " "WHEAT May/10 " ...
 $ CREATED.DATE:Class 'Date'  num [1:49] 14705 14707 14707 14711 14700 ...
 $ QUANITY     : num  1 1 1 1 1 1 1 -1 1 1 ...
 $ SETTLEMENT  : chr  "467.7500" "467.7500" "467.7500" "467.7500" ...


"pose$SETTLEMENT" has class "character" when it should be "numeric". So
maybe a solution would be to assign a numeric class when I read my .csv file?
I tried to change the class of this column right after reading the file
(type.convert() left me with a factor), but again I got rounded numbers or
NA.
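
A rough sketch of converting the column right after reading, again assuming the commas are thousands separators. The inline csv_text is a hypothetical three-row extract built from values shown above, since the real layout of LSCPos1.csv is not shown here; demo is an illustrative name.

```r
# Hypothetical extract: the quoted SETTLEMENT fields contain commas,
# so the column is read as character rather than numeric.
csv_text <- 'DESCRIPTION,SETTLEMENT
"WHEAT May/10 ","467.7500"
"PLATINUM Jul/10 ","1,739.4000"
"PRIMARY NICKEL USD 10/06/10 ","25,760.8600"'

demo <- read.csv(text = csv_text, as.is = TRUE)

# Strip the separators on the character column, then convert.
demo$SETTLEMENT <- as.numeric(gsub(",", "", demo$SETTLEMENT, fixed = TRUE))

str(demo)
```

The same gsub() step would apply to pose$SETTLEMENT directly, since that column is already character.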

So, what am I supposed to do?

Thank you for the help.


