[R] convert Factor as numeric
arnaud Gaboury
arnaud.gaboury at gmail.com
Thu Apr 29 13:12:44 CEST 2010
Dear group,
I know this issue has already been covered, and before you reply I must say
that I have read the R-FAQ and searched the mailing list archive.
I still can't manage to convert my factor to numeric, as I couldn't find any
clear answer.
Here is my data frame:
Pose1 <-
structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L,
8L), .Label = c(" SUGAR NO.11 May/10 ", "COTTON NO.2 May/10 ",
"PLATINUM Jul/10 ", "ROBUSTA COFFEE (10) May/10 ", "WHEAT May/10 ",
"PRIMARY NICKEL USD", "PRM HGH GD ALUMINIUM USD", "SPCL HIGH GRADE ZINC USD",
"STANDARD LEAD USD"), class = "factor"), POSITION = c(5, 3, -1,
15, 4, 2), SETTLEMENT = structure(c(3L, 5L, 2L, 1L, 4L, 8L), .Label =
c("1,353.0000",
"1,739.4000", "16.5400", "467.7500", "78.1300", "25,760.8600",
"2,415.9000", "2,421.0500", "2,357.1200"), class = "factor")), .Names =
c("DESCRIPTION",
"POSITION", "SETTLEMENT"), row.names = c("1", "2", "3", "4",
"5", "51"), class = "data.frame")
>S<-Pose1$SETTLEMENT #select the last column
> S
[1] 16.5400 78.1300 1,739.4000 1,353.0000 467.7500 2,421.0500
Levels: 1,353.0000 1,739.4000 16.5400 467.7500 78.1300 25,760.8600
2,415.9000 2,421.0500 2,357.1200
> str(S)
Factor w/ 9 levels "1,353.0000","1,739.4000",..: 3 5 2 1 4 8
Now I need to convert S to numeric class:
> S1<-as.numeric(levels(S))[as.integer(S)] #doesn't work, numbers are rounded or NA
Warning message:
NAs introduced by coercion
> S1<-as.numeric(levels(S))[S] #doesn't work, numbers are rounded or NA
Warning message:
NAs introduced by coercion
> S1<-as.numeric(as.character(S)) #doesn't work, numbers are rounded or NA
Warning message:
NAs introduced by coercion
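A likely reason all three attempts give NA for some values: as.numeric() does not parse strings that contain a thousands separator, so "1,353.0000" becomes NA while "16.5400" converts fine. The following is only a minimal sketch, assuming the comma in this column is always a thousands separator and never a decimal mark. S_demo and S1_demo are illustrative names, not objects from the session above.

```r
# Illustrative factor holding the same six values as S above
S_demo <- factor(c("16.5400", "78.1300", "1,739.4000",
                   "1,353.0000", "467.7500", "2,421.0500"))

# Go through the labels (not the integer codes), drop the commas,
# then convert the cleaned strings to numeric
S1_demo <- as.numeric(gsub(",", "", as.character(S_demo), fixed = TRUE))

# print() shows 7 significant digits by default, which can make values
# look rounded; asking for more digits shows what is actually stored
print(S1_demo, digits = 10)
```

Converting with as.character() first matters because as.numeric() applied directly to a factor returns its internal integer codes rather than the values.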
If it helps, my column S is part of a data frame that was obtained with this
line:
> pose=read.csv2("LSCPos1.csv",sep=",",dec=".",as.is=T,h=T,skip=1)[,c(4,8,14,15)]
pose <-
structure(list(DESCRIPTION = c("WHEAT May/10 ", "WHEAT May/10 ",
"WHEAT May/10 ", "WHEAT May/10 ", "COTTON NO.2 May/10 ", "COTTON NO.2 May/10 ",
"COTTON NO.2 May/10 ", "PLATINUM Jul/10 ", " SUGAR NO.11 May/10 ",
" SUGAR NO.11 May/10 ", " SUGAR NO.11 May/10 ", " SUGAR NO.11 May/10 ",
" SUGAR NO.11 May/10 ", "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
"PRM HGH GD ALUMINIUM USD 09/07/10 ", "PRM HGH GD ALUMINIUM USD 09/07/10 ",
"PRIMARY NICKEL USD 04/06/10 ", "PRIMARY NICKEL USD 04/06/10 ",
"PRIMARY NICKEL USD 10/06/10 ", "PRIMARY NICKEL USD 10/06/10 ",
"STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 01/07/10 ",
"STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 01/07/10 ",
"STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 01/07/10 ",
"STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 06/07/10 ",
"SPCL HIGH GRADE ZINC USD 08/07/10 ", "SPCL HIGH GRADE ZINC USD 08/07/10 ",
"SPCL HIGH GRADE ZINC USD 08/07/10 ", "SPCL HIGH GRADE ZINC USD 09/07/10 ",
"SPCL HIGH GRADE ZINC USD 09/07/10 ", "SPCL HIGH GRADE ZINC USD 09/07/10 ",
"SPCL HIGH GRADE ZINC USD 09/07/10 ", "SPCL HIGH GRADE ZINC USD 09/07/10 ",
"SPCL HIGH GRADE ZINC USD 13/04/10 ", "SPCL HIGH GRADE ZINC USD 13/04/10 "
), CREATED.DATE = structure(c(14705, 14707, 14707, 14711, 14700,
14700, 14711, 14711, 14708, 14708, 14708, 14711, 14711, 14707,
14707, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708,
14708, 14708, 14708, 14708, 14672, 14673, 14678, 14678, 14700,
14700, 14700, 14700, 14700, 14700, 14700, 14705, 14707, 14707,
14707, 14708, 14708, 14708, 14708, 14708, 14622, 14634), class = "Date"),
QUANITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1,
1, 1, 2, 1, 1, 1, 1, 2, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1,
1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1), CLOSING.PRICE =
c("467.7500",
"467.7500", "467.7500", "467.7500", "78.1300", "78.1300",
"78.1300", "1,739.4000", "16.5400", "16.5400", "16.5400",
"16.5400", "16.5400", "1,353.0000", "1,353.0000", "1,353.0000",
"1,353.0000", "1,353.0000", "1,353.0000", "1,353.0000", "1,353.0000",
"1,353.0000", "1,353.0000", "1,353.0000", "1,353.0000", "2,415.9000",
"2,415.9000", "25,755.7100", "25,755.7100", "25,760.8600",
"25,760.8600", "2,355.9600", "2,355.9600", "2,355.9600",
"2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600", "2,357.1200",
"2,420.7300", "2,420.7300", "2,420.7300", "2,421.0500", "2,421.0500",
"2,421.0500", "2,421.0500", "2,421.0500", "2,388.4300", "2,388.4300"
)), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANITY",
"SETTLEMENT"), row.names = c(NA, -49L), class = "data.frame")
> str(pose)
'data.frame': 49 obs. of 4 variables:
$ DESCRIPTION : chr "WHEAT May/10 " "WHEAT May/10 " "WHEAT May/10 " "WHEAT May/10 " ...
$ CREATED.DATE:Class 'Date' num [1:49] 14705 14707 14707 14711 14700 ...
$ QUANITY : num 1 1 1 1 1 1 1 -1 1 1 ...
$ SETTLEMENT : chr "467.7500" "467.7500" "467.7500" "467.7500" ...
"pose$SETTLEMENT" has class "character" when it should have been
"numeric". So maybe a solution would be to assign a numeric class when I read
my .csv file?
I tried to change the class of this column right after the read.csv() call
(using type.convert() left me with a factor), but again I got rounded numbers
or NA.
So, what am I supposed to do?
Thank you for the help.
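One possible way to get a numeric column at import time is to read the values as text and strip the thousands separators immediately afterwards. This is a sketch under stated assumptions, not a definitive answer: the inline csv_text and the data frame demo are hypothetical stand-ins for LSCPos1.csv, whose real layout is not shown here, and the comma is again assumed to be a thousands separator only.

```r
# Hypothetical stand-in for the real file: quoted fields, comma-separated
csv_text <- 'DESCRIPTION,SETTLEMENT
"WHEAT May/10","467.7500"
"PLATINUM Jul/10","1,739.4000"
"PRIMARY NICKEL USD","25,760.8600"'

# Keep strings as character so no factor is created on import
demo <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Remove the commas, then convert the column to numeric in place
demo$SETTLEMENT <- as.numeric(gsub(",", "", demo$SETTLEMENT, fixed = TRUE))

str(demo)
```

With the real file, the same gsub() and as.numeric() step would be applied to the SETTLEMENT column of the data frame returned by the read call.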