[R] Eliminate level information

darrelkj darrelkj at mail.uc.edu
Sat Jul 9 21:38:28 CEST 2011


Hi, I hope this formatting is correct, as this is my first time posting.

I am trying to do comparisons of values in a data frame that has some factor
variables.
One instance is 

> train$sex[2]
[1]  Male
Levels:  Female  Male

So the value is Male, but a comparison like "Male" == train$sex[2]
always returns FALSE, apparently because of the level information included.
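
A minimal reproduction may help narrow this down. The `train` data frame is not shown, so `sex` below is a hypothetical stand-in, and it assumes the level labels carry a leading space, as the doubled spaces in the printed output suggest. Under that assumption the mismatch comes from the whitespace in the labels rather than from the factor itself:

```r
# Hypothetical stand-in for train$sex; the labels are assumed to carry a
# leading space, as the printed output suggests.
sex <- factor(c(" Female", " Male", " Male"))

# Comparing a factor with a string compares against the level labels,
# so an exact match works even though `sex` is a factor.
exact <- sex[2] == " Male"

# Without the leading space the labels differ, so this does not match.
loose <- sex[2] == "Male"

# Stripping surrounding whitespace from the labels lets "Male" match.
levels(sex) <- gsub("^[[:space:]]+|[[:space:]]+$", "", levels(sex))
fixed <- sex[2] == "Male"

print(c(exact = exact, loose = loose, fixed = fixed))
```

If the real labels turn out to have no stray whitespace, this sketch does not explain the problem and the output of levels(train$sex) would be the next thing to check.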

Another problem this creates is 

> factor(train$workclass[25:30])
[1]  Private    Local-gov  Private    NA         Private  
[6]  Private  
Levels:  Local-gov  NA  Private

> is.na(train$workclass[25:30])
[1] FALSE FALSE FALSE FALSE FALSE FALSE

These are all FALSE, apparently because of the levels data in the comparison.
This would seem to be a bug, because I thought that NA was a protected keyword,
but it is being used here as a level. So the value now fails the missing-value
test for two reasons: the level information, and NA itself being a level.
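
One possible reading of the output is that "NA" was read from the file as an ordinary string, so it became a level label rather than a true missing value. The sketch below assumes exactly that. `workclass` is a hypothetical stand-in for train$workclass[25:30], and the leading space in the labels is also an assumption:

```r
# Hypothetical stand-in for train$workclass[25:30]; " NA" is assumed to be
# an ordinary string from the input, leading space included.
workclass <- factor(c(" Private", " Local-gov", " Private",
                      " NA", " Private", " Private"))

# " NA" is a level label here, not a missing value, so nothing is flagged.
before <- is.na(workclass)

# Setting that level to a real NA turns the matching entries into
# missing values and drops the label from the level set.
levels(workclass)[levels(workclass) == " NA"] <- NA
after <- is.na(workclass)

print(before)
print(after)
```

If the data came from read.table() or read.csv(), passing strip.white = TRUE so that the default na.strings = "NA" can match might avoid the issue at import time. That is a guess, since the import step is not shown.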

I tried a conversion using data.matrix(), but that gets rid of all factor
information and makes things worse.  Is there a way to suppress the
'Levels:  Female  Male' line?
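
For the printing question, two options seem relevant, sketched here on a hypothetical stand-in for train$sex. as.character() gives plain strings with no level information at all, while the max.levels argument of the factor print method only hides the "Levels:" line and leaves the object a factor:

```r
# Hypothetical stand-in for train$sex.
sex <- factor(c(" Female", " Male"))

# as.character() returns plain strings, so no "Levels:" line is printed.
as_text <- as.character(sex[2])
print(as_text)

# The print method for factors accepts max.levels; 0 omits the "Levels:"
# line while the object itself stays a factor.
print(sex[2], max.levels = 0)
```

Neither option changes how comparisons behave. They only affect what is displayed or the type of the value being compared.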

I hope this makes sense, thanks.

--
View this message in context: http://r.789695.n4.nabble.com/Eliminate-level-information-tp3656643p3656643.html
Sent from the R help mailing list archive at Nabble.com.


