[R] factor(300000, levels=1:300000) gives NA
William Dunlap
wdunlap at tibco.com
Sat Sep 20 18:10:01 CEST 2014
You can work around this issue by matching the types of the 'x'
and 'levels' arguments to factor():
> factor(300000, as.numeric(299999:300001)) # both are floating point ('numeric')
[1] 3e+05
Levels: 299999 3e+05 300001
> factor(as.integer(300000), 299999:300001) # both are integer
[1] 300000
Levels: 299999 300000 300001
If the types do not match, you get undesirable results:
> factor(300000, 299999:300001) # x is numeric, levels is integer
[1] <NA>
Levels: 299999 300000 300001
> factor(300000L, as.numeric(299999:300001)) # x is integer, levels is numeric
[1] <NA>
Levels: 299999 3e+05 300001
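If you cannot control the types of both arguments up front, one option is to
look 'x' up in 'levels' with match(), which compares integer and double values
numerically, and then build the factor from the matched level values so that
both arguments to factor() have the same type. The following is only a
minimal sketch: factor_by_value() is a hypothetical helper name, not part of
base R, and it assumes 'levels' is a plain integer or numeric vector.

```r
# Hypothetical helper (not in base R): find x in 'levels' by numeric value,
# then pass the matched level values to factor() so the types agree.
factor_by_value <- function(x, levels) {
  idx <- match(x, levels)   # integer and double are compared as numbers here
  factor(levels[idx], levels = levels)
}

f1 <- factor_by_value(300000, 299999:300001)               # double x, integer levels
f2 <- factor_by_value(300000L, as.numeric(299999:300001))  # integer x, double levels
print(f1)
print(f2)

# The character forms that factor() compares can depend on the type:
print(as.character(300000))    # double
print(as.character(300000L))   # integer
```

Values of 'x' that are not among 'levels' still come out as NA, as they
would with factor() itself.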
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Sat, Sep 20, 2014 at 3:52 AM, Suharto Anggono
<suharto_anggono at yahoo.com> wrote:
> In R:
>
>> factor(300000, levels=1:300000)
> [1] <NA>
> 300000 Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ... 300000
>
> The NA above is undesirable in my view, because 300000 is in 1:300000.
>
>
> I have just got bitten by it.
>
>
> I have figured out why it happens. The results of 'as.character' are different.
>
>> as.character(300000)
> [1] "3e+05"
>> as.character((1:300000)[300000])
> [1] "300000"
>
>
>> sessionInfo()
> R version 3.1.1 (2014-07-10)
> Platform: i386-w64-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base