[R] misbehavior with extract_numeric() from tidyr

Jim Lemon drjimlemon at gmail.com
Mon Apr 20 12:09:16 CEST 2015


Hi arnaud,
At a guess, it is the hyphen in each of those two strings ("km-days"
and "MU-days"). I think the function you are using keeps hyphens as
possible minus signs rather than stripping them, so what is left over
("276-" and "83226-") is not a valid number and the result is NA.
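
A minimal sketch of what I mean, in base R. The pattern below is my
assumption of what extract_numeric() does internally (it matches my
reading of the tidyr source, but check your installed version), and
first_number() is a hypothetical helper of my own, not part of tidyr:

```r
# Assumption: extract_numeric() strips everything except digits, "." and "-"
strip_like_extract_numeric <- function(x) {
  gsub("[^0-9.-]+", "", as.character(x))
}

x <- c("Distance Walked 230 km",
       "Max Link Length x Days 276 km-days",
       "Largest Field MUs x Days 83,226 MU-days")

cleaned <- strip_like_extract_numeric(x)
print(cleaned)  # "230" "276-" "83226-": the hyphen survives the cleanup

# A trailing "-" is not a valid number, so as.numeric() warns and gives NA
print(suppressWarnings(as.numeric(cleaned)))

# Hypothetical helper: keep only the first run of digits and commas.
# Note that regmatches() silently drops elements with no match at all.
first_number <- function(x) {
  hit <- regmatches(x, regexpr("[0-9][0-9,]*", x))
  as.numeric(gsub(",", "", hit))
}
print(first_number(x))  # 230 276 83226
```

Unlike stripping every non-digit, taking the first match would not glue
two separate numbers in one string together, but it does ignore signs
and decimals, so treat it as a starting point only.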

Jim


On Mon, Apr 20, 2015 at 7:46 PM, arnaud gaboury
<arnaud.gaboury at gmail.com> wrote:
> On Mon, Apr 20, 2015 at 9:10 AM, arnaud gaboury
> <arnaud.gaboury at gmail.com> wrote:
>> R 3.2.0 on Linux
>> --------------------------------
>>
>> library(tidyr)
>>
>> playerStats <- c("LVL 10", "5,671,448 AP l6,000,000 AP",
>> "Unique Portals Visited 1,038",
>> "XM Collected 15,327,123 XM", "Hacks 14,268", "Resonators Deployed 11,126",
>> "Links Created 1,744", "Control Fields Created 294",
>> "Mind Units Captured 2,995,484 MUs",
>> "Longest Link Ever Created 75 km", "Largest Control Field 189,731 MUs",
>> "XM Recharged 3,006,364 XM", "Portals Captured 1,204",
>> "Unique Portals Captured 486",
>> "Resonators Destroyed 12,481", "Portals Neutralized 1,240",
>> "Enemy Links Destroyed 3,169",
>> "Enemy Control Fields Destroyed 1,394", "Distance Walked 230 km",
>> "Max Time Portal Held 240 days", "Max Time Link Maintained 15 days",
>> "Max Link Length x Days 276 km-days", "Max Time Field Held 4days",
>> "Largest Field MUs x Days 83,226 MU-days")
>>
>> -----------------------------------------------------------------------------------------------
>>  extract_numeric(playerStats)
>>  [1]             10 56714486000000           1038       15327123          14268          11126           1744            294        2995484
>> [10]             75         189731        3006364           1204            486          12481           1240           3169           1394
>> [19]            230            240             15             NA              4             NA
>>
>> ------------------------------------------------------------------------------------------------
>>  playerStats[c(22,24)]
>> [1] "Max Link Length x Days 276 km-days"      "Largest Field MUs x Days 83,226 MU-days"
>> --------------------------------------------------------------------------------------------
>>
>> I do not understand why these two elements return NA when
>> extract_numeric() works well for the others.
>>
>> Any wrong settings in my env?
>
> -------------------------------------------------------------------------
>  as.numeric(gsub("[^0-9]", "",playerStats))
>  [1]             10 56714486000000           1038       15327123          14268          11126           1744            294        2995484
> [10]             75         189731        3006364           1204            486          12481           1240           3169           1394
> [19]            230            240             15            276              4          83226
> --------------------------------------------------------------------
>
> The above command does the job, but I still cannot figure out why
> extract_numeric() returns two NAs.
>
>>
>> Thank you for hints.
>>
>>
>>
>> --
>>
>> google.com/+arnaudgabourygabx