[R] Converting chr to num
Spencer Graves
@pencer@gr@ve@ @end|ng |rom e||ect|vede|en@e@org
Mon Aug 20 07:39:20 CEST 2018
Have you considered "Ecfun::asNumericChar" (and
"Ecfun::asNumericDF")?
DF <- data.frame(variable = c("12.6% ", "30.9%", "61.4%", "1"))
Ecfun::asNumericChar(DF$variable)
[1] 0.126 0.309 0.614 1.000
If you read the documentation, including the examples, you will
see that many of these issues, and others, are handled automatically in
the way that I thought was most sensible. If you disagree, we can
discuss other examples and perhaps modify the code for those functions.
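
For readers without Ecfun installed, the same idea can be sketched in base R. This is only a rough illustration, not the Ecfun implementation: the helper name pct_to_num is hypothetical, and it assumes that only entries ending in "%" should be divided by 100.

```r
# Hypothetical helper (not part of Ecfun): convert strings such as "12.6% "
# to numbers, dividing by 100 only where a percent sign was present.
pct_to_num <- function(x) {
  x <- trimws(as.character(x))          # as.character() also handles factors
  is_pct <- grepl("%$", x)              # which entries end in a percent sign
  out <- as.numeric(sub("%$", "", x))   # strip the sign, then coerce
  out[is_pct] <- out[is_pct] / 100      # rescale percentages to proportions
  out
}

DF <- data.frame(variable = c("12.6% ", "30.9%", "61.4%", "1"))
result <- pct_to_num(DF$variable)
print(result)
```

Trimming whitespace first keeps the anchored pattern "%$" usable even when the strings carry trailing spaces.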
Spencer Graves
On 2018-08-20 00:26, Rui Barradas wrote:
> Hello,
>
> Inline.
>
> On 20/08/2018 01:08, Daniel Nordlund wrote:
>> See comment inline below:
>>
>> On 8/18/2018 10:06 PM, Rui Barradas wrote:
>>> Hello,
>>>
>>> It also works with class "factor":
>>>
>>> df <- data.frame(variable = c("12.6%", "30.9%", "61.4%"))
>>> class(df$variable)
>>> #[1] "factor"
>>>
>>> as.numeric(gsub(pattern = "%", "", df$variable))
>>> #[1] 12.6 30.9 61.4
>>>
>>>
>>> This works because sub() and gsub() coerce their input to character
>>> and return a character vector, so the instruction is equivalent to
>>> what the help page ?factor documents in its Warning section:
>>>
>>> To transform a factor f to approximately its original numeric
>>> values, as.numeric(levels(f))[f] is recommended and slightly more
>>> efficient than as.numeric(as.character(f)).
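
A small sketch of the point quoted from ?factor, adapted to the percent strings in this thread. Note that since R 4.0.0 data.frame() no longer converts strings to factors by default, so stringsAsFactors = TRUE is set explicitly here; the variable names are only illustrative.

```r
# Force a factor column regardless of the R version in use
df <- data.frame(variable = c("12.6%", "30.9%", "61.4%"),
                 stringsAsFactors = TRUE)
f <- df$variable

# as.numeric() on the factor itself returns the integer codes, not the values
codes <- as.numeric(f)

# Strip "%" from the levels (a character vector), convert once per level,
# then index by the factor to recover one value per observation
values <- as.numeric(sub("%", "", levels(f), fixed = TRUE))[f]

print(codes)
print(values)
```

Working on levels(f) converts each distinct string only once, which is where the efficiency remark in the help page comes from.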
>>>
>>>
>>> Also, I would still prefer
>>>
>>> as.numeric(sub(pattern = "%$","",df$variable))
>>> #[1] 12.6 30.9 61.4
>>>
>>> The pattern is stricter, and there is no need to search and replace
>>> multiple occurrences of '%'.
>>
>> The pattern is more strict, and that could cause the conversion to
>> fail if the process that created the strings resulted in trailing
>> spaces.
>
> That's true, and I had thought of that but it wasn't in the OP's
> problem description.
> The '$' could still be used with something like "%\\s*$":
>
> as.numeric(sub('%\\s*$', '', df$variable))
> #[1] 12.6 30.9 61.4
>
>
> Rui Barradas
>
>
>> Without the '$' the conversion succeeds.
>>
>> df <- data.frame(variable = c("12.6% ", "30.9%", "61.4%"))
>> as.numeric(sub('%$', '', df$variable))
>> [1]   NA 30.9 61.4
>> Warning message:
>> NAs introduced by coercion
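
The patterns discussed in this thread can be compared side by side on the input with a trailing space. This is a minimal sketch; the variable names are only illustrative, and trimws() is shown as one further option that was not proposed in the thread.

```r
x <- c("12.6% ", "30.9%", "61.4%")   # first element has a trailing space

# Anchored pattern: "%$" does not match "12.6% ", so the "%" stays and
# coercion yields NA for that element (warning suppressed for the demo)
strict <- suppressWarnings(as.numeric(sub("%$", "", x)))

# Anchored pattern that tolerates trailing whitespace
lenient <- as.numeric(sub("%\\s*$", "", x))

# Trim first, then use the strict pattern
trimmed <- as.numeric(sub("%$", "", trimws(x)))

# Unanchored removal: "12.6 " still converts because as.numeric()
# ignores leading and trailing whitespace
unanchored <- as.numeric(gsub("%", "", x))

print(strict)
print(lenient)
print(trimmed)
print(unanchored)
```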
>>
>>
>> <<<snip>>>
>>
>>
>> Dan
>>
>