[R] issue with nzchar() ?

Tue Aug 7 21:25:25 CEST 2012

On Mon, Aug 6, 2012 at 5:27 PM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:
> On Mon, Aug 6, 2012 at 9:53 AM, Liviu Andronic <landronimirc at gmail.com> wrote:
>> On Mon, Aug 6, 2012 at 4:48 PM, Liviu Andronic <landronimirc at gmail.com> wrote:
>>> string, something that I find strange. At best NA is the equivalent of
>>> an empty string.
>
> Certainly not to my mind, unless you think that zero and NA should be
> the same for integers and doubles as well. NA (in whatever form) is,
> to my mind, _unknown_ which is very different than knowing 0.
>
This is a tricky question and I don't have a strong opinion yet.

> I'm not sure why that's the case, but it's documented on the help page
> (under value):
>
>  For ‘nchar’, an integer vector giving the sizes of each element,
>      currently always ‘2’ for missing values (for ‘NA’).
>
I most certainly missed this bit in the help page.

> My guess is that it's this way for backward compatibility from a time when
> there probably wasn't a proper NA_character_ (that's the parser
> literal for a character NA) and they really were just "NA" (the
> string) -- perhaps in some far distant R 3.0 we'll see
> nchar(NA_character_) = NA_integer_
>
As David has also suggested (and Bert alluded), it may be worth giving
nchar() a returnNA=FALSE argument which, if TRUE, would return NA for
any NA values in the original vector.
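
A rough sketch of what I have in mind, written as a user-level wrapper
rather than a change to nchar() itself. The names nchar_na and returnNA
are made up for illustration and are not part of base R:

```r
# Hypothetical wrapper around nchar(); `nchar_na` and `returnNA` are
# illustrative names only, not an existing base R interface.
nchar_na <- function(x, returnNA = FALSE, ...) {
  n <- nchar(x, ...)
  # When requested, report missing inputs as missing lengths
  if (returnNA) n[is.na(x)] <- NA_integer_
  n
}

x <- c("abc", "", NA_character_)
nchar_na(x, returnNA = TRUE)   # 3, 0, NA
```

With returnNA = FALSE the wrapper simply returns whatever the installed
nchar() gives for NA, so the current behaviour is left untouched.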

Thank you all for the comments. Regards
Liviu