[Rd] (PR#8777) strsplit does [not] return correct value when splitting ""

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Apr 17 20:01:08 CEST 2006


On Mon, 17 Apr 2006, Charles Dupont wrote:

> Now using R 2.3.0.
>
> I have a string that can be "".  I want to find the max screen width of
> all the lines in the string, so I run the commands
>
>  > x <- c("hello", "bob is\ngreat", "foo", "", "bar")
>  > substrings <- strsplit(x, "\n")
>  > sapply(substrings, FUN=function(x) max(nchar(x, type="width")))
> which returns
> [1]    5    6    3 -Inf    3

That's a problem with your use of max.  Try max(0, ...).
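
A minimal sketch of that suggestion, reusing the vector from the report above. It assumes that a width of 0 is the answer wanted for an empty string; that choice is an assumption for illustration, not something the report states.

```r
# The same input as in the report.
x <- c("hello", "bob is\ngreat", "foo", "", "bar")
substrings <- strsplit(x, "\n")

# strsplit("", "\n") yields character(0), so nchar() returns integer(0)
# and max() alone gives -Inf.  Supplying 0 as an extra argument gives
# max() something to fall back on.
widths <- sapply(substrings, function(s) max(0, nchar(s, type = "width")))
widths
# [1] 5 6 3 0 3
```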

> This happens because of the behavior of strsplit for a string that is not ""
>  > strsplit("Hello\nBob", "\n")
>
> it returns
> [[1]]
> [1] "Hello" "Bob"
>
>
> for a string that is ""
>  > strsplit("", "\n")
>
> it returns
> [[1]]
> character(0)
>
>
> I would expect
> [[1]]
> [1] ""
>
> because "" is a character vector of length 1 containing a string of length
> 0, not a character vector of length 0.
>
> For any other string if the split string is not matched in argument x
> then it returns the original string x.
>
> The man page states in the value section that strsplit returns:
>      A list of length 'length(x)' the 'i'-th element of which contains
>      the vector of splits of 'x[i]'.
>
> It mentions no change in behavior if the value of x[i] = "".

There is none, for there are no splits in that case.  I did ask you to 
point to the documentation of the rule you are assuming, and I can't find 
any.
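
If code does need an empty input string to come back as a single "", one possible approach is a small wrapper around strsplit. The sketch below is illustrative only: split_keep_empty is a hypothetical helper name, not part of R, and it simply post-processes the documented result.

```r
# Hypothetical helper: like strsplit(), but an element with no splits
# (character(0)) is replaced by a single empty string.
split_keep_empty <- function(x, split) {
  parts <- strsplit(x, split)
  empty <- sapply(parts, length) == 0
  parts[empty] <- list("")
  parts
}

res <- split_keep_empty(c("a b", "", " "), " ")
# res[[1]] is c("a", "b"); res[[2]] and res[[3]] are both ""
```

Note that this discards the distinction between "" and " " as inputs, which the unmodified strsplit preserves.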

>
> Prof Brian Ripley wrote:
>> Please use a current version of R: we are at 2.3.0RC (and we do ask you
>> not to report on obsolete versions).
>>
>> What rule are you using, and where did you find it in the R documentation?
>>
>> In fact
>>
>>> strsplit("", " ")
>>
>> [[1]]
>> character(0)
>>
>> which is not as you stated.   This is a feature, as it is distinct from
>>
>>> strsplit(" ", " ")
>>
>> [[1]]
>> [1] ""
>>
>> Consider also
>>
>>> strsplit("", "")
>>
>> [[1]]
>> character(0)
>>
>>> strsplit("a", "")
>>
>> [[1]]
>> [1] "a"
>>
>>> strsplit("ab", "")
>>
>> [[1]]
>> [1] "a" "b"
>>
>>
>> On Mon, 17 Apr 2006, charles.dupont at vanderbilt.edu wrote:
>>
>>> Full_Name: Charles Dupont
>>> Version: 2.2.0
>>> OS: linux
>>> Submission from: (NULL) (160.129.129.136)
>>>
>>>
>>> when
>>>
>>> strsplit("", " ")
>>>
>>> returns character(0)
>>>
>>> whereas
>>>
>>> strsplit("a", " ")
>>>
>>> returns "a".
>>>
>>> These return values are not consistent with each other.
>>>
>>> Charles Dupont
>>>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


