[Rd] (PR#8777) strsplit does [not] return correct value when spliting ""

Thomas Friedrichsmeier thomas.friedrichsmeier at ruhr-uni-bochum.de
Mon Apr 17 23:50:38 CEST 2006


Prof Brian Ripley wrote:
> On Mon, 17 Apr 2006, Charles Dupont wrote:
[...]
> > The man page states in the value section that strsplit returns:
> >      A list of length 'length(x)' the 'i'-th element of which contains
> >      the vector of splits of 'x[i]'.
> >
> > It mentions no change in behavior if the value of x[i] = "".
>
> There is none, for there are no splits in that case.  I did ask you to
> point to the documentation of the rule you are assuming, and I can't find
> any.

No, the documentation does not explicitly mention this, but shouldn't "there 
are no splits" mean that the string is returned unchanged?
Consider these examples - I don't think that's the behavior you'd expect 
unless told otherwise:

a <- "a"
b <- ""
a == strsplit (a, ",")	# TRUE
b == strsplit (b, ",")	# FALSE

So, maybe there is a general rule that empty elements get purged?

strsplit ("a,,b", ",")
[[1]]
[1] "a" ""  "b"

strsplit ("a", "a")
[[1]]
[1] ""

Apparently not so. Then why does an empty string get "split" into a 
zero-length vector (character(0)) instead of being returned as ""?
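
To make the difference concrete without relying on ==, here is a minimal 
sketch that inspects the return values directly. It uses only base R; the 
names res_a and res_b are just for illustration:

```r
# Inspect what strsplit() returns for a non-empty and an empty input.
res_a <- strsplit("a", ",")   # separator does not occur in "a"
res_b <- strsplit("", ",")    # empty input string

length(res_a[[1]])            # 1: the input comes back as the single piece "a"
length(res_b[[1]])            # 0: a zero-length character vector, not ""

identical(res_a[[1]], "a")            # TRUE
identical(res_b[[1]], character(0))   # TRUE
identical(res_b[[1]], "")             # FALSE
```

So in both cases there are no splits, but only the non-empty input is 
returned as-is.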

Note: I don't really care much about what the behavior is, but if the 
described behavior is indeed intended, I think it should be documented. IMO 
it's pretty counterintuitive.

Regards
Thomas