[R] Looking for simple line-splitting code
peter dalgaard
pdalgd using gmail.com
Wed Feb 5 21:27:09 CET 2025
A third option could be:
scan(text=x, what="", blank.lines.skip=FALSE)
(All this because readLines() doesn't follow the text=x convention; perhaps it should? Also, I'm unsure whether the textConnection is left open in Rui's method.)
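For anyone trying these out, here is a minimal sketch of both ideas. It rests on two assumptions that are not stated in this thread: that the connection-based method amounts to readLines() on a textConnection(), and that lines may contain spaces or quote characters, which is why sep= and quote= are set in the scan() variant. The function names are made up for illustration.

```r
x <- c("abc\ndef", "", "ghi")

# scan() variant with an explicit line separator and no quote processing,
# so spaces and quote characters inside a line are kept as-is.
# Caveat: under the default na.strings, a line consisting only of "NA"
# is read as a missing value.
split_lines_scan <- function(x) {
  scan(text = x, what = "", sep = "\n", quote = "",
       blank.lines.skip = FALSE, quiet = TRUE)
}

# Assumed form of the connection-based approach. The connection is closed
# explicitly on exit, so nothing is left open either way.
split_lines_con <- function(x) {
  con <- textConnection(x)
  on.exit(close(con))
  readLines(con)
}

split_lines_scan(x)
split_lines_con(x)
```

Without sep="\n", scan() splits on any white space, so a line such as "a b" would come back as two elements.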
-pd
> On 5 Feb 2025, at 15:35 , Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>
> Thanks to Rui, Peter and Tanvir! Peter's seems to be the fastest of the 3 suggestions so far on the little test case, but on the real data (where x contains several thousand lines), Rui's seems best.
>
> Duncan
>
> On 2025-02-05 9:13 a.m., peter dalgaard wrote:
>> This also seems to work:
>> > strsplit(paste(x, collapse = "\n"), "\n")[[1]]
>> [1] "abc" "def" "" "ghi"
>>> On 5 Feb 2025, at 14:44 , Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>>
>>> If I have this object:
>>>
>>> x <- c("abc\ndef", "", "ghi")
>>>
>>> and I write it to a file using `writeLines(x, "test.txt")`, my text editor sees a 5 line file:
>>>
>>> 1: abc
>>> 2: def
>>> 3:
>>> 4: ghi
>>> 5:
>>>
>>> which is what I'd expect: the last line in the editor is empty. If I use `readLines("test.txt")` on that file, I get the vector
>>>
>>> c("abc", "def", "", "ghi")
>>>
>>> and all of that is fine.
>>>
>>> What I'm looking for is simple code that modifies x to the `readLines()` output, without actually writing and reading it.
>>>
>>> My first attempt doesn't work:
>>>
>>> unlist(strsplit(x, "\n"))
>>>
>>> because it leaves out the blank line 3. I can fix that with this ugly code:
>>>
>>> lines <- strsplit(x, "\n")
>>> lines[sapply(lines, length) == 0] <- list("")
>>> lines <- unlist(lines)
>>>
>>> Surely there's a simpler way to do this? I'd like to use just base functions, no other packages.
>>>
>>> Duncan Murdoch
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com