[R] Looking for simple line-splitting code

peter dalgaard pdalgd using gmail.com
Wed Feb 5 21:27:09 CET 2025


A 3rd option could be

scan(text=x, what="", blank.lines.skip=FALSE)

(All this because readLines() doesn't obey the text=x convention; perhaps it should? I'm unsure whether the textConnection is left open in Rui's method.)
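
For anyone worried about the connection question, here is a minimal sketch that closes the text connection explicitly. It assumes Rui's method was something along the lines of readLines(textConnection(x)), which is not quoted in this message; the helper name split_lines is hypothetical. The scan() call is the one above with quiet=TRUE added; with its default separator scan() splits on any whitespace, so that variant presumably only matches readLines() when the lines contain no spaces.

```r
x <- c("abc\ndef", "", "ghi")

# Hypothetical helper (name chosen for illustration only): read the lines of x
# through a text connection that is closed when the function exits.
split_lines <- function(x) {
  con <- textConnection(x)
  on.exit(close(con))   # close the connection even if readLines() fails
  readLines(con)
}

via_con <- split_lines(x)

# The scan() option from this message; quiet = TRUE only suppresses the
# "Read N items" message.
via_scan <- scan(text = x, what = "", blank.lines.skip = FALSE, quiet = TRUE)
```

Calling readLines(textConnection(x)) directly leaves the connection to be cleaned up by the garbage collector, which typically shows up later as a "closing unused connection" warning; the on.exit() call avoids that.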

-pd

> On 5 Feb 2025, at 15:35, Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
> 
> Thanks to Rui, Peter and Tanvir!  Peter's seems to be the fastest of the 3 suggestions so far on the little test case, but on the real data (where x contains several thousand lines), Rui's seems best.
> 
> Duncan
> 
> On 2025-02-05 9:13 a.m., peter dalgaard wrote:
>> This also seems to work:
>> > strsplit(paste(x, collapse="\n"), "\n")[[1]]
>> [1] "abc" "def" ""    "ghi"
>>> On 5 Feb 2025, at 14:44, Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>> 
>>> If I have this object:
>>> 
>>>  x <- c("abc\ndef", "", "ghi")
>>> 
>>> and I write it to a file using `writeLines(x, "test.txt")`, my text editor sees a 5 line file:
>>> 
>>>  1: abc
>>>  2: def
>>>  3:
>>>  4: ghi
>>>  5:
>>> 
>>> which is what I'd expect:  the last line in the editor is empty.  If I use `readLines("test.txt")` on that file, I get the vector
>>> 
>>>  c("abc", "def", "", "ghi")
>>> 
>>> and all of that is fine.
>>> 
>>> What I'm looking for is simple code that modifies x to the `readLines()` output, without actually writing and reading it.
>>> 
>>> My first attempt doesn't work:
>>> 
>>>  unlist(strsplit(x, "\n"))
>>> 
>>> because it leaves out the blank line 3.  I can fix that with this ugly code:
>>> 
>>>  lines <- strsplit(x, "\n")
>>>  lines[sapply(lines, length) == 0] <- list("")
>>>  lines <- unlist(lines)
>>> 
>>> Surely there's a simpler way to do this?  I'd like to use just base functions, no other packages.
>>> 
>>> Duncan Murdoch
>>> 
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com


