[R] problem with scan recognizing newline '\n'

Peter Dalgaard P.Dalgaard at biostat.ku.dk
Wed Jun 17 12:19:02 CEST 2009


Mark Kimpel wrote:
> I'm using R to do some file processing in Linux and am trying to read
> in the output of find . -type f -print >
> ~/Music_Archives_search_problem/ls.output.find.txt
> 
> This command yields a text file with each line representing the full
> path name of all files in the directory and subdirs. Unfortunately,
> there seem to be some special characters that interfere with scan
> recognizing '\n' as newline. At least that's what I assume the problem
> is, but I can't identify which those might be or how to correct the
> problem. Below is my code and the problem output followed by
> sessionInfo(). This is executed in a loop, with i starting from zero.
> I also tried with 'allowEscapes = TRUE', but that made no difference.
> As you can see, the first FLAC file is followed by a '\n', which is
> ignored. This seems to happen about once in every 20 file names, so it
> does work properly most of the time. Also, when the file is opened in
> emacs, the newlines are recognized.
> 
> current.line <- scan("~/Music_Archives_search_problem/ls.output.find.txt",
>                      skip = i, nlines = 1, what = 'character',
>                      sep = "@", allowEscapes = FALSE)
> 
> [1] "./Christian/Christian Gospel/Chanticleer/Chanticleer - How Sweet
> the Sound; Spirituals & Traditional Gosp - 04 - Soon One Mornin
> Medley; Soon One Mornin-What You Gon Do When the
> ....flac\n./Christian/Christian Gospel/Chanticleer/Chanticleer - How
> Sweet the Sound; Spirituals & Traditional Gosp - 05 - Didnt It
> Rain.flac"
>

Hmm, do those songs have apostrophes in the title? Check the quote=
argument.

(Is the sep="@" actually doing anything? Otherwise readLines() would be
more to the point...)
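
A minimal sketch of both suggestions, in case it helps. As far as I recall, scan() treats ' and " as quote characters by default unless sep = "\n", so an unmatched apostrophe in one file name can swallow the newline up to the next apostrophe. The temporary file and the two paths below are made-up stand-ins for the find output, and the names tmp, first and all.lines are purely illustrative.

```r
# Hypothetical stand-in for the find output: two paths containing apostrophes
tmp <- tempfile(fileext = ".txt")
writeLines(c("./a/Soon One Mornin' Medley.flac",
             "./a/Didn't It Rain.flac"), tmp)

# Option 1: keep scan(), but switch quote processing off with quote = ""
# so that apostrophes are read as ordinary characters
first <- scan(tmp, what = "character", sep = "@", quote = "",
              skip = 0, nlines = 1, quiet = TRUE)

# Option 2: readLines() returns one string per line and does no quote handling
all.lines <- readLines(tmp)

# Both approaches should agree on the first line
identical(first, all.lines[1])
```

If the whole file fits in memory, reading it once with readLines() and indexing the result is probably also cheaper than calling scan() with skip = i inside a loop, since each such call rereads the file from the top.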

-pd

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907



