[R] Loading 'akward' data file

John Fox jfox at mcmaster.ca
Wed Jun 16 15:17:50 CEST 2004


Dear Andy and Iago,

Here's a quick and dirty solution which, I think, produces the desired
result. The solution depends upon the structure of the data file being the
same as Iago's example, but could be made more flexible:

> readFile <- function(fileName){
+     con <- file(fileName, open="r")
+     lines <- readLines(con)
+     result <- list()
+     line <- 1
+     first <- TRUE
+     while (line <= length(lines)) {
+         if (length(grep("^var", lines[line])) > 0) {
+             if (!first) result[[name]] <- values
+             first <- FALSE
+             name <- lines[line]
+             values <- NULL
+             }
+         else {
+             values <- c(values,
+                 eval(parse(text=paste("c(",lines[line],")"))))
+             }
+         line <- line + 1
+         }
+     result[[name]] <- values
+     close(con)
+     result
+     }
>     
> readFile("c:/temp/test.txt")
$var1
[1] 123.33

$var2
[1] 938

$var3
[1] 1 1 1 1 1 1

$var4
 [1] 1 2 3 4 5 1 2 3 4 5
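
A variant of the same idea, offered only as a sketch: it uses scan(text=...)
in place of eval(parse()), so the file contents are never evaluated as R
code, and it drops blank lines explicitly. It assumes that variable-name
lines start with "var" and that every other non-blank line holds only
comma-separated numbers. The name readFile2 is hypothetical, chosen for
illustration.

```r
# Sketch: same result as readFile(), without eval(parse()).
# Assumptions: name lines start with "var"; all other non-blank
# lines contain comma-separated numbers.
readFile2 <- function(fileName) {
    lines <- trimws(readLines(fileName))
    lines <- lines[nzchar(lines)]          # drop blank separator lines
    isName <- grepl("^var", lines)         # TRUE on variable-name lines
    # Block index of each data line; explicit levels keep one slot per
    # variable even if a variable has no data lines.
    group <- factor(cumsum(isName)[!isName],
                    levels = seq_len(sum(isName)))
    result <- lapply(split(lines[!isName], group), function(block) {
        if (length(block) == 0) return(numeric(0))
        scan(text = block, sep = ",", quiet = TRUE)
    })
    names(result) <- lines[isName]
    result
}

# Usage on a temporary copy of Iago's example
tmp <- tempfile()
writeLines(c("var1", "123.33", "var2", "938", "",
             "var3", "1,1,1,1,1,1",
             "var4", "1,2,3,4,5", "1,2,3,4,5"), tmp)
str(readFile2(tmp))
```

Grouping the lines with cumsum() replaces the explicit while() loop; the
loop version above is easier to adapt if the name lines follow some other
pattern.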

I hope that this helps,
 John 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Liaw, Andy
> Sent: Wednesday, June 16, 2004 7:31 AM
> To: 'Iago Mosqueira'; r-help at stat.math.ethz.ch
> Subject: RE: [R] Loading 'akward' data file
> 
> Generally you'd use file() to open the file, then use 
> readLines(), say inside a while() loop, to read one `chunk' at 
> a time.  However, your example looks a bit strange.  The 
> possibility of an empty line makes it a bit more complicated, but 
> the last couple of lines seem to suggest that you could 
> have a line of data followed by another line of data without a 
> variable label.
> If that's true, I don't know how you would parse the file...
> 
> Andy
> 
> > From: Iago Mosqueira
> > 
> > Hello,
> > 
> > I need to load a somewhat difficult data file. It has lines with 
> > variable names followed by a variable number of rows and columns of 
> > data, separated from the next variable sometimes by a blank line, 
> > sometimes simply by the new variable name. For example:
> > 
> > var1
> > 123.33
> > var2
> > 938
> > 
> > var3
> > 1,1,1,1,1,1
> > var4
> > 1,2,3,4,5
> > 1,2,3,4,5
> > 
> > What would be the best strategy for loading a file like this? I would 
> > like to have it stored as a list with the variable names used to name 
> > the slots. Any pointers?
> > 
> > Many thanks,
> > 
> > 
> > Iago
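
Andy's suggestion of reading from an open connection one line at a time
can be sketched roughly as follows. This is an illustration, not a tested
general solution: it assumes name lines start with "var", that data lines
hold comma-separated numbers, and that consecutive data lines belong to
the most recent variable. The name readChunks is hypothetical.

```r
# Sketch: read the file incrementally with readLines(con, n = 1).
readChunks <- function(fileName) {
    con <- file(fileName, open = "r")
    on.exit(close(con))                    # close even if an error occurs
    result <- list()
    name <- NULL
    while (length(line <- readLines(con, n = 1)) > 0) {
        line <- trimws(line)
        if (!nzchar(line)) next            # skip blank separator lines
        if (grepl("^var", line)) {
            name <- line                   # start a new variable
            result[[name]] <- numeric(0)
        } else if (!is.null(name)) {
            # append this row of numbers to the current variable
            result[[name]] <- c(result[[name]],
                                as.numeric(strsplit(line, ",")[[1]]))
        }
    }
    result
}
```

Reading one line per call avoids holding the whole file in memory, which
matters only for very large files; for a file the size of the example,
reading everything at once is simpler.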



