[R] read.table: how to ignore errors?
Rolf Turner
rolf.turner at xtra.co.nz
Tue Jan 24 22:38:07 CET 2012
On 25/01/12 09:45, Sam Steingold wrote:
> I get this error from read.table():
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
> line 234 did not have 8 elements
> The error is genuine (an extra field separator between 1st and 2nd element).
>
> 1. is there a way to see this bad line 234 from R without diving into the file?
>
> 2. is there a way to ignore the bad lines and get the data from the good
> lines only (I do want to see the bad lines, but I don't want to stop all
> work until some issue which causes 1% of data is resolved).
>
> thanks.
>
> Oh, yeah, a reproducible example:
>
> read.csv from
> =====
> a,b
> 1,2
> 3,4
> 5,,6
> 7,8
> =====
> I want to be able to extract the data frame
>   a b
> 1 1 2
> 2 3 4
> 3 7 8
>
> and a list of strings of length 1 containing "5,,6".
Try:
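xxx <- readLines("<filename>")
# Parse the header line on its own to get the column names.
hhh <- read.csv(textConnection(xxx[1]), header=FALSE, stringsAsFactors=FALSE)
nms <- unlist(hhh[1,], use.names=FALSE)
yyy <- NULL
bad <- list()
j <- 0
for(i in seq_along(xxx)[-1]) {
    tmp <- read.csv(textConnection(xxx[i]), header=FALSE)
    if(ncol(tmp)==length(nms)) {
        names(tmp) <- nms    # rbind() matches data frame columns by name
        yyy <- rbind(yyy,tmp)
    } else {
        j <- j+1
        bad[[j]] <- xxx[i]   # keep the raw text of the offending line
    }
}
closeAllConnections()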
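If the file is large, parsing it one line at a time will be slow. A possible alternative, offered only as a sketch, is to let count.fields() report how many fields each line has, keep the raw text of the lines whose count differs from the header's, and hand the remaining lines to read.csv() in one call. The temporary file below stands in for the real one, the variable names are made up for illustration, and the sketch assumes that the header line has the correct number of fields.

```r
# Stand-in for the real file: the example data from the question,
# written to a temporary file.
path <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2", "3,4", "5,,6", "7,8"), path)

# Number of comma-separated fields on each line, header included.
# blank.lines.skip = FALSE keeps one count per line, so the counts
# line up with the output of readLines(); comment.char = "" matches
# the read.csv() default.
n_fields <- count.fields(path, sep = ",", quote = "\"",
                         blank.lines.skip = FALSE, comment.char = "")

lines <- readLines(path)
expected <- n_fields[1]            # assumption: the header is well formed
is_bad <- is.na(n_fields) | n_fields != expected

bad <- as.list(lines[is_bad])      # raw text of the malformed lines
bad_line_numbers <- which(is_bad)  # their positions in the file

# Parse only the well-formed lines (the header is among them).
good <- read.csv(text = lines[!is_bad])
```

Here bad_line_numbers answers the first question (which lines are broken, and lines[bad_line_numbers] shows them), while good and bad answer the second.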
HTH
cheers,
Rolf Turner
More information about the R-help mailing list