[R] read.table: how to ignore errors?

R. Michael Weylandt michael.weylandt at gmail.com
Tue Jan 24 22:33:10 CET 2012


Given your domain name, you might also get some use out of the
system() and system2() functions, which pass strings to the OS
command line (and thus let you use tools like grep/sed/awk from
within R).

E.g., an idiom I use pretty frequently for interactive data analysis:
(not really related, but I think it makes a good example)

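FunctionToAnalyzeSomething <- function(...){
    pdf("junk.pdf")

    # plot stuff

    dev.off()
    # Build the path with file.path(): paste(..., sep = " ") would hand
    # the directory and the file name to "open" as two separate arguments.
    system(paste("open", shQuote(file.path(getwd(), "junk.pdf"))))
    # Keep a copy only if the user answers "y"
    if(readline("Keep? ") == "y") system("cp junk.pdf FileOutput.pdf")
    unlink("junk.pdf") # or system("rm junk.pdf")
}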

I would imagine you could use tryCatch + as.character() to get the bad
line number, and then make a temp file without that line with Unix
tools, and read that in. Some sort of determined.read.table() wrapper
to read.table()...
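
A rough sketch of what such a wrapper might look like. Only the name
determined.read.table() comes from the idea above; the body is an
assumption, not tested against real data. It also swaps techniques:
instead of parsing the tryCatch error message and filtering with Unix
tools, it uses count.fields() from base R to find lines whose field
count differs from the first line, which is assumed to be correct.

```r
# Hypothetical wrapper: read the well-formed lines, return the rest as text.
# Assumes the first line (header or first record) has the right field count.
determined.read.table <- function(file, sep = ",", header = TRUE, ...) {
    lines <- readLines(file)

    # One field count per line; blank.lines.skip = FALSE keeps the counts
    # aligned with 'lines' (blank lines then show up as bad lines).
    n <- count.fields(file, sep = sep, blank.lines.skip = FALSE)
    stopifnot(length(n) == length(lines))

    # NA counts (lines inside a multi-line quoted field) are treated as bad.
    good <- !is.na(n) & n == n[1]

    list(data = read.table(text = lines[good], sep = sep,
                           header = header, ...),
         bad  = lines[!good])
}

# Usage with the example from the original question
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2", "3,4", "5,,6", "7,8"), tmp)
res <- determined.read.table(tmp)
print(res$data)  # data frame built from the good lines
print(res$bad)   # character vector holding the rejected lines
unlink(tmp)
```

This reads the file twice, which keeps the code short at the cost of
speed on large files.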

Musing out loud...
Michael

On Tue, Jan 24, 2012 at 4:00 PM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
> On 24/01/2012 3:45 PM, Sam Steingold wrote:
>>
>> I get this error from read.table():
>> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>>   line 234 did not have 8 elements
>> The error is genuine (an extra field separator between 1st and 2nd
>> element).
>>
>> 1. is there a way to see this bad line 234 from R without diving into the
>> file?
>
>
> You could use readLines.  Skip 233 lines, read one.
>
>
>> 2. is there a way to ignore the bad lines and get the data from the good
>> lines only (I do want to see the bad lines, but I don't want to stop all
>> work until some issue which causes 1% of data is resolved).
>
>
> I think you would have to read the first part up to line 233, then read the
> part after line 234, then use rbind to join the two parts.  The latter might
> be tricky if you need a header line; it may be easiest to rewrite the file
> to a tempfile().
>
> Duncan Murdoch
>
>
>> thanks.
>>
>> Oh, yeah, a reproducible example:
>>
>> read.csv from
>> =====
>> a,b
>> 1,2
>> 3,4
>> 5,,6
>> 7,8
>> =====
>> I want to be able to extract the data frame
>>   a b
>> 1 1 2
>> 2 3 4
>> 3 7 8
>>
>> and a list of strings of length 1 containing "5,,6".
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


