[R] read.table: how to ignore errors?

Sam Steingold sds at gnu.org
Tue Jan 24 22:40:57 CET 2012


> * Duncan Murdoch <zheqbpu.qhapna at tznvy.pbz> [2012-01-24 16:00:14 -0500]:
>
> On 24/01/2012 3:45 PM, Sam Steingold wrote:
>> I get this error from read.table():
>> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>>    line 234 did not have 8 elements
>> The error is genuine (an extra field separator between 1st and 2nd element).
>>
>> 1. is there a way to see this bad line 234 from R without diving into the file?
>
> You could use readLines.  Skip 233 lines, read one.

This is no good.
What if the data is compressed (or coming from a socket)?
What if the line is 233,000,000?
How do I extract that 234 number from the error message? Is there an
exception object or something?
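
One possible way to get at the number, as a rough sketch rather than a confirmed answer: tryCatch() hands the handler a condition object, and conditionMessage() returns its text, which can then be parsed. The sample file and the names path, result and bad_line are made up for illustration. The regular expression assumes the English wording of the message quoted above, and I have not verified whether the reported number counts the header line.

```r
# Hypothetical sample file: the 8th physical line has an extra separator.
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c",
             "1,2,3", "1,2,3", "1,2,3", "1,2,3", "1,2,3", "1,2,3",
             "4,,5,6",
             "7,8,9"), path)

# Return the condition object instead of aborting.
result <- tryCatch(
  read.table(path, sep = ",", header = TRUE),
  error = function(e) e
)

bad_line <- NA_integer_
if (inherits(result, "error")) {
  msg <- conditionMessage(result)
  # Assumption: message looks like "line N did not have K elements".
  m <- regmatches(msg, regexec("line ([0-9]+) did not have", msg))[[1]]
  if (length(m) == 2) bad_line <- as.integer(m[2])
}
print(bad_line)
```

Parsing message text is fragile (it breaks under a translated locale), so this is a workaround, not a proper exception interface.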

>> 2. is there a way to ignore the bad lines and get the data from the good
>> lines only (I do want to see the bad lines, but I don't want to stop all
>> work until some issue which causes 1% of data is resolved).
>
> I think you would have to read the first part up to line 233, then
> read the part after line 234, then use rbind to join the two parts.
> The latter might be tricky if you need a header line; it may be
> easiest to rewrite the file to a tempfile().

This is awkward. What if there are many errors there?
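
For the many-errors case, a sketch of one workaround: read the raw lines once (readLines() also accepts a connection such as gzfile()), count the fields per line with count.fields(), and pass only the well-formed lines to read.table(). The sample data and the names raw, n_fields, ok and good are hypothetical. The sketch assumes the header line has the correct field count, that no quoted field spans several lines, and that the data fits in memory.

```r
# Hypothetical sample file: the 3rd line has an extra separator.
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,,5,6", "7,8,9"), path)

raw <- readLines(path)            # single pass over the source

# Count fields per line; disable comment and blank-line skipping so
# that n_fields lines up one-to-one with raw.
tc <- textConnection(raw)
n_fields <- count.fields(tc, sep = ",", comment.char = "",
                         blank.lines.skip = FALSE)
close(tc)

expected <- n_fields[1]           # assumption: the header is well-formed
ok <- !is.na(n_fields) & n_fields == expected

bad_idx <- which(!ok)             # line numbers of the bad lines
bad_text <- raw[!ok]              # the bad lines themselves, for inspection

# Parse only the good lines.
good <- read.table(text = raw[ok], sep = ",", header = TRUE)
```

This keeps the bad lines available for later inspection while the rest of the work proceeds, at the cost of holding the whole input in memory.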

https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14793

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://pmw.org.il http://www.memritv.org http://truepeace.org
http://camera.org http://openvotingconsortium.org http://iris.org.il
Incorrect time synchronization.


