[R] read.table mystery

Sun Mar 6 18:47:58 CET 2011

Thank you for pointing this out. This is really inconvenient as I do not 
know a priori how many and where those darn cases containing an additional 
(or more) ":" might be ... 

The following seems to work, but will fail if there's a "1:sdfjhlfkh:2:adlkjf" 
somewhere (1 & 2 both coercible to integer).

na.exclude(as.integer(scan("/tmp/testfile.txt",sep=":",what="integer")))

More robust pointers anyone?
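
One direction that might be more robust, sketched under the assumption that 
every line has the form "<integer>:<free text>" and that only the first ":" 
is a real separator: read the raw lines and split each one at its first 
colon only. The temporary file and its three lines are made up for 
illustration (they stand in for /tmp/testfile.txt), and the names `id`, 
`text` and `res` are likewise hypothetical.

```r
# Hypothetical stand-in for /tmp/testfile.txt; the real file is not shown here.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "1:first entry",
  "2:NADPH:adrenodoxin oxidoreductase",
  "3:sdfjhlfkh:4:adlkjf"
), tmp)

lines <- readLines(tmp)

# Position of the FIRST ":" in each line (-1 if a line has none).
pos <- regexpr(":", lines, fixed = TRUE)

# Everything before the first ":" becomes the integer id,
# everything after it is kept verbatim, later colons included.
id   <- as.integer(substr(lines, 1, pos - 1))
text <- substr(lines, pos + 1, nchar(lines))

res <- data.frame(id = id, text = text, stringsAsFactors = FALSE)
```

Because later colons are never treated as separators, a line such as 
"3:sdfjhlfkh:4:adlkjf" contributes only its leading 3 to `id`; the 4 stays 
inside `text`.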

Joh

Sarah Goslee wrote:

> Not so much a mystery. read.table() only looks at the first 5 lines when
> deciding how many columns your file has (as described in the Details
> section of the help).
> 
> The easiest solution is to add a col.names argument to read.table() with
> the correct number of names.
> 
> You may want to also include as.is=TRUE if you don't want your data to
> be imported as factors. If you expect character but have factor you may
> get unexpected results later.
> 
> Sarah
> 
> On Sun, Mar 6, 2011 at 5:04 AM, Johannes Graumann
> <johannes_graumann at web.de> wrote:
>> Hello,
>>
>> Please have a look at the code below, which I use to read in the attached
>> file. As line 18 of the file reads "1065:>sp|Q9V3T9|ADRO_DROME
>> NADPH:adrenodoxin oxidoreductase, mitochondrial OS=Drosophila
>> melanogaster GN=dare PE=2 SV=1", I expect the code below to produce a 3
>> column data frame with most of the last column empty and line 18 to
>> produce a data.frame row like so:
>>
>> V1
>>        1065
>> V2
>>        >sp|Q9V3T9|ADRO_DROME NADPH
>> V3
>>        adrenodoxin oxidoreductase, mitochondrial OS=Drosophila
>> melanogaster GN=dare PE=2 SV=1
>>
>> Why is that not so?
>>
>> Thanks for any hint.
>>
>> Sincerely, Joh
>>
>> read.table(
>>  "/tmp/testfile.txt",
>>  sep=":",
>>  header=FALSE,
>>  quote="",
>>  fill=TRUE
>> )[19,]
> 
> ---
> Sarah Goslee
> http://www.functionaldiversity.org
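
The col.names suggestion quoted above can be combined with count.fields() 
when the number of columns is not known in advance. The following is a 
minimal sketch, not a definitive recipe: the temporary file and its contents 
are invented for illustration, and `n_col` and `dat` are hypothetical names.

```r
# Hypothetical test file: the only 3-field line comes after the first five lines.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "1:a", "2:b", "3:c", "4:d", "5:e",
  "6:NADPH:adrenodoxin oxidoreductase"
), tmp)

# count.fields() goes through the whole file, so it also sees the
# 3-field line that lies beyond the first five lines.
n_col <- max(count.fields(tmp, sep = ":", quote = ""))

dat <- read.table(
  tmp,
  sep = ":",
  header = FALSE,
  quote = "",
  fill = TRUE,
  col.names = paste0("V", seq_len(n_col)),  # one name per column found
  as.is = TRUE                              # keep character data as character
)
```

With the column names supplied, the late line is split into V1, V2 and V3 
instead of being wrapped onto an extra row. Any further ":" inside the text 
would still create additional columns, so this only addresses the column 
count, not the ambiguity of the separator.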