[R] read.table() with "\t" as separator, all other programs report equal fields each row, read.table() returns unequal row length error

Yong Wang wangyong1 at gmail.com
Wed Mar 16 17:37:20 CET 2011


Hi list,

R is undoubtedly my favorite statistics tool; however, the data
input part has long been a pain. Most of the data I have to deal with is
irregular and contains special characters.

Recently I got a tab-delimited data file. read.table(filename, sep="\t")
consistently returns errors saying that certain rows do not have xyz elements,
while other programs such as perl, python, and awk all report equal row
lengths when "\t" is used as the separator.
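
To make the mismatch concrete, here is a minimal sketch of how the per-row
field counts could be compared from within R. The file contents are made up
for illustration, and the variable names are hypothetical. count.fields() is
the base R helper that counts fields per line; by default it uses the same
quote and comment characters as read.table().

```r
# Made-up example file: a header plus three tab-separated rows.
# Row 2 contains two apostrophes and row 3 contains a "#".
path <- tempfile(fileext = ".tsv")
writeLines(c("id\tname\tnote",
             "1\tO'Brien\tD'Arcy",
             "2\tunit #4\tSmith",
             "3\tJones\tok"), path)

# Field counts under the default quote ("\"'") and comment ("#") settings.
default_counts <- count.fields(path, sep = "\t")

# Field counts with quoting and comment handling switched off, which is
# closer to what a plain split on "\t" in perl/python/awk does.
raw_counts <- count.fields(path, sep = "\t", quote = "", comment.char = "")

# Lines where the two counts disagree (or where the default count is NA)
# are the ones worth inspecting by hand.
suspect_rows <- which(is.na(default_counts) | default_counts != raw_counts)
print(suspect_rows)
```

Lines flagged this way would be the ones where a quote or comment character
is probably changing how the row is split.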

I looked through the problematic rows. Sometimes the cause is that a row
contains a "#", so I went back and specified comment.char="".
Then some other problem comes up, and for some rows I simply can't
figure out what the problem is.
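
For reference, this is roughly the kind of call I mean, as a sketch on a
made-up file. comment.char = "" is the setting mentioned above. quote = "" is
an extra assumption on my part: read.table() treats both ' and " as quote
characters by default, so stray apostrophes in the data may be another cause.
I have not confirmed that this explains the remaining rows.

```r
# Made-up example file with an apostrophe-laden row and a "#" row.
path <- tempfile(fileext = ".tsv")
writeLines(c("id\tname\tnote",
             "1\tO'Brien\tD'Arcy",
             "2\tunit #4\tSmith",
             "3\tJones\tok"), path)

# Read every field literally: no quote characters, no comment character.
dat <- read.table(path, sep = "\t", header = TRUE,
                  quote = "", comment.char = "",
                  stringsAsFactors = FALSE)

# Show the resulting column types and values.
str(dat)
```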

Can any guru suggest how to avoid this pain now and in the
future? Is CSV a safer format? Or can anyone tell me the
fundamental principles I must bear in mind when doing preliminary data
processing with other programs such as perl, to ensure the output can
be readily fed into R?

best

yong


