[R] Weird error (special character) of read.table

Henrik Bengtsson hb at biostat.ucsf.edu
Tue Feb 22 17:00:02 CET 2011


On Tue, Feb 22, 2011 at 7:43 AM, John Edwards <jhnedwards603 at gmail.com> wrote:
> Hi,
>
> I have the following input file.
> $ cat main.txt
> CEL_A CELL_B
> 1 4
> 2 5
> 2 6
>
> Then I run read.table in R.
>
>> f=read.table('main.txt', header=T, check.names=F, sep='\t')
>> head(f)
>  \ufeffCEL_A CELL_B
> 1    1      4
> 2    2      5
> 3    2      6
>> f$CEL_A
> NULL
>
> I'm not sure where the special character \ufeff comes from. Could anybody
> let me know what is the problem?

Looks like the Unicode character called 'byte order mark' (BOM), cf.

  http://en.wikipedia.org/wiki/Byte_order_mark
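
One way to confirm this is to look at the first three bytes of the file: a UTF-8 BOM is the byte sequence EF BB BF. The snippet below is only a sketch. It writes a small stand-in file to a temporary path (an assumption, so that it runs on its own) rather than touching your 'main.txt', and the names `path`, `first_bytes` and `has_bom` are made up for illustration.

```r
# Build a stand-in for 'main.txt': a UTF-8 BOM followed by tab-separated data.
path <- tempfile(fileext = ".txt")
con <- file(path, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)          # the BOM bytes
writeBin(charToRaw("CEL_A\tCELL_B\n1\t4\n"), con)   # header and one data row
close(con)

# Read the first three bytes and compare them with the UTF-8 BOM.
first_bytes <- readBin(path, what = "raw", n = 3)
has_bom <- identical(first_bytes, as.raw(c(0xEF, 0xBB, 0xBF)))
print(first_bytes)
print(has_bom)
```

To check the real file, point readBin() at 'main.txt' instead of the temporary path.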

It looks like your 'main.txt' text file was created by software that
saves it as a Unicode-encoded text file.  If you need a plain
old-style ASCII text file, see if you can resave it as such.  With
last year's development in R, it is also not unlikely that you can
tell R to read in the existing file by specifying the encoding, but
since I don't know how to do that I leave that as a search-the-help
exercise for you.
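
For reference, a minimal sketch of the specify-the-encoding route. It assumes a later version of R (3.0.0 or newer), where connections accept the encoding name "UTF-8-BOM" and drop a leading BOM while reading. That is not something the R version from the question is known to support. The temporary file is a stand-in for 'main.txt' so that the example runs on its own.

```r
# Build a stand-in for 'main.txt': a UTF-8 BOM followed by tab-separated data.
path <- tempfile(fileext = ".txt")
con <- file(path, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("CEL_A\tCELL_B\n1\t4\n2\t5\n2\t6\n"), con)
close(con)

# Same call as in the question, plus fileEncoding = "UTF-8-BOM" so that
# the BOM is removed before the header line is parsed (assumes R >= 3.0.0).
f <- read.table(path, header = TRUE, check.names = FALSE, sep = "\t",
                fileEncoding = "UTF-8-BOM")

print(names(f))
print(f$CEL_A)
```

With the BOM gone, the first column is named plain CEL_A, so f$CEL_A no longer comes back as NULL.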

/Henrik

>
> Thanks,
> John
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


