[R] Weird error (special character) of read.table
Henrik Bengtsson
hb at biostat.ucsf.edu
Tue Feb 22 17:00:02 CET 2011
On Tue, Feb 22, 2011 at 7:43 AM, John Edwards <jhnedwards603 at gmail.com> wrote:
> Hi,
>
> I have the following input file.
> $ cat main.txt
> CEL_A CELL_B
> 1 4
> 2 5
> 2 6
>
> Then I run read.table in R.
>
>> f=read.table('main.txt', header=T, check.names=F, sep='\t')
>> head(f)
> \ufeffCEL_A CELL_B
> 1 1 4
> 2 2 5
> 3 2 6
>> f$CEL_A
> NULL
>
> I'm not sure where the special character \ufeff comes from. Could anybody
> let me know what is the problem?
Looks like the Unicode character called 'byte order mark' (BOM), cf.
http://en.wikipedia.org/wiki/Byte_order_mark
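One way to confirm this is to look at the first bytes of the file: a UTF-8 BOM is the byte sequence EF BB BF. The following is a minimal sketch, not a tested recipe for your file. `has_utf8_bom` is a hypothetical helper name, and a temporary file stands in for 'main.txt'.

```r
# Hypothetical helper: TRUE if the file starts with the UTF-8 BOM (EF BB BF).
has_utf8_bom <- function(path) {
  first <- readBin(path, what = "raw", n = 3L)
  identical(first, as.raw(c(0xEF, 0xBB, 0xBF)))
}

# Build a stand-in for 'main.txt' that begins with a BOM.
demo <- tempfile(fileext = ".txt")
con <- file(demo, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("CEL_A\tCELL_B\n1\t4\n"), con)
close(con)

has_utf8_bom(demo)
```

Running `has_utf8_bom("main.txt")` on the real file would show whether the stray character comes from such a mark.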
Since the BOM ends up as part of the first column name, f$CEL_A
matches no column and returns NULL. Your 'main.txt' text file was
probably created by software that saves it as a Unicode-encoded text
file. If you need a plain old-style ASCII text file, see if you can
resave it as such. With last year's development in R, it is also not
unlikely that you can tell R to read in the existing file by specifying
the encoding, but since I don't know how to do that I leave that as a
search-the-help exercise for you.
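One possible outcome of that exercise, as a hedged sketch: R versions from 3.0.0 onward (later than this thread) document a "UTF-8-BOM" encoding name for connections, which drops a leading BOM on reading. The temporary file below is a stand-in for 'main.txt', and it is an assumption that the original file was UTF-8 rather than, say, UTF-16.

```r
# Write a small tab-separated file that starts with a UTF-8 byte order mark
# (bytes EF BB BF), mimicking the file from the original post.
path <- tempfile(fileext = ".txt")
con <- file(path, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("CEL_A\tCELL_B\n1\t4\n2\t5\n2\t6\n"), con)
close(con)

# fileEncoding = "UTF-8-BOM" asks R to strip the BOM while decoding the file.
f <- read.table(path, header = TRUE, check.names = FALSE, sep = "\t",
                fileEncoding = "UTF-8-BOM")

names(f)   # column names, now without the leading BOM
f$CEL_A    # the first column is reachable by name again
```

If the file turns out to be UTF-16, a different `fileEncoding` value would be needed; see ?file for the encodings R recognizes.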
/Henrik
>
> Thanks,
> John