[R] read.csv issue

Marc Schwartz (via MN) mschwartz at mn.rr.com
Wed Aug 16 21:10:14 CEST 2006


On Wed, 2006-08-16 at 14:43 -0400, Doran, Harold wrote:
> I'm trying to read in some data from a .csv format and have come across
> the following issue. Here is a simple example for replication:
> 
> # A sample .csv format
> schid,sch_name
> 331-802-7081,School One
> 464-551-7357,School Two
> 388-517-7627,School Three \& Four
> 388-517-4394,School Five
> 
> Note the third line includes the \ character. However, when I read the
> data in I get
> 
> > read.csv(file.choose())
>          schid              sch_name
> 1 331-802-7081            School One
> 2 464-551-7357            School Two
> 3 388-517-7627 School Three & Four
> 4 388-517-4394           School Five
> 
> It turns out to be very important to read in this character as I have a
> program that loops through a data set and Sweaves about 30,000 files.
> The variable sch_name gets dropped into the tex file using
> \Sexpr{tmp$sch_name}. However, if there is an &, the latex file won't
> compile properly. So, what I need is for the data to be read in as
> 
>          schid              sch_name
> 1 331-802-7081            School One
> 2 464-551-7357            School Two
> 3 388-517-7627 School Three \& Four
> 4 388-517-4394           School Five
> 
> I am obligated by a client to include the & in the school name, so
> eliminating that isn't an option. I thought maybe comment.char or quote
> would be what I needed, but they didn't resolve the issue. I'm certain
> I'm missing something simple; I just can't see it.
> 
> Any thoughts?
> 
> Harold

Harold,

What version of R and OS are you running?

Under:

 Version 2.3.1 Patched (2006-08-06 r38829)

 on FC5:

> read.csv("test.csv")
         schid              sch_name
1 331-802-7081            School One
2 464-551-7357            School Two
3 388-517-7627 School Three \\& Four
4 388-517-4394           School Five

The '\' is doubled in the printed output.
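
The doubling comes from how R prints strings, not from the data: print()
shows a backslash in its escaped form, while the string itself holds a
single '\'. A small sketch, using a hand-typed string rather than the
file, to show the difference:

```r
# A string holding exactly one backslash; "\\" in R source is one character.
x <- "School Three \\& Four"

print(x)        # [1] "School Three \\& Four"  (escaped form, backslash doubled)
writeLines(x)   # School Three \& Four         (raw characters, one backslash)
nchar(x)        # 20, so the backslash is counted once
```

Text written out with cat() or writeLines() is emitted in the raw form,
with the single backslash.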

Take note of the impact of the 'allowEscapes' argument:

> read.csv("test.csv", allowEscapes = TRUE)
         schid            sch_name
1 331-802-7081          School One
2 464-551-7357          School Two
3 388-517-7627 School Three & Four
4 388-517-4394         School Five

The '\' is lost.

Try it with 'allowEscapes = FALSE' explicitly.
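
For a self-contained check, here is a sketch that writes the sample data
to a temporary file and reads it both ways. The temporary file stands in
for test.csv, and stringsAsFactors = FALSE is my addition so that the
column can be compared as plain strings:

```r
# Write the sample data to a temporary file (stand-in for test.csv).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "schid,sch_name",
  "331-802-7081,School One",
  "464-551-7357,School Two",
  "388-517-7627,School Three \\& Four",   # one backslash in the file
  "388-517-4394,School Five"
), csv_path)

# allowEscapes = FALSE: the backslash is read verbatim and kept.
kept <- read.csv(csv_path, allowEscapes = FALSE, stringsAsFactors = FALSE)

# allowEscapes = TRUE: "\&" is not a recognised escape, so the '\' is dropped.
lost <- read.csv(csv_path, allowEscapes = TRUE, stringsAsFactors = FALSE)

# writeLines() shows the raw strings, without print()'s escaping.
writeLines(kept$sch_name[3])
writeLines(lost$sch_name[3])

unlink(csv_path)
```

The first writeLines() call should show the name with '\&' intact, which
is the form \Sexpr{} needs to pass to LaTeX.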

HTH,

Marc Schwartz


