[R] expected behavior when parsing lines with special characters
David Wolfskill
david at catwhisker.org
Tue Feb 15 18:26:16 CET 2011
On Tue, Feb 15, 2011 at 12:21:18PM -0500, Robert M. Flight wrote:
> Say I have a tab-delimited table I want to read into R. What should I
> expect to happen if some of the entries contain the character " ' "? I
> thought it would read the file fine, but that is not what happens.
> Instead, all the values in between two " ' "s get read into one field,
> and things are just seriously messed up. Is this a bug, and besides
> removing the offending characters, is there a fix?
>
> Example Input file:
>
> testFile.txt:
> 3499 9031 424823 COP'B2 118094989 XP_422637.2
> 3499 7955 114454 copb2 50080158 NP_001001940.1
> 3499 7227 45757 betaCop 24584107 NP_524836.2
> ...
>
> testDat <- read.table('testFile.txt',sep='\t')
> testDat
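I believe you want to disable quote processing. By default, read.table
treats both " and ' as quoting characters, so the apostrophe in COP'B2
opens a quoted string that runs until the next ' in the file. That is
the documented behaviour rather than a bug. Try:
testDat <- read.table('testFile.txt', sep='\t', quote="")
Ref. (from ?read.table):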
quote: the set of quoting characters. To disable quoting altogether,
use 'quote = ""'. See 'scan' for the behaviour on quotes
embedded in quotes. Quoting is only considered for columns
read as character, which is all of them unless 'colClasses'
is specified.
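
For what it's worth, here is a small self-contained sketch of the above. It
writes three lines modelled on the sample input to a temporary file (the
tempfile and the variable names other than testDat are just for
illustration, and I am assuming the real file is tab-separated exactly as
described), then reads them back with quoting disabled:

```r
# Illustrative data only: three tab-separated lines modelled on the
# example input, the first containing an apostrophe (COP'B2).
lines <- c("3499\t9031\t424823\tCOP'B2\t118094989\tXP_422637.2",
           "3499\t7955\t114454\tcopb2\t50080158\tNP_001001940.1",
           "3499\t7227\t45757\tbetaCop\t24584107\tNP_524836.2")

# Write to a temporary file so the sketch does not touch real data.
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# quote = "" turns off quote processing, so ' is read as ordinary data.
# stringsAsFactors = FALSE keeps text columns as character vectors.
testDat <- read.table(tmp, sep = "\t", quote = "",
                      stringsAsFactors = FALSE)

dim(testDat)     # 3 rows, 6 columns
testDat[1, 4]    # "COP'B2"
```

If some other field in the real file legitimately relies on double quotes,
quote="\"" (the default used by read.delim) may be a better fit than
turning quoting off entirely.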
>...
Peace,
david
--
David H. Wolfskill david at catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.
See http://www.catwhisker.org/~david/publickey.gpg for my public key.