[R] read.csv and field containing single quotes
Benilton Carvalho
beniltoncarvalho at gmail.com
Tue Mar 27 01:09:38 CEST 2012
I need to read in csv files, created by a third party, with fields
containing a single, unescaped double-quote character (as shown below).
"header1","header2","header3","header4"
"field1r1","field2r1","field3r1","field4r1"
"field1r2","field2r2","field3r2PartA), field3r2PartB Very" Long","field4r2"
"field1r3","field2r3","field3r3","field4r3"
read.csv(filename, quote="\"'", header=TRUE) won't read the file
represented above unless the 3rd line has Very"" (a doubled quote)
instead of Very" (a single quote). This is documented (see the scan()
man page).
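To illustrate the point about doubling, here is a minimal sketch that reads the same four lines with the quote in row 2 already doubled. The names csv_text and dat are only illustrative, and the text= argument is used so that no file is needed.

```r
# The sample from above, with the embedded quote in row 2 doubled
# (Very"" instead of Very"), supplied as in-memory lines.
csv_text <- c(
  '"header1","header2","header3","header4"',
  '"field1r1","field2r1","field3r1","field4r1"',
  '"field1r2","field2r2","field3r2PartA), field3r2PartB Very"" Long","field4r2"',
  '"field1r3","field2r3","field3r3","field4r3"'
)

# Same quote= setting as in the call above; text= reads from the vector
dat <- read.csv(text = csv_text, quote = "\"'", header = TRUE,
                stringsAsFactors = FALSE)

dim(dat)          # number of rows and columns actually parsed
dat$header3[2]    # the doubled quote is read back as one quote character
```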
Assuming that the creation of such csv files is something I'm not in a
position to interfere with, are there (preferably "all in R")
suggestions on how to handle such a task?
For the moment, I'm using my poor man's solution (below), but any
tricks that would simplify this task would be great.
Thank you very much,
benilton
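parser <- function(fname, header=TRUE, stringsAsFactors=FALSE){
    txt <- readLines(fname)
    ## strip the quote that opens and the quote that closes each line
    txt <- gsub("^\"|\"$", "", txt)
    ## split on the literal "," sequence between fields
    txt <- strsplit(txt, "\",\"", fixed=TRUE)
    ## double any quote left inside a field, then stack the rows in a matrix
    txt <- do.call(rbind, lapply(txt, function(x) gsub("\"", "\"\"", x)))
    if (header){
        nms <- txt[1,]
        ## drop=FALSE keeps a matrix even when only one data row remains
        txt <- txt[-1, , drop=FALSE]
    }
    txt <- as.data.frame(txt, stringsAsFactors=stringsAsFactors)
    if (header) names(txt) <- nms
    txt
}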
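One possible variation, sketched here only as an idea, is to double the stray quotes with a regular expression and then hand the repaired lines to read.csv. The function name read_loose_csv is hypothetical. The sketch assumes every field is quoted, fields are separated by exactly "," and no stray quote sits directly next to a comma. It also assumes the file contains no correctly doubled quotes, since those would be doubled again.

```r
# Hypothetical helper: repair lone quotes inside fields, then use read.csv.
read_loose_csv <- function(fname, header = TRUE, stringsAsFactors = FALSE) {
  txt <- readLines(fname)
  # A quote counts as "embedded" when it has a non-comma character on both
  # sides; quotes at the line start/end or next to a separator are untouched.
  txt <- gsub('(?<=[^,])"(?=[^,])', '""', txt, perl = TRUE)
  read.csv(text = txt, quote = "\"", header = header,
           stringsAsFactors = stringsAsFactors)
}

# Try it on the sample from the top of this message, via a temporary file
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  '"header1","header2","header3","header4"',
  '"field1r1","field2r1","field3r1","field4r1"',
  '"field1r2","field2r2","field3r2PartA), field3r2PartB Very" Long","field4r2"',
  '"field1r3","field2r3","field3r3","field4r3"'
), tmp)

dat <- read_loose_csv(tmp)
str(dat)
unlink(tmp)
```

Unlike the parser above, this leaves the actual parsing (and any type conversion) to read.csv, at the cost of the assumptions listed.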