[R] invalid multibyte string at '<b0>C'
Fisher Dennis
fisher at plessthan.com
Sat Apr 12 01:02:40 CEST 2014
R 3.0.2
OS X Mavericks
Colleagues
I have a file that I converted from SAS (sas7bdat) to CSV (filename: ORIGINAL.csv). When I try to read it with read.csv, I receive the error message:
Error in type.convert(data[[i]], as.is = as.is[i], dec = dec, na.strings = character(0L)) :
invalid multibyte string at '<b0>C'
The problem resolves if I delete a single character from each of lines 2 and 4 of the file (filename: FIXED.csv).
readLines can read both files without problem and displays the offending character as:
\xb0
which appears to be a degree sign.
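The degree-sign reading can be checked with a short sketch. It assumes the byte comes from a Latin-1 (ISO-8859-1) source, which has not been confirmed for this file; the variable names are only illustrative.

```r
# The single byte 0xb0, as readLines displays it
bad <- rawToChar(as.raw(0xb0))

# Assumption: the byte was written in Latin-1 (ISO-8859-1)
converted <- iconv(bad, from = "latin1", to = "UTF-8")

# Unicode code point of the converted character
code_point <- utf8ToInt(converted)
code_point  # 176, i.e. U+00B0 DEGREE SIGN
```

If the file were in a different single-byte encoding, the same byte could map to another character, so this only shows what 0xb0 means under the Latin-1 assumption.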
I also tried:
read.csv(textConnection(readLines("ORIGINAL.csv")))
and encountered the same error message.
In the past, I have encountered the same problem with Greek symbols (e.g., mu) and other special characters.
Short of editing the input file, is there a simple solution within R so that I can read the input data into a dataframe?
One possible (but ugly) solution would be:
TEMP <- readLines(FILENAME)
TEMP <- gsub(offendingcharacter, replacementcharacter, TEMP)
However, this would require that I find all possible offending characters and the corresponding replacements.
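A minimal sketch of that workaround for the degree sign alone is shown here. The temporary file and its contents are made up for illustration and stand in for ORIGINAL.csv; the replacement text "deg" is likewise an arbitrary choice.

```r
# Hypothetical stand-in for ORIGINAL.csv: a small file containing the raw byte 0xb0
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, "wb")
writeBin(c(charToRaw("id,temp\n1,37"), as.raw(0xb0), charToRaw("C\n")), con)
close(con)

# The offending character, built from its byte value
bad <- rawToChar(as.raw(0xb0))

TEMP <- readLines(tmp)
# fixed = TRUE and useBytes = TRUE make gsub match byte by byte,
# so the invalid multibyte string is never interpreted as characters
TEMP <- gsub(bad, "deg", TEMP, fixed = TRUE, useBytes = TRUE)

# Parse the cleaned lines as CSV
DATA <- read.csv(text = TEMP, stringsAsFactors = FALSE)
unlink(tmp)
```

If the whole file turned out to be in one known encoding, converting every line with iconv(TEMP, from = "latin1", to = "UTF-8") before calling read.csv might avoid listing characters one at a time; the "latin1" value is an assumption, not something confirmed for this file.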
The files are available for inspection at:
http://www.plessthan.com/FILES/ARCHIVE.zip
Dennis
Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com