[R] invalid multibyte string at '<b0>C'

Fisher Dennis fisher at plessthan.com
Sat Apr 12 01:02:40 CEST 2014


R 3.0.2
OS X Mavericks

Colleagues

I have a file that I converted from SAS (sas7bdat) to CSV (filename: ORIGINAL.csv).  When I try to read it with read.csv, I receive the error message:
	Error in type.convert(data[[i]], as.is = as.is[i], dec = dec, na.strings = character(0L)) : 
	  invalid multibyte string at '<b0>C'
The problem resolves if I delete a single character from each of lines 2 and 4 of the file (filename: FIXED.csv).

readLines can read both files without problem and displays the offending character as:
	\xb0
which appears to be a degree sign.
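
To illustrate why this byte might trip up read.csv, here is a minimal sketch. It assumes (I have not confirmed this for the SAS export) that the file is Latin-1, where the single byte 0xb0 is the degree sign, and that R is running in a UTF-8 locale, where a lone 0xb0 is not a valid character. The string "25<b0>C" is a made-up sample, not a line taken from ORIGINAL.csv.

```r
# Build the made-up sample "25<b0>C" from raw bytes (assumption: Latin-1 source)
x <- rawToChar(as.raw(c(0x32, 0x35, 0xb0, 0x43)))

# iconv() returns NA when the input is not valid in the 'from' encoding,
# so this checks whether x is valid UTF-8 as it stands
not_utf8 <- is.na(iconv(x, from = "UTF-8", to = "UTF-8"))

# Re-interpret the bytes as Latin-1 and convert them to UTF-8
y <- iconv(x, from = "latin1", to = "UTF-8")

# Code points of the converted string; 176 is U+00B0, the degree sign
codes <- utf8ToInt(y)
```

If the assumption holds, not_utf8 is TRUE and codes contains 176, which would be consistent with the error appearing only once type.convert inspects the text.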

I also tried:
	read.csv(textConnection(readLines("ORIGINAL.csv")))
and encountered the same error message.

In the past, I have encountered the same problem with Greek symbols (e.g., mu) and other special characters.

Short of editing the input file, is there a simple solution within R so that I can read the input data into a dataframe?
One possible (but ugly) solution would be:
	TEMP	<- readLines(FILENAME)
	TEMP	<- gsub(offendingcharacter, replacementcharacter, TEMP)
However, this would require that I find all possible offending characters and the corresponding replacements.
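
A less manual variant of the same idea might be to convert every line at once with iconv rather than replacing characters one by one. The sketch below assumes the whole file is Latin-1, which is a guess on my part; read_latin1_csv is a hypothetical helper name, not an existing function.

```r
# Hypothetical helper: assumes every line of the file is encoded in Latin-1
read_latin1_csv <- function(path, ...) {
  # Read the raw lines; readLines does not validate multibyte sequences
  lines <- readLines(path, warn = FALSE)
  # Convert all lines from the assumed Latin-1 encoding to UTF-8
  lines <- iconv(lines, from = "latin1", to = "UTF-8")
  # Parse the converted text; further read.csv arguments pass through '...'
  read.csv(text = lines, stringsAsFactors = FALSE, ...)
}
```

Usage would be read_latin1_csv("ORIGINAL.csv"). If the Latin-1 guess is right, read.csv("ORIGINAL.csv", fileEncoding = "latin1") might achieve the same thing in one call; if the guess is wrong, either approach could silently produce incorrect characters.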

The files are available for inspection at:
	http://www.plessthan.com/FILES/ARCHIVE.zip

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com



