[R] R 3.1 changes to type.convert causing strings where I used to get numeric
murdoch.duncan at gmail.com
Wed Apr 30 22:40:12 CEST 2014
On 30/04/2014, 4:20 PM, Bos, Roger wrote:
> Dear R-help,
> I recently upgraded to R 3.1 patched, and code that previously ran fine is now giving a lot of errors because the data is coming in as strings instead of numeric. I can fix my code by wrapping each item I want to use with as.numeric(), but that seems very inefficient.
> I looked at the change list for R 3.1 and I see the first item is a change in type.convert() that seems to be causing me grief. The suggestion is to use colClasses, but when I try to do so I get an error regarding the quotes again...
>> ann1 <- read.table(driveletter %+% "/snap/ann/snap_fyr_ann1_" %+% i %+% ".txt", header=TRUE, quote="", as.is=TRUE, colClasses='numeric')
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
> scan() expected 'a real', got '"1033.7"'
> Does anyone have any suggestions?
This sounds like two separate problems. The first is the change to
type.convert(). That's going to be dealt with (it's already in R-devel,
will eventually be handled in R-patched and 3.1.1). You deserve part of
the blame for this for never testing the pre-release version. It would
have been easier to fix before release, but nobody who tested then
bothered to report it.
The second problem is one I don't recall hearing reported before. If
you have a .csv file containing the lines
X
"1"
then it should be readable as a .csv, because the quotes should be
stripped. In fact it is readable now if you *don't* specify that the
column is numeric, and it will be converted to a numeric value.
However, if you do use colClasses="numeric" you'll get an error. That
looks wrong, though it is consistent with ?read.table.
> x <- c("X", '"1"')
> read.csv(textConnection(x), colClasses="numeric")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,
scan() expected 'a real', got '"1"'
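One possible workaround, sketched with the same toy input as above: read the column as character, so that the usual quote handling strips the quotes, and convert afterwards. The text= argument is used here instead of textConnection() only to keep the sketch self-contained; treat it as an illustration, not as the eventual fix in R itself.

```r
# Same input as in the example above: a header line and one quoted field.
x <- c("X", '"1"')

# Quotes are only interpreted for columns read as character,
# so read the column as character first ...
df <- read.csv(text = x, colClasses = "character")

# ... and convert to numeric explicitly afterwards.
df$X <- as.numeric(df$X)

str(df)
```

This costs one as.numeric() call per affected column, not one per use of the data.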
> CHANGES IN R 3.1.0:
> NEW FEATURES
> type.convert() (and hence by default read.table()) returns a character vector or factor when representing a numeric input as a double would lose accuracy. Similarly for complex inputs.
> If a file contains numeric data with unrepresentable numbers of decimal places that are intended to be read as numeric, specify colClasses in read.table() to be "numeric".
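As a rough illustration of that NEWS suggestion, assuming the fields are not quoted: colClasses can be given per column, so only the columns that need it are forced to numeric. The data and the column names id and value below are made up for the example.

```r
# Hypothetical data: an identifier column and a value with many decimal places.
txt <- c("id,value", "a,0.1234567890123456789")

# One class per column, in column order; 'value' is read directly as a double.
df <- read.csv(text = txt, colClasses = c("character", "numeric"))

str(df)
```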