[R] Weird read.xls behavior
David Winsemius
dwinsemius at comcast.net
Tue May 10 18:02:19 CEST 2011
On May 10, 2011, at 11:39 AM, Gabor Grothendieck wrote:
> On Tue, May 10, 2011 at 12:12 AM, Jun Shen <jun.shen.ut at gmail.com> wrote:
>> Kenneth,
>>
>> Thanks for the reply. I checked the original data. There is no space. I
>> even manually added a space to one value. After reading in with read.xls,
>> the value has two spaces. The reason I don't like it is I am going to do
>> some comparison with another dataset, which is supposed to be the same as
>> this one. Now I am getting a bunch of false negatives.
>
> It seems that the perl program underlying gdata's read.xls puts out
> lines like this:
While we are on the topic of gdata functions, I just looked at the trim
function and found that it does return a data.frame when one is offered
to it. (It was not clear from the documentation that a data.frame fell
under the classification of "character strings and other related
objects.") Using a data.frame in my workspace from an earlier question:
> require(gdata)
> Mat1$Time[1] <- "09:30 "
> Mat1
  Weight     Date   Time
1    7.6 04/28/11 09:30 
2    8.4 04/29/11  03:11
3    8.6 04/29/11  05:32
4    8.6 04/29/11  09:53
5    1.4 05/01/11  19:52
> trim(Mat1)
  Weight     Date  Time
1    7.6 04/28/11 09:30 # no space on my console
2    8.4 04/29/11 03:11
3    8.6 04/29/11 05:32
4    8.6 04/29/11 09:53
5    1.4 05/01/11 19:52
> nchar(trim(Mat1)$Time[1])
[1] 5
> nchar(Mat1$Time[1])
[1] 6
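For anyone without gdata installed, the same check can be sketched in base R. This is only an illustration: the data frame is rebuilt by hand from the printout above (the column types are an assumption), trim_chr_cols is a hypothetical helper and not a gdata function, and trimws requires R 3.2.0 or later.

```r
# Rebuild the example data by hand; the column types are an assumption.
Mat1 <- data.frame(
  Weight = c(7.6, 8.4, 8.6, 8.6, 1.4),
  Date   = c("04/28/11", "04/29/11", "04/29/11", "04/29/11", "05/01/11"),
  Time   = c("09:30 ", "03:11", "05:32", "09:53", "19:52"),  # note trailing space
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of gdata): strip leading and trailing
# whitespace from every character column and leave other columns untouched.
trim_chr_cols <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], trimws)
  df
}

Mat1_clean <- trim_chr_cols(Mat1)

nchar(Mat1$Time[1])        # length before trimming
nchar(Mat1_clean$Time[1])  # length after trimming
```

Trimming both datasets this way before comparing them should avoid mismatches that come only from stray whitespace.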
--
David.
>
> |"KAI-4169-002","830","5 mg" |
> where the | characters mark the beginning and end and are not part of
> the line. read.csv includes the space after the last double quote in
> the last field even though it is outside the double quotes.
>
> As an interim fix, edit the file at this location:
>
> system.file("perl", "xls2csv.pl", package = "gdata")
>
> removing the space before the \n in this line:
> print OutFile "$outputLine \n"
> so it becomes this:
> print OutFile "$outputLine\n"
>
> Now it should work.
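If editing the installed Perl script is not an option, a rough alternative is to strip the trailing whitespace from each line on the R side before parsing. The sketch below assumes the intermediate CSV lines are already available as a character vector (csv_lines is a made-up name holding the sample line quoted above); it is not how read.xls works internally.

```r
# The sample line from above, with one trailing space after the closing quote.
# csv_lines is a hypothetical stand-in for the intermediate CSV output.
csv_lines <- c('"KAI-4169-002","830","5 mg" ')

# Drop trailing spaces and tabs from each line before parsing.
cleaned <- sub("[ \t]+$", "", csv_lines)

# Parse the cleaned text; the last field no longer picks up the stray space.
dat <- read.csv(text = cleaned, header = FALSE, stringsAsFactors = FALSE)

dat$V3
nchar(dat$V3)
```

This only removes whitespace at the end of each line, so spaces inside quoted fields are left alone.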
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>
David Winsemius, MD
West Hartford, CT