[R] Wide character in print?

Marc Schwartz marc_schwartz at me.com
Mon Feb 4 18:33:17 CET 2013


On Feb 4, 2013, at 10:39 AM, Spencer Graves <spencer.graves at structuremonitoring.com> wrote:

> Hello:
> 
> 
> 	  Googling for "Wide characters in print" led me to a discussion that pushed me to review the "read.table" help page.  Careful study there suggested I try setting "fileEncoding" to something;  it suggested I look at the "Encoding" section in the help file for "file".  This suggested that anything I got to work on my computer might not be portable.
> 
> 
> 	  Suggestions?
> 	  Thanks,
> 	  Spencer
> 	
> 
> ###########################
> 
> 
>      I get "Wide character in print" from trying read.xls("22_data.xls") in the gdata package, with "22_data.xls" downloaded from "Varieties_Country_A-E.xls" at "http://www.reinhartandrogoff.com/data/browse-by-topic/topics/7/":
> 
> 
>> library(gdata)
>> read.xls("22_data.xls")
> Wide character in print at C:/Users/sgraves/pgms/R/R-2.15.2/library/gdata/perl/xls2csv.pl line 270.
>> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> 
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods base
> 
> other attached packages:
> [1] gdata_2.12.0
> 
> loaded via a namespace (and not attached):
> [1] gtools_2.7.0
> 
> 
>      I get the same message from xls2sep("22_data.xls").
> 
> 
>      It's only a comment, so I suppose I could ignore it.  However, it's generated by a function I'm adding to the Ecdat package, and I'd rather find a way to avoid it.  (I suppose I could dump it to sink, but that's pretty extreme and could mask other problems.)
> 
> 
>      Thanks,
>      Spencer
> 

Spencer,

The message is a warning coming from Perl, not from R, and from what I understand it is typically encountered when there are UTF-8/Unicode characters in the source. "Wide character" apparently refers to characters that need a multi-byte encoding.
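
As a rough illustration of what "multi-byte" means here, the following sketch compares character and byte counts on the R side. The two strings are made up for the example and are not taken from your file; it only shows that a copyright sign occupies more than one byte in UTF-8, and says nothing definitive about what xls2csv.pl does internally.

```r
# Two illustrative strings; the second contains a copyright sign (U+00A9).
x <- c("Plain header", "Copyright \u00A9 2010")

n_chars <- nchar(x, type = "chars")            # characters per string
n_bytes <- nchar(enc2utf8(x), type = "bytes")  # bytes once encoded as UTF-8

# Strings where the two counts differ contain multi-byte characters.
data.frame(n_chars, n_bytes, multibyte = n_bytes > n_chars)
```

A check along these lines, run over the header text of a worksheet, is one way to locate the characters that may be triggering the message.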

Having downloaded the Excel file you indicate above, my first reaction is that it is not really structured in a way that facilitates automated parsing to a CSV file (the intermediate step before using read.table()), which is then read into R as a data frame. The worksheets are not purely rows and columns of data, which is the typical application for read.xls().

There are lengthy header lines in the worksheets, some of which include copyright symbols, which is likely why you are getting the error from Perl. There are also embedded objects in the worksheets, which appear to be image crops of tables from a paper. I honestly don't know if read.xls() is set up to handle that stuff and you may need to contact the maintainers.

Given the above, I am not sure what I would recommend if your goal is to parse the raw data contained in the Excel worksheets and include them in a package. You may need to copy and paste the data ranges to the OS clipboard and read them into R from there, or consider using a different R package that has more flexibility in defining the specific Excel worksheet cell ranges that you want to extract.
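
For the clipboard route, a minimal sketch might look like the following. The helper name read_copied_range() is hypothetical, and the sketch assumes that a range copied from Excel arrives as tab-delimited text with a header row. On Windows, read.table() accepts "clipboard" as the file name; a text connection with placeholder values stands in for the clipboard below so that the example runs anywhere.

```r
# Hypothetical helper: read a tab-delimited block, such as a cell range
# copied from Excel, into a data frame.
read_copied_range <- function(src = "clipboard", header = TRUE) {
  # On Windows, src = "clipboard" makes read.table() read the clipboard text.
  read.table(src, header = header, sep = "\t", stringsAsFactors = FALSE)
}

# Stand-in for clipboard contents; these values are placeholders,
# not data from the workbook.
copied <- textConnection(c("col_a\tcol_b", "1\tx", "2\ty"))
dat <- read_copied_range(copied)
close(copied)

str(dat)
```

After copying a range in Excel on Windows, calling read_copied_range() with no arguments should read it directly. This bypasses the Perl conversion step altogether, at the cost of a manual step that is harder to reproduce inside a package.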

Others may have different ideas for you.

Regards,

Marc Schwartz


