[R] Please guide -- UTF-8 locale setting fails on Windows on writing
Sunny Singha
sunnysingha.analytics at gmail.com
Mon Mar 28 15:46:36 CEST 2016
Hi,
I think I'm experiencing an issue related to the system locale. I have
exported data frames as '.csv' files; the data were gathered from
various social media platforms such as Facebook, Twitter, G+, etc.
I observe that many variables/columns consist of strings formatted like the ones below:
"<U+0645><U+062D><U+0645><U+062F>
<U+0627><U+0644><U+0633><U+0648><U+0627><U+062D>"
As expected for social media data, and as I have confirmed, these are
strings in different languages.
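To make the pattern concrete: each <U+XXXX> token looks like an escaped Unicode code point (U+0645, for instance, is an Arabic letter). Below is a rough sketch in R, not a fix, that turns such tokens back into characters, assuming the escapes are stored literally in the exported text. The helper name unescape_u is made up for illustration.

```r
# Hypothetical helper: replace literal "<U+XXXX>" tokens with the
# characters they stand for. Assumes the tokens are plain text in x.
unescape_u <- function(x) {
  m <- gregexpr("<U\\+[0-9A-Fa-f]{4,6}>", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(tokens) {
    # Drop the leading "<U+" and the trailing ">" to keep the hex digits.
    hex <- substr(tokens, 4L, nchar(tokens) - 1L)
    vapply(hex, function(h) intToUtf8(strtoi(h, 16L)),
           character(1), USE.NAMES = FALSE)
  })
  x
}

escaped <- "<U+0645><U+062D><U+0645><U+062F>"
decoded <- unescape_u(escaped)

# Show the code points of the result, which does not depend on how
# the console renders non-ASCII characters.
print(utf8ToInt(decoded))
```

Printing the code points rather than the string itself keeps the check independent of the console's own encoding.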
Platform details are provided at the end of this mail. The OS locale is
set to English (United States), hence the R locale is 'English_United
States.1252'.
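The locale R is actually running under can be checked along the lines of the sketch below. It only reads the settings and changes nothing, and the printed values depend on the machine.

```r
# Report the character-type locale of the current R session.
ctype <- Sys.getlocale("LC_CTYPE")
print(ctype)

# l10n_info() reports whether the session is multibyte, UTF-8 or Latin-1.
info <- l10n_info()
print(info[["UTF-8"]])
```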
I have attempted to change the locale to UTF-8 but receive the warning message below:
Warning message:
In Sys.setlocale("LC_ALL", "UTF-8") :
OS reports request to set locale to "UTF-8" cannot be honored
I have gone through the forums below but have found no resolution so far:
--- http://stackoverflow.com/questions/20571147/how-to-set-unicode-locale-in-r
--- https://stat.ethz.ch/pipermail/r-devel/2013-November/067940.html
--- http://stackoverflow.com/questions/19877676/write-utf-8-files-from-r
--- https://tomizonor.wordpress.com/2013/04/17/file-utf8-windows/
--- http://withr.me/configure-character-encoding-for-r-under-linux-and-windows/
I'm not sure whether the issue arises while reading/extracting the data
from the social media platforms or while writing/exporting to a Windows
directory, but I don't experience a similar issue on my personal Mac.
I need some clarification here.
How can I export the data just as I see it on the web? Please guide.
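The desired result could be sketched as follows. This is only an illustration under assumptions, not a confirmed fix: the sample string and the temporary file are made up, and passing UTF-8 bytes through unchanged with enc2utf8() and useBytes = TRUE is one approach that may be worth testing, not something verified on the machine described here.

```r
# Made-up sample value: four Arabic letters given as Unicode escapes.
x <- "\u0645\u062d\u0645\u062f"

# Temporary file used only for this illustration.
path <- tempfile(fileext = ".txt")

# Convert to UTF-8 and ask writeLines() not to re-encode the bytes
# into the native encoding (for example code page 1252) on output.
con <- file(path, open = "w")
writeLines(enc2utf8(x), con, useBytes = TRUE)
close(con)

# Read the raw bytes back to see what actually landed on disk.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
print(bytes)
```

Whether the same holds for write.csv() on a whole data frame is a separate question that this sketch does not settle.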
Regards,
Sunny
Platform I'm using:
Operating System: Windows 7 Professional SP1
R version details:
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          3
minor          2.3
year           2015
month          12
day            10
svn rev        69752
language       R
version.string R version 3.2.3 (2015-12-10)
nickname       Wooden Christmas-Tree