[Rd] Using unicode from C interface of R
Duncan Murdoch
murdoch.duncan at gmail.com
Wed Jan 22 01:08:47 CET 2014
On 14-01-21 5:41 PM, Sandip Nandi wrote:
> Hi,
>
> I am using the C interface of R. If a Unicode string is read, in what format
> can I pass it back to R?
> I was trying to use the following:
>
> tpStr = ( char *)val;
> SET_STRING_ELT(innerList , 0, mkChar(tpStr));
>
> It does not work.
>
> If I pass it back to R in RAW format, what package is there to read
> it? I mean a package for interpreting RAW data.
There are a number of encodings for Unicode. Most Unix systems use
UTF-8, Windows uses UTF-16 for some things, etc.
If your string is known to be in UTF-8, that's easiest: just use
mkCharCE with CE_UTF8 instead of mkChar, as described in Writing R
Extensions. If it is in UTF-16 you might have more trouble because of
possible embedded 0 bytes, which mkChar would treat as the end of the
string. Translate to UTF-8 first using C facilities like
WideCharToMultiByte (on Windows).
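
As a rough illustration of the translate-first route, here is a minimal
portable sketch of a UTF-16 to UTF-8 conversion in plain C. The helper
name utf16_to_utf8 and its error convention are assumptions made for
this example; they are not part of R's API or of the Windows API. On
Windows, WideCharToMultiByte would normally do this job instead. The
resulting buffer is what one would then hand to mkCharCE with CE_UTF8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of R): convert `len` UTF-16 code units
 * to a NUL-terminated UTF-8 string in `out` (capacity `cap` bytes).
 * Returns the number of bytes written, excluding the terminator, or
 * (size_t)-1 on an unpaired surrogate or a too-small buffer. */
size_t utf16_to_utf8(const uint16_t *in, size_t len, char *out, size_t cap)
{
    assert(in != NULL && out != NULL);
    if (cap == 0)
        return (size_t)-1;

    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 >= len || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1; /* low surrogate without a high one */
        }

        /* Number of UTF-8 bytes needed for this code point. */
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + need + 1 > cap)
            return (size_t)-1; /* keep room for the terminator */

        if (need == 1) {
            out[n++] = (char)cp;
        } else if (need == 2) {
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[n++] = (char)(0xE0 | (cp >> 12));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[n++] = (char)(0xF0 | (cp >> 18));
            out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}

/* Inside R extension code (with Rinternals.h included), the converted
 * buffer could then replace the mkChar call from the question:
 *
 *     SET_STRING_ELT(innerList, 0, mkCharCE(out, CE_UTF8));
 */
```

Note that the sketch needs the length of the UTF-16 input in code units
to be known up front, since scanning for a 0 byte is exactly what fails
on UTF-16 data.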
Duncan Murdoch