[Rd] wchar and wstring. (followup question)

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Aug 30 08:39:23 CEST 2005


On Mon, 29 Aug 2005, James Bullard wrote:

> Thanks for all of the help with my previous posts. This question might
> expose much of my ignorance when it comes to R's memory management and
> the responsibilities of the programmer, but I thought I had better ask
> rather than continue in my ignorance.
>
> I use the following code to create a multi-byte string in R from my wide
> character string in C.
>
> int str_length;
> char* cstr;
>
> str_length = cel.GetAlg().size();
> cstr = Calloc(str_length, char);
> wcstombs(cstr, cel.GetAlg().c_str(), str_length);
> SET_STRING_ELT(names, i, mkChar("Algorithm"));
> SET_VECTOR_ELT(vals, i++, mkString(cstr));
> Free(cstr);
>
> My first question is: do I need the Free? I looked at some of the
> examples in main/character.c, but I could not decide whether or not I
> needed it. I imagined (I could not find the source for this function)
> that mkString made a copy so I thought I would clean up my copy, but if
> this is not the case then I would assume the Free would be wrong.

Yes, mkString copies the information (via mkChar) and the Free is needed.
The source is in gram.c.

Whenever you call Calloc you need to call Free or you will have a memory 
leak.
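
A side note on the quoted conversion code: a buffer of str_length bytes 
leaves no room for the terminating NUL, and in a multi-byte locale one wide 
character can need up to MB_CUR_MAX bytes.  Below is a minimal sketch of a 
more defensive conversion, written in plain C so that it stands alone. 
wide_to_mb is a hypothetical helper name, not part of R; in package code 
Calloc/Free could take the place of malloc/free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of R): convert a wide string to a newly
   allocated multibyte string in the current locale.
   Returns NULL on allocation or conversion failure.
   The caller owns the result and must free() it. */
static char *wide_to_mb(const wchar_t *ws)
{
    assert(ws != NULL);

    /* Worst case: every wide character needs MB_CUR_MAX bytes,
       plus one byte for the terminating NUL. */
    size_t cap = wcslen(ws) * MB_CUR_MAX + 1;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* wcstombs returns the number of bytes written, excluding the NUL,
       or (size_t) -1 if a character cannot be represented. */
    size_t n = wcstombs(buf, ws, cap);
    if (n == (size_t) -1) {
        free(buf);
        return NULL;
    }

    buf[n] = '\0';  /* always in bounds, since n <= cap - 1 */
    return buf;
}
```

The result can then be handed to mkString or mkChar, which copy it, and 
released afterwards.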

> My second question is: It was pointed out to me that it would be more
> natural to use this code:
>
> SET_STRING_ELT(vals, i++, mkChar(header.GetHeader().c_str()));
>
> instead of:
>
> SET_VECTOR_ELT(vals, i++, mkString(header.GetHeader().c_str()));
>
> However, the first line creates the following list element in R:

Nope, it creates an element of a character vector.  The point was that it 
was more natural to use character vectors than lists of single-element 
character vectors.

> <CHARSXP: "Percentile">
>
> Whereas, I want it to create as the list element:
>
> "Percentile"
>
> Which the second example does correctly. I had previously posted about
> this problem and I believe that I was advised to use the second syntax,
> but maybe there is a different problem in my code. I am trying to
> construct a named list in R where my first line SET_STRING_ELT sets the
> name of the list element and the second sets the value where the value
> can be an INTEGER, STRING or whatever.

Well, STRING etc. are macros, not types, but if you want a list of mixed 
types for use at R-level, SET_VECTOR_ELT is fine.

> My third question is simply: why is wcrtomb preferred? The example I
> based my code on in main/character.c used wcstombs.

It is preferred to wctomb, not to wcstombs.  Please re-read what I wrote, 
and the relevant man pages.
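
To illustrate the distinction: wctomb and wcrtomb each convert a single 
wide character, and wcrtomb takes the conversion state as an explicit 
mbstate_t argument instead of keeping it in hidden static storage.  A 
minimal sketch in plain C follows; wc_to_mb is a hypothetical helper name 
used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: write the multibyte form of wc into out, which
   must hold at least MB_LEN_MAX + 1 bytes, and NUL-terminate it.
   Returns the number of bytes written (excluding the added NUL),
   or (size_t) -1 if wc cannot be represented in the current locale. */
static size_t wc_to_mb(wchar_t wc, char *out)
{
    assert(out != NULL);

    mbstate_t st;
    memset(&st, 0, sizeof st);      /* start from the initial state */

    size_t n = wcrtomb(out, wc, &st);
    if (n != (size_t) -1)
        out[n] = '\0';
    return n;
}
```

wcstombs, by contrast, converts a whole NUL-terminated wide string in one 
call, which is why it suits the code in the question.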


>
> Thanks again for all of the help.
>
> jim
>
>
> Prof Brian Ripley wrote:
>
>> On Fri, 26 Aug 2005, James Bullard wrote:
>>
>>> Hello all, I am writing an R interface to some C++ files which make use
>>> of std::wstring classes for internationalization. Previously (when I
>>> wanted to make R strings from C++ std::strings), I would do something
>>> like this to construct a string in R from the results of the parse.
>>>
>>> SET_VECTOR_ELT(vals, i++, mkString(header.GetHeader().c_str()));
>>
>>
>> That creates a list of one-element character vectors.  It would be more
>> usual to do
>>
>>   SET_STRING_ELT(vals, i++, mkChar(header.GetHeader().c_str()));
>>
>>> However, now the call header.GetHeader().c_str() returns a pointer to
>>> an array of wchar_t's. I was going to use wcstombs() to convert the
>>> wchar_t* to char*, but I wanted to see if there was a similar
>>> function in R for the mkString function which I had initially used
>>> which deals with wchar_ts as opposed to chars.
>>
>>
>> No (nor an analogue of mkChar).  R uses MBCS and not wchar_t
>> internally (and Unix-alike systems do externally).  There is no
>> wchar_t internal R type (a much-debated design decision at the time).
>>
>>> Also, since I have no experience with the wctombs() function I wanted
>>> to ask if anyone knew if this will handle the internationalization
>>> issues from within R.
>>
>>
>> Did you mean wcstombs or wctomb (if the latter, wcrtomb is preferred)?
>> There are tens of examples in the R sources for you to consult.
>>
>> Note that not all R platforms support wchar_t, hence this code is
>> surrounded by #ifdef SUPPORT_MBCS macros (exported in Rconfig.h for
>> package writers).
>>
>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


