[Rd] .Call and data frames
Kasper Daniel Hansen
khansen at stat.Berkeley.EDU
Thu Jun 22 18:21:27 CEST 2006
On Jun 22, 2006, at 9:04 AM, Dominick Samperi wrote:
> Thanks for the tips,
>
> This seems to work:
> First test for isReal and isInteger.
> If they fail, assume character/factor, and
>
> PROTECT(colData = coerceVector(colData, INTSXP)); // Not STRSXP
> SEXP names = getAttrib(colData, R_LevelsSymbol);
> // names now contains the string names I was looking for.
But of course be aware that colData now holds integer codes that index
into names (here I am guessing the C implementation mirrors what
happens in R), where you have
> test = data.frame(tmp = c("a","a","b"))
> levels(test$tmp)
[1] "a" "b"
That is, each level occurs only once in names, so you need to take
care of this remapping yourself, unless all you need is the set of
distinct values.
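To make the remapping concrete, here is a minimal sketch in plain C. It does not use the R headers: the arrays stand in for the integer codes and the levels vector, and expand_factor and MODEL_NA_INTEGER are hypothetical names of mine, not part of the R API. It assumes the codes are 1-based, as they are in R, and that a missing value is stored as INT_MIN, which is how R defines NA_INTEGER.

```c
#include <limits.h>
#include <stddef.h>

/* Stand-in for R's NA_INTEGER (assumption: R defines it as INT_MIN). */
#define MODEL_NA_INTEGER INT_MIN

/*
 * Hypothetical helper: expand factor codes into one string per row.
 * codes[i] is a 1-based index into levels[], as in an R factor.
 * out[i] is set to point into levels[] (nothing is copied);
 * NULL marks a missing value.
 * Returns 0 on success, -1 if a code falls outside 1..n_levels.
 */
static int expand_factor(const int *codes, size_t n,
                         const char *const *levels, size_t n_levels,
                         const char **out)
{
    for (size_t i = 0; i < n; i++) {
        if (codes[i] == MODEL_NA_INTEGER) {
            out[i] = NULL;               /* missing value */
            continue;
        }
        if (codes[i] < 1 || (size_t)codes[i] > n_levels)
            return -1;                   /* corrupt code */
        out[i] = levels[codes[i] - 1];   /* 1-based -> 0-based */
    }
    return 0;
}
```

With codes {1, 1, 2} and levels {"a", "b"}, which models the data frame above, out is filled with "a", "a", "b". In real .Call code the same loop would read the codes with INTEGER() and the levels with STRING_ELT().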
And thanks to Brian Ripley, who taught me something new about data
frames: it is indeed possible for them to hold character columns (and
this is in fact quite explicit in the man page for data.frame).
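Since a column of strings can therefore arrive either as a factor or as a plain character vector, the C side has to tell four cases apart. Below is a rough sketch of that decision in plain C, without the R headers. The enums, the struct and classify_column are all hypothetical names used for illustration only. In real code the corresponding checks would be isReal(), isFactor(), isInteger() and isString() from Rinternals.h; as far as I know, isInteger() returns false for a factor even though its storage type is INTSXP.

```c
#include <stdbool.h>

/* Simplified stand-ins for the R type tags REALSXP, INTSXP, STRSXP. */
typedef enum { MODEL_REAL, MODEL_INT, MODEL_STR, MODEL_OTHER } model_type;

/* What the C code should treat the column as. */
typedef enum {
    COL_REAL,
    COL_INTEGER,
    COL_FACTOR,       /* integer codes plus a levels vector */
    COL_CHARACTER,    /* plain strings, no remapping needed */
    COL_UNSUPPORTED
} column_kind;

/* Hypothetical description of one data frame column. */
typedef struct {
    model_type type;        /* storage type of the column */
    bool has_factor_class;  /* true if class(column) includes "factor" */
} model_column;

/* Decide how a column must be read, based on storage type and class. */
static column_kind classify_column(const model_column *col)
{
    switch (col->type) {
    case MODEL_REAL:
        return COL_REAL;
    case MODEL_INT:
        /* Same storage type; only the class attribute distinguishes them. */
        return col->has_factor_class ? COL_FACTOR : COL_INTEGER;
    case MODEL_STR:
        return COL_CHARACTER;
    default:
        return COL_UNSUPPORTED;
    }
}
```

Only the COL_FACTOR case needs the code-to-level remapping; a COL_CHARACTER column can be read element by element directly.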
/Kasper
> ds
>
> Hin-Tak Leung wrote:
>> Hin-Tak Leung wrote:
>>> I think you want
>>> else if (TYPEOF(colData) == STRSXP)
>>> ... instead.
>>>
>>> I don't know if this will convert from factors to strings,
>>> but somewhere it probably involves something like this:
>>> PROTECT(colData = coerceVector(colData, STRSXP));
>>
>> FWIW, a factor consists of all these things internally:
>> (1) TYPEOF(colData) is INTSXP
>> (2) attr(colData, "levels") exists and is a STRSXP type
>> (string representation for the levels).
>> (3) class(colData) = "factor"
>>
>> If coerceVector() doesn't do it, you can test for (3) in your C code,
>> and use the integer vector in (1) to index into the string vector
>> in (2) to regenerate the string vector manually.
>>
>> Not all of this is correct, just an idea:
>>
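>> klass = getAttrib(colData, R_ClassSymbol); /* "class" is reserved in C++ */
>> ...
>> if (.../* do some test on klass */...)
>> {
>>     levels = getAttrib(colData, R_LevelsSymbol);
>>     PROTECT(back_to_str = allocVector(STRSXP, LENGTH(colData)));
>>     for (int i = 0; i < LENGTH(colData); i++)
>>     {
>>         /* factor codes are 1-based; NA codes have no level */
>>         int code = INTEGER(colData)[i];
>>         /* levels already holds CHARSXPs, so no mkChar() is needed */
>>         SET_STRING_ELT(back_to_str, i,
>>             code == NA_INTEGER ? NA_STRING : STRING_ELT(levels, code - 1));
>>     }
>> }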
>>
>>>
>>> Dominick Samperi wrote:
>>>> Hello,
>>>>
>>>> I'm trying to fetch a data frame through the C API,
>>>> and have no problem doing this when all columns
>>>> are numbers, but when there is a column of
>>>> strings I have a problem. On the C-side the
>>>> function looks like:
>>>> SEXP myfunc(SEXP df),
>>>> and it is called with a dataframe from
>>>> the R side with:
>>>>
>>>> .Call("myfunc", somedataframe)
>>>>
>>>> On the C side (actually C++ side) I use code
>>>> like this:
>>>> SEXP colnames = getAttrib(df, R_NamesSymbol);
>>>> cname = string(CHAR(STRING_ELT(colnames, i)));
>>>> SEXP colData = VECTOR_ELT(df, i); // data for i-th column
>>>> if(isReal(colData))
>>>> x = REAL(colData)[j];
>>>> else if(isInteger(colData))
>>>> k = INTEGER(colData)[j]; // not i, which is the column index
>>>> else if(isString(colData))
>>>> s = CHAR(STRING_ELT(colData, j));
>>>>
>>>> The problem is that the last test (isString) never passes,
>>>> even when I pass in a frame for which one or more cols
>>>> contain character strings. When the column contains
>>>> strings the isVector(colData) test passes, but no matter
>>>> how I try to fetch the string data I get a seg fault. That
>>>> is, forcing CHAR(STRING_ELT(colData,j)) will
>>>> fault, and so will VECTOR_ELT(colData,0), even
>>>> though colData passes the isVector test.
>>>>
>>>> Any ideas?
>>>> Thanks,
>>>> ds
>>>>
>>>> ______________________________________________
>>>> R-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel