[Rd] .Call and data frames

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Jun 22 08:44:03 CEST 2006


On Wed, 21 Jun 2006, Kasper Daniel Hansen wrote:

> While I do not know how to handle this on the C level, I know that
> you do not have characters in data frames, everything is factors
> instead.

Not so.  The default in data.frame() is to convert character vectors to
factors, but there are many ways to have character vectors in data frames,
and this will become more common in 2.4.0 and later.

I suspect that this may well be Dominick's problem, though.

isVector is just a test of being one of the several types of vectors, so
passing it says nothing about which accessor is valid.  VECTOR_ELT is only
appropriate for a VECSXP (an R-level list), and for this sort of thing it
is much safer and cleaner to test TYPEOF.
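
To make the factor case concrete: a factor is an integer vector (TYPEOF is
INTSXP) carrying a "levels" attribute, so isString() fails on it, and in
current R sources isInteger() also excludes factors, which would explain
why none of the three branches in the posted code fires.  The sketch below
shows only the decoding step, in plain C so that it compiles without the R
headers.  factor_label and SKETCH_NA_INTEGER are hypothetical names, not
part of the R API; inside real .Call code the codes would come from
INTEGER(colData) and the labels from getAttrib(colData, R_LevelsSymbol)
via CHAR(STRING_ELT(...)).

```c
#include <limits.h>
#include <stddef.h>

/* Stand-in for R's NA_INTEGER, which is INT_MIN. */
#define SKETCH_NA_INTEGER INT_MIN

/*
 * Hypothetical helper (not part of the R API): return the label for
 * element j of a factor, or NULL for NA or an out-of-range code.
 *
 * codes  : the factor's integer codes, 1-based; in .Call code these
 *          would be read with INTEGER(colData)
 * levels : the labels; in .Call code each one would be read with
 *          CHAR(STRING_ELT(getAttrib(colData, R_LevelsSymbol), code - 1))
 */
static const char *factor_label(const int *codes, size_t j,
                                const char *const *levels, size_t nlevels)
{
    int code = codes[j];

    if (code == SKETCH_NA_INTEGER || code < 1 || (size_t)code > nlevels)
        return NULL;

    return levels[code - 1];  /* codes are 1-based, C arrays are 0-based */
}
```

With codes {2, 1} and levels {"a", "b"}, element 0 decodes to "b".  The
NULL return is just one way to signal NA; real code might prefer to
return NA_STRING instead.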

> Internally they are coded as a number of integer levels,
> with the levels having labels (which is the character you see). So eg
> (in R):
>
> > test <- data.frame(tmp = letters[1:10])
> > test
>    tmp
> 1    a
> 2    b
> 3    c
> 4    d
> 5    e
> 6    f
> 7    g
> 8    h
> 9    i
> 10   j
> > is.character(test$tmp)
> [1] FALSE
> > as.numeric(test$tmp) # The internal code of the factor
> [1]  1  2  3  4  5  6  7  8  9 10
> > levels(test$tmp) # the translation from internal code to actual label
> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
>
> You probably need to convert the factor to a character vector. I do
> not know how to do that in C off the top of my head, but it is probably
> not that difficult. At least now you should have some idea of where to
> look.
>
> /Kasper
>
>
> On Jun 21, 2006, at 10:07 PM, Dominick Samperi wrote:
>
>> Hello,
>>
>> I'm trying to fetch a data frame through the C API,
>> and have no problem doing this when all columns
>> are numbers, but when there is a column of
>> strings I have a problem. On the C-side the
>> function looks like:
>> SEXP myfunc(SEXP df),
>> and it is called with a dataframe from
>> the R side with:
>>
>> .Call("myfunc", somedataframe)
>>
>> On the C side (actually C++ side) I use code
>> like this:
>> SEXP colnames = getAttrib(df, R_NamesSymbol);
>> cname = string(CHAR(STRING_ELT(colnames, i)));
>> SEXP colData = VECTOR_ELT(df, i);  // data for i-th column
>> if(isReal(colData))
>>     x = REAL(colData)[j];
>> else if(isInteger(colData))
>>     n = INTEGER(colData)[j];  // not i, which indexes the column
>> else if(isString(colData))
>>     s = CHAR(STRING_ELT(colData, j));
>>
>> The problem is that the last test (isString) never passes,
>> even when I pass in a frame for which one or more cols
>> contain character strings. When the column contains
>> strings the isVector(colData) test passes, but no matter
>> how I try to fetch the string data I get a seg fault. That
>> is, forcing CHAR(STRING_ELT(colData,j)) will
>> fault, and so will VECTOR_ELT(colData,0), even
>> though colData passes the isVector test.
>>
>> Any ideas?
>> Thanks,
>> ds
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


