[Rd] C API to get numrow of data frame
Kevin Ushey
kevinushey at gmail.com
Tue Apr 1 04:51:37 CEST 2014
The safest way is to check the length of the row.names attribute, e.g.
length(getAttrib(df, R_RowNamesSymbol)).
This protects you both from data.frames with zero columns and from
corrupted data.frames containing columns of different lengths, since
the number of rows in a data.frame is defined by its row.names
attribute. However, R will internally un-collapse a collapsed
(compact) row.names on this getAttrib call, which is probably
undesirable for very large data.frames.
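To make that cost concrete, here is a rough stand-alone model of the expansion. It is not R's code: `expand_row_names` is a hypothetical name, the compact form is assumed to be the two-element integer pair {NA, -n} with NA modelled as INT_MIN, and expansion is assumed to materialize the sequence 1..n.

```c
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical model of un-collapsing compact row.names.
 * Given the pair {NA, -n} (NA modelled as INT_MIN), allocate and fill 1..n.
 * Returns NULL if the input is not compact, n is 0, or allocation fails.
 * The caller frees the result.
 */
int *expand_row_names(const int compact[2], size_t *out_len)
{
    *out_len = 0;
    if (compact[0] != INT_MIN)
        return NULL;                 /* not in compact form */

    /* Widen before negating so that -INT_MIN cannot overflow. */
    long long n = compact[1];
    if (n < 0)
        n = -n;
    if (n == 0 || n > INT_MAX)
        return NULL;

    int *rn = malloc((size_t) n * sizeof *rn);
    if (rn == NULL)
        return NULL;

    /* This loop is the point: O(n) time and memory just to learn n. */
    for (long long i = 0; i < n; i++)
        rn[i] = (int) (i + 1);

    *out_len = (size_t) n;
    return rn;
}
```

In this model, a data.frame with a hundred million rows would need a few hundred megabytes of scratch memory only to have its length taken.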
One way around this is to call the R function .row_names_info from C,
e.g. (modulo my errors):
int df_nrows(SEXP s) {
    if (!Rf_inherits(s, "data.frame")) Rf_error("expecting a data.frame");
    /* type = 2L asks .row_names_info for the number of rows, abs(n) */
    SEXP two = PROTECT(Rf_ScalarInteger(2));
    SEXP call = PROTECT( Rf_lang3(
        Rf_install(".row_names_info"),
        s,
        two
    ) );
    SEXP result = PROTECT(Rf_eval(call, R_BaseEnv));
    int output = INTEGER(result)[0];
    UNPROTECT(3);
    return output;
}
Perhaps better (?), such a function could be added to util.c and
exported by R, e.g. (again, modulo my errors):
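int df_nrows(SEXP s) {
    if (!inherits(s, "data.frame")) error("expecting a data.frame");
    /* getAttrib0 is internal to R; it does not expand compact row.names */
    SEXP t = getAttrib0(s, R_RowNamesSymbol);
    /* check the length before reading INTEGER(t)[0] */
    if (isInteger(t) && LENGTH(t) == 2 && INTEGER(t)[0] == NA_INTEGER)
        return abs(INTEGER(t)[1]);
    else
        return LENGTH(t);
}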
or even incorporated into the existing 'nrows' function, although
there is probably someone out there depending on 'nrows' returning
the number of columns for their data.frame...
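For anyone who wants to experiment with the compact-form check without building R, the same logic can be sketched in plain C. This is only a model under stated assumptions: `row_names_nrow` is a hypothetical name, the row.names attribute is passed as a bare int array rather than a SEXP, and NA_INTEGER is taken to be INT_MIN as in R's headers.

```c
#include <limits.h>
#include <stddef.h>

/* Assumption: stands in for R's NA_INTEGER, which R defines as INT_MIN. */
#define MODEL_NA_INTEGER INT_MIN

/*
 * Hypothetical helper: number of rows implied by an integer row.names
 * attribute, given as a plain array `rn` holding `len` elements.
 */
long long row_names_nrow(const int *rn, size_t len)
{
    /* Test the length first so rn[0] is never read from an empty array. */
    if (len == 2 && rn[0] == MODEL_NA_INTEGER) {
        /* Compact form {NA, n}: the row count is abs(n). Widen before
           negating so that -INT_MIN cannot overflow. */
        long long n = rn[1];
        return n < 0 ? -n : n;
    }
    /* Expanded form: one entry per row. */
    return (long long) len;
}
```

With the pair {NA, -5} this gives 5, while an ordinary two-row data.frame with row.names {1, 2} still gives 2, because its first element is not NA.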
Cheers,
Kevin
On Mon, Mar 31, 2014 at 6:27 PM, Murray Stokely <murray at stokely.org> wrote:
> I didn't look at the names because I believe that would be incorrect
> if the row names were stored internally in the compact form.
>
> See ?.set_row_names (hat tip, Tim Hesterberg who showed me this years ago) :
>
> 'row.names' can be stored internally in compact form.
> '.set_row_names(n)' generates that form for automatic row names of
> length 'n', to be assigned to 'attr(<a data frame>, "row.names")'.
> '.row_names_info' gives information on the internal form of the
> row names for a data frame: for details of what information see
> the argument 'type'.
>
> The function I wrote obviously doesn't work for 0-row or 0-column
> data.frames; you need to check for that.
>
> On Mon, Mar 31, 2014 at 6:12 PM, Gábor Csárdi <csardi.gabor at gmail.com> wrote:
>> I think it is actually better to check the length of the row names, in case
>> the data frame has zero columns. (FIXME, of course.)
>>
>> Gabor
>>
>>
>> On Mon, Mar 31, 2014 at 8:04 PM, Murray Stokely <murray at stokely.org> wrote:
>>>
>>> The simplest case would be:
>>>
>>> int num_rows = Rf_length(VECTOR_ELT(dataframe, 0));
>>> int num_columns = Rf_length(dataframe);
>>>
>>> There may be edge cases for which this doesn't work; would need to
>>> look into how the dim primitive is implemented to be sure.
>>>
>>> - Murray
>>>
>>>
>>> On Mon, Mar 31, 2014 at 4:40 PM, Sandip Nandi <sannandi at umail.iu.edu>
>>> wrote:
>>> > Hi,
>>> >
>>> > Is there any C API equivalent to the R function nrow for a data frame?
>>> >
>>> > x<- data.frame()
>>> > n<- nrow(x)
>>> > print(n)
>>> > 0
>>> >
>>> >
>>> > Example:
>>> > My C function, which deals with a data frame, looks like the one below. I
>>> > don't want to pass in the number of rows of the data frame; I want to
>>> > detect it inside the function itself. My function takes a data frame as
>>> > its argument and does some work on it. I want an API equivalent to nrow.
>>> > I tried Rf_nrows and Rf_ncols, but they were not much help.
>>> >
>>> > SEXP writeRR(SEXP dataframe) {
>>> >
>>> > }
>>> >
>>> >
>>> > Any help is very appreciated.
>>> >
>>> > Thanks,
>>> > Sandip
>>> >
>>> > ______________________________________________
>>> > R-devel at r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-devel