[R] How to split a data.frame into its columns?

Marius Hofert marius.hofert at uwaterloo.ca
Mon Aug 29 08:14:20 CEST 2016


Hi,

I need a fast way to split a data.frame (and a matrix) into a list of
its columns. For matrices, split(x, col(x)) works (and can be
reimplemented in C for speed, if necessary), but what about a
data.frame? split(iris, col(iris)) does not give the expected result.
The outcome should equal lapply(seq_len(ncol(iris)), function(j)
iris[,j]) and, if possible, not require additional packages.

Thanks & cheers,
Marius

PS: Below is the C code for matrices. Not sure how easy it would be to
extend that to data.frames (?)

#include <R.h>
#include <Rinternals.h>

/* Split a matrix x into a list of its columns */
SEXP col_split(SEXP x)
{
    /* Setup */
    int *dims = INTEGER(getAttrib(x, R_DimSymbol));
    int n = dims[0], d = dims[1]; /* number of rows and columns */
    SEXP res = PROTECT(allocVector(VECSXP, d));
    SEXP ref;
    int i = 0, j, k; /* i runs over all elements of x (column-major) */

    /* Distinguish int/real/logical/character matrices */
    switch (TYPEOF(x)) {
    case INTSXP:
        for(j = 0; j < d; j++) {
            SET_VECTOR_ELT(res, j, allocVector(INTSXP, n));
            int *e = INTEGER(VECTOR_ELT(res, j));
            for(k = 0; k < n; i++, k++) {
                e[k] = INTEGER(x)[i];
            }
        }
        break;
    case REALSXP:
        for(j = 0; j < d; j++) {
            SET_VECTOR_ELT(res, j, allocVector(REALSXP, n));
            double *e = REAL(VECTOR_ELT(res, j));
            for(k = 0; k < n; i++, k++) {
                e[k] = REAL(x)[i];
            }
        }
        break;
    case LGLSXP:
        for(j = 0; j < d; j++) {
            SET_VECTOR_ELT(res, j, allocVector(LGLSXP, n));
            int *e = LOGICAL(VECTOR_ELT(res, j));
            for(k = 0; k < n; i++, k++) {
                e[k] = LOGICAL(x)[i];
            }
        }
        break;
    case STRSXP:
        for(j = 0; j < d; j++) {
            ref = allocVector(STRSXP, n);
            SET_VECTOR_ELT(res, j, ref); /* ref is now protected via res */
            for(k = 0; k < n; i++, k++) {
                SET_STRING_ELT(ref, k, STRING_ELT(x, i));
            }
        }
        break;
    default:
        error("Wrong type of 'x': %s", CHAR(type2str_nowarn(TYPEOF(x))));
    }

    /* Return */
    UNPROTECT(1);
    return(res);
}
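
For reference, the column-major indexing that the loops above rely on can be sketched in plain C, without the R API. This is only an illustration under assumptions: the names split_columns and free_columns are hypothetical helpers (not part of R), the input is a dense n-by-d matrix of doubles stored column by column, and n and d are both positive.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the R API): split a column-major
 * n-by-d matrix of doubles into d separately allocated columns.
 * Assumes n > 0 and d > 0. Returns NULL if an allocation fails. */
double **split_columns(const double *x, size_t n, size_t d)
{
    double **cols = malloc(d * sizeof *cols);
    if (cols == NULL) return NULL;
    for (size_t j = 0; j < d; j++) {
        cols[j] = malloc(n * sizeof **cols);
        if (cols[j] == NULL) {
            /* Release the columns allocated so far, then give up */
            while (j > 0) free(cols[--j]);
            free(cols);
            return NULL;
        }
        /* Column j is the contiguous block x[j*n], ..., x[j*n + n - 1] */
        memcpy(cols[j], x + j * n, n * sizeof **cols);
    }
    return cols;
}

/* Hypothetical helper: free the result of split_columns() */
void free_columns(double **cols, size_t d)
{
    if (cols == NULL) return;
    for (size_t j = 0; j < d; j++) free(cols[j]);
    free(cols);
}
```

Because each column is contiguous in column-major storage, one memcpy per column is enough here. The R version above copies element by element with a running index i, which has the same effect.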


