[R] How to split a data.frame into its columns?
David Winsemius
dwinsemius at comcast.net
Mon Aug 29 08:27:32 CEST 2016
> On Aug 28, 2016, at 11:14 PM, Marius Hofert <marius.hofert at uwaterloo.ca> wrote:
>
> Hi,
>
> I need a fast way to split a data.frame (and matrix) into a list of
> columns.
This is a bit of a puzzle, since data.frame objects are by definition "lists of columns".
If you want a data.frame object (say its name is dat) to be _only_ a list of columns, then:
  dat <- unclass(dat)
The split.data.frame function splits by rows, since that is the most desired and expected behavior, and because the authors of S/R probably thought there was no point in making the split "by columns" when a data.frame already is a list of columns.
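
To make the point concrete, here is a minimal base-R sketch (not part of the original exchange) using the iris data from the question. The object names target, cols_unclass, cols_list, m and m_cols are only illustrative. It assumes the goal is an unnamed list of columns, as produced by the lapply() call in the question. Note that unclass() keeps the names and row.names attributes, whereas as.list() drops row.names, and unname() then drops the column names.

```r
# Sketch only: base R, no additional packages.
# The result the question asks for: an unnamed list of the columns.
target <- lapply(seq_len(ncol(iris)), function(j) iris[, j])

# unclass() removes the "data.frame" class; what is left is a list of
# columns that still carries the names and row.names attributes.
cols_unclass <- unclass(iris)

# as.list() on a data.frame also drops row.names; unname() then drops
# the column names so that the result can be compared with 'target'.
cols_list <- unname(as.list(iris))
identical(cols_list, target)

# Matrix case from the question: split(x, col(x)) versus an explicit
# loop over the columns (the names "1", "2", ... are dropped by unname()).
m <- matrix(1:6, nrow = 2)
m_cols <- lapply(seq_len(ncol(m)), function(j) m[, j])
identical(unname(split(m, col(m))), m_cols)
```

If the column names are wanted in the result, as.list(dat) without unname() may be the more convenient form.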
--
David.
> For matrices, split(x, col(x)) works (which can then be done
> in C for speed-up, if necessary), but for a data.frame? split(iris,
> col(iris)) does not work as expected (?).
> The outcome should be lapply(seq_len(ncol(iris)), function(j)
> iris[,j]) and not require additional packages (if possible).
>
> Thanks & cheers,
> Marius
>
> PS: Below is the C code for matrices. Not sure how easy it would be to
> extend that to data.frames (?)
>
> SEXP col_split(SEXP x)
> {
>     /* Setup */
>     int *dims = INTEGER(getAttrib(x, R_DimSymbol));
>     int n = dims[0], d = dims[1];
>     SEXP res = PROTECT(allocVector(VECSXP, d));
>     SEXP ref;
>     int i = 0, j, k;
>
>     /* Distinguish int/real matrices */
>     switch (TYPEOF(x)) {
>     case INTSXP:
>         for(j = 0; j < d; j++) {
>             SET_VECTOR_ELT(res, j, allocVector(INTSXP, n));
>             int *e = INTEGER(VECTOR_ELT(res, j));
>             for(k = 0 ; k < n ; i++, k++) {
>                 e[k] = INTEGER(x)[i];
>             }
>         }
>         break;
>     case REALSXP:
>         for(j = 0; j < d; j++) {
>             SET_VECTOR_ELT(res, j, allocVector(REALSXP, n));
>             double *e = REAL(VECTOR_ELT(res, j));
>             for(k = 0 ; k < n ; i++, k++) {
>                 e[k] = REAL(x)[i];
>             }
>         }
>         break;
>     case LGLSXP:
>         for(j = 0; j < d; j++) {
>             SET_VECTOR_ELT(res, j, allocVector(LGLSXP, n));
>             int *e = LOGICAL(VECTOR_ELT(res, j));
>             for(k = 0 ; k < n ; i++, k++) {
>                 e[k] = LOGICAL(x)[i];
>             }
>         }
>         break;
>     case STRSXP:
>         for(j = 0; j < d; j++) {
>             ref = allocVector(STRSXP, n);
>             SET_VECTOR_ELT(res, j, ref);
>             ref = VECTOR_ELT(res, j);
>             for(k = 0 ; k < n ; i++, k++) {
>                 SET_STRING_ELT(ref, k, STRING_ELT(x, i));
>             }
>         }
>         break;
>     default: error("Wrong type of 'x': %s", CHAR(type2str_nowarn(TYPEOF(x))));
>     }
>
>     /* Return */
>     UNPROTECT(1);
>     return(res);
> }
>
David Winsemius
Alameda, CA, USA