[R] Building ragged dataframe (was Re: Data Gaps)
Keith Jewell
k.jewell at campden.co.uk
Thu Oct 14 13:37:13 CEST 2010
> "Dennis Murphy" <djmuser at gmail.com> wrote in message
> news:AANLkTinrAuFrh71J9WpNfZVg_8JAYACkw1e8CPn=cZ2g at mail.gmail.com...
> Hi:
>
> The essential problem is that after you append items, the result is a list
> with possibly unequal lengths. Trying to convert that into a data frame by
> the 'usual' methods (do.call(rbind, ...) or ldply() in plyr) didn't work
> (as anticipated). One approach is to initialize a maximum size matrix with
> NAs and then replace the NAs by the contents of each component of the list.
> The following function is far from elegant, but the idea is to output a
> data frame by 'NA filling' the shorter vectors in the list and doing the
> equivalent of do.call(rbind, list).
>
> listNAfill <- function(l) {
>   # input argument l is a list with numeric component vectors
>   lens <- sapply(l, length)
>   m <- matrix(NA, nrow = length(l), ncol = max(lens))
>   # fill the first lens[i] cells of row i; seq_len() also copes with
>   # zero-length components, where 1:0 would index the wrong cells
>   for (i in seq_along(l)) m[i, seq_len(lens[i])] <- l[[i]]
>   as.data.frame(m)
> }
<snip>
Leaping in on a detail, and apologising in advance if my comment isn't
relevant, I had need to build up a data frame from different length columns
(a ragged dataframe?). In case it helps, this function is what I ended up
with...
---------------
# utility function, combining different length objects into a dataframe
# padding short columns with NA
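CDF <- function(x, y) {
  # outer join on row names, so the shorter column is padded with NA
  out <- merge(data.frame(x), data.frame(y), all = TRUE, by = "row.names")
  # make row names integer so that they sort numerically rather than as text
  out$Row.names <- as.integer(out$Row.names)
  # sort into the original order, drop the key column, renumber the rows
  data.frame(out[order(out$Row.names), -1], row.names = seq_len(nrow(out)))
}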
----------------
I guess it could reasonably easily be extended to more than two arguments.
This met my needs, and I didn't look very hard for alternatives, so there
may be better approaches.
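A minimal sketch of one way the extension to more than two arguments might
look. It swaps the merge on row names for padding each vector with NA via
`length<-`, so it is a different technique from CDF above. The name CDFn and
the default column names V1, V2, ... are hypothetical, and it assumes the
inputs are plain atomic vectors.

```r
# Hypothetical helper (not from the original post): combine any number of
# different-length vectors into a data frame, padding short ones with NA.
CDFn <- function(...) {
  cols <- list(...)
  n <- max(vapply(cols, length, integer(1)))  # length of the longest input

  # extending a vector with `length<-` fills the new elements with NA
  padded <- lapply(cols, function(v) {
    length(v) <- n
    v
  })

  # give unnamed arguments positional names V1, V2, ... (an assumption)
  nms <- names(cols)
  if (is.null(nms)) nms <- rep("", length(cols))
  blank <- nms == ""
  nms[blank] <- paste0("V", seq_along(cols))[blank]
  names(padded) <- nms

  as.data.frame(padded, stringsAsFactors = FALSE)
}

# usage: three columns of lengths 4, 2 and 1
ragged <- CDFn(x = 1:4, y = c("a", "b"), z = 1.5)
```

Unlike the merge-based version, this never sorts, so the rows keep their
original order without a renumbering step.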
HTH
Keith J