[R] confirming behavior of "by"
Daryl Morris
darylm at uw.edu
Tue Sep 28 21:54:19 CEST 2010
Hi,
I'm using "by" to summarize by multiple groups, and want to extract the
returned object into a tidy data frame. I'm trying to find a simple way to
name the rows of the data frame. I'd like it to be something like
index1.val1.index2.val2 where the index1 and index2 are the names of the
indices and the val1 & val2 are names of possible values of the index.
(the calling function will do a bit more processing)
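To make the naming scheme concrete, here is a minimal sketch of the kind of output I'm after. The data frame, the index names g1 and g2, and the column name "value" are made up for illustration. It assumes the function passed to "by" returns a single number per group, so the result simplifies to a plain array.

```r
# Hypothetical data: g1 and g2 are the grouping indices, x is the value.
df <- data.frame(
  g1 = c("b", "a", "b", "a"),
  g2 = c("y", "y", "x", "x"),
  x  = c(1, 2, 3, 4)
)

# One summary value per (g1, g2) combination.
res <- by(df, df[c("g1", "g2")], function(d) sum(d$x))

# dimnames() gives a named list: one character vector of values per index.
dn <- dimnames(res)

# expand.grid() varies the first index fastest, which matches the
# column-major layout of the array underlying res.
grid <- expand.grid(dn, stringsAsFactors = FALSE)

# Build labels of the form index1.val1.index2.val2.
parts <- unname(Map(function(n, v) paste(n, v, sep = "."), names(dn), grid))
labels <- do.call(paste, c(parts, sep = "."))

# Combinations with no data would show up as NA here.
out <- data.frame(value = as.vector(unclass(res)), row.names = labels)
print(out)
```

The labels are derived from the result itself rather than from the input data, so they should stay aligned with the values regardless of how the input rows are ordered.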
I had thought to use attr(byOut,"dimnames") for this, but the author of
"by" chose to output that as a string rather than as a vector... and I'm
too lazy to figure out parsing that at this point. I'm thinking it's
probably easier to determine the order external to "by".
Finally ... my question ... the help for "by" says: "A data frame is
split by row into data frames subsetted by the values of one or more
factors". Should I infer from this that the elements are factorized?
And that the order of the rows would be the same as if we did factor,
with the default options (i.e. alphabetical)? Further, is that applied
iteratively, with each subgroup broken into the factors for the
remaining indices (which would have the order as if they were
factorized)? And that the order of the data has no bearing on the order
of the results?
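A small experiment along these lines is one way to probe the ordering question. The data and the names g, g_rev and x are hypothetical. It only checks a single index, and it shows behavior on toy data rather than a documented guarantee.

```r
# Hypothetical data with the groups deliberately out of alphabetical order.
df <- data.frame(g = c("b", "a", "c", "a"), x = 1:4)

# The same rows in a different order.
shuffled <- df[c(3, 1, 4, 2), ]

r1 <- by(df, df["g"], function(d) sum(d$x))
r2 <- by(shuffled, shuffled["g"], function(d) sum(d$x))

# Group order in the result compared with default factor levels.
print(dimnames(r1)$g)
print(levels(factor(df$g)))

# Results for the original and the shuffled rows.
print(as.vector(unclass(r1)))
print(as.vector(unclass(r2)))

# Supplying a factor with explicit levels to see whether the result follows them.
df$g_rev <- factor(df$g, levels = c("c", "b", "a"))
r3 <- by(df, df["g_rev"], function(d) sum(d$x))
print(dimnames(r3)$g_rev)
```

If the first two printed vectors agree, and the results for the original and shuffled rows agree, that would suggest the grouping follows factor levels rather than row order, at least for this simple case.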
Hopefully that makes sense.
Or does someone else have a better way of getting the job done?
thanks, Daryl Morris
FHCRC, UW