[R] ref card for data manipulation?
hadley wickham
h.wickham at gmail.com
Thu Dec 11 15:19:03 CET 2008
>> You (as many before you) have overlooked the ave() function, which can
>> replace the ordering as well as the do.call(c,tapply(....))
>>
>
> The majority of questions on this list concern data manipulation. Many are
> repetitive. "Overlooking" like that will always happen unless some
> comprehensive data manipulation documentation is made.
> I think many people would benefit if a specialized data.manip ref.card were
> conceived.
I like the idea, but is a reference card really enough? To me, what
most people need to tackle data manipulation problems is a broad
strategy, not a list of useful functions. plyr is a codification of
my most recent ideas on one such strategy: splitting a big data
structure into smaller pieces, applying a function to each piece and
then joining them back together. Just recognising that your problem can
be solved with this strategy is a big step forward; the functions in
plyr just save you some typing and a bit of thought compared to doing
it in base R.
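To make the strategy concrete, here is a rough base R sketch of the
three steps. The data frame, its column names and the choice of mean()
as the summary are all made up for illustration; this is not how plyr
does it internally, only the same idea written out by hand.

```r
# Toy data: hypothetical values in three groups (names are illustrative).
df <- data.frame(
  group = c("a", "b", "a", "c", "b", "a"),
  value = c(1, 4, 3, 10, 6, 5)
)

# 1. Split: one smaller data frame per group.
pieces <- split(df, df$group)

# 2. Apply: summarise each piece independently of the others.
summaries <- lapply(pieces, function(piece) {
  data.frame(group = piece$group[1], mean_value = mean(piece$value))
})

# 3. Combine: stack the per-group results back into one data frame.
result <- do.call(rbind, summaries)

# ave() covers all three steps in one call when the result should line
# up with the original rows (here: each row gets its group's mean).
df$group_mean <- ave(df$value, df$group, FUN = mean)
```

Once the problem is phrased as split, apply, combine, the choice
between split()/lapply(), tapply(), ave() or plyr is mostly a matter of
what shape you want the output to have.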
Recognising this strategy has helped me in my own data manipulation
problems - many tasks with which I used to struggle are now easy to
solve, not just because of plyr, but because I have a framework in
which to think about the problem. But this is just one strategy and
there must be many more common strategies waiting to be identified. I
think working on this would be time better spent - describing a
strategy gives people the tools to help themselves. (Of course this
doesn't help the people who just want canned answers, but I'm less
interested in helping them.)
Hadley
--
http://had.co.nz/