[R-pkgs] [ANN] plyr version 0.1.7
hadley wickham
h.wickham at gmail.com
Thu Apr 16 01:46:19 CEST 2009
plyr is a set of tools for a common set of problems: you need to break
down a big data structure into manageable pieces, operate on each
piece and then put all the pieces back together. For example, you
might want to:
* fit the same model to subsets of a data frame
* quickly calculate summary statistics for each group
* perform group-wise transformations like scaling or standardising
* eliminate for-loops in your code
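As a rough illustration of the tasks listed above, here is a minimal sketch using the built-in mtcars data. It assumes a recent plyr release (it has not been checked against 0.1.7), and the column names in the results (mean_mpg, n, mpg_z) are made up for the example.

```r
library(plyr)

# Fit the same linear model to each subset of a data frame (one per cyl value).
models <- dlply(mtcars, .(cyl), function(df) lm(mpg ~ wt, data = df))

# Calculate summary statistics for each group.
group_stats <- ddply(mtcars, .(cyl), function(df) {
  data.frame(mean_mpg = mean(df$mpg), n = nrow(df))
})

# Group-wise transformation: standardise mpg within each cyl group.
scaled <- ddply(mtcars, .(cyl), function(df) {
  transform(df, mpg_z = (mpg - mean(mpg)) / sd(mpg))
})
```

In each case the data frame is split by cyl, the function is applied to every piece, and the results are put back together, with no explicit for-loop.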
It's already possible to do this with built-in functions (like split
and the apply functions), but plyr just makes it all a bit easier
with:
* absolutely consistent names, arguments and outputs
* input from and output to data.frames, matrices and lists
* progress bars to keep track of long-running operations
* built-in error recovery, and informative error messages
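A small sketch of the last two points, assuming a recent plyr release: failwith wraps a function so that an error yields a default value, and the .progress argument draws a progress bar. The function risky and its inputs are hypothetical, chosen only to trigger one failure.

```r
library(plyr)

# A hypothetical function that fails on some inputs.
risky <- function(x) {
  if (x < 0) stop("negative input")
  sqrt(x)
}

# failwith() returns a wrapped function that gives NA instead of stopping.
safe <- failwith(NA, risky, quiet = TRUE)

# .progress = "text" shows a text progress bar while the pieces are processed.
res <- llply(list(4, -1, 9), safe, .progress = "text")
```

Without the wrapper, the second element would stop the whole call; with it, the failure is recorded as NA and the remaining pieces are still processed.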
Considerable effort has gone into making plyr fast and memory
efficient, and in most cases it is faster than the built-in functions.
You can find out more at http://had.co.nz/plyr/, including a 20-page
introductory guide at http://had.co.nz/plyr/plyr-intro.pdf. You can ask
questions about plyr (and data manipulation in general) on the plyr
mailing list. Sign up at http://groups.google.com/group/manipulatr
plyr 0.1.7 (2009-04-15) ---------------------------------------------------
Ensure that rbind.fill preserves attributes.
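For readers who have not used rbind.fill: it row-binds data frames whose columns differ, filling the gaps with NA. A minimal sketch with two made-up data frames, assuming a recent plyr release:

```r
library(plyr)

# Two data frames that share only the column x.
a <- data.frame(x = 1:2, y = c("p", "q"))
b <- data.frame(x = 3, z = TRUE)

# Columns missing from either input are filled with NA.
combined <- rbind.fill(a, b)
```

The result has three rows and the columns x, y and z.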
plyr 0.1.6 (2009-04-15) ---------------------------------------------------
Improvements:
* all ply functions deal more elegantly with function names: you can
supply a vector of function names, and each name is used as a label in
the output
* failwith and each now work with function names as well as functions
(i.e. "nrow" instead of nrow)
* each now accepts a list of functions or a vector of function names
* l*ply will use list names where present
* if .inform is TRUE, error messages will tell you where in your data
the error occurred - hopefully this will make problems easier to
track down
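A short sketch of two of these improvements, assuming a recent plyr release: each given function names as strings, and llply carrying list names through to the output. The input values are made up.

```r
library(plyr)

# each() combines several functions into one; here they are given by name.
range_of <- each("min", "max")
extremes <- range_of(c(3, 1, 2))

# l*ply functions use the names of the input list to label the output.
totals <- llply(list(a = 1:3, b = 4:6), sum)
```

Here extremes is labelled with the function names, and totals keeps the names a and b.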
Speed-ups:
* massive speed-ups for splitting large arrays
* fixed typo that was causing a 50% speed penalty for d*ply
* rewritten rbind.fill is considerably (> 4x) faster for many data frames
* colwise about twice as fast
Bug fixes:
* daply: now works when the data frame is split by multiple variables
* aaply: now works with vectors
* ddply: first variable now varies slowest as you'd expect
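To make the daply and ddply fixes concrete, here is a sketch that splits mtcars by two variables. It assumes a recent plyr release, and the output column name mean_mpg is made up for the example.

```r
library(plyr)

# daply split by two variables: one array dimension per variable.
mpg_array <- daply(mtcars, .(cyl, gear), function(df) mean(df$mpg))

# ddply split by two variables: the first variable (cyl) varies slowest,
# so all rows for one cyl value appear together.
mpg_table <- ddply(mtcars, .(cyl, gear), function(df) {
  data.frame(mean_mpg = mean(df$mpg))
})
```

mtcars has three distinct values each of cyl and gear, so mpg_array is 3 by 3, while mpg_table has one row per combination that actually occurs in the data.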
plyr 0.1.5 (2008-02-23) ---------------------------------------------------
* colwise now accepts a quoted list as its second argument. This
allows you to specify the names of columns to work on: colwise(mean,
.(lat, long))
* d_ply and a_ply now correctly pass ... to the function
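The colwise call above can be tried on the built-in quakes data, which happens to have lat and long columns. This is a sketch assuming a recent plyr release:

```r
library(plyr)

# Build a function that applies mean() only to the lat and long columns.
mean_position <- colwise(mean, .(lat, long))

# Apply it to a data frame; the result is a one-row data frame.
centre <- mean_position(quakes)
```

The other columns of quakes (depth, mag, stations) are left out of the result.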
--
http://had.co.nz/