[R-pkgs] [ANN] plyr version 0.1.7

hadley wickham h.wickham at gmail.com
Thu Apr 16 01:46:19 CEST 2009


plyr is a set of tools for a common set of problems: you need to break
down a big data structure into manageable pieces, operate on each
piece and then put all the pieces back together.  For example, you
might want to:

  * fit the same model to subsets of a data frame
  * quickly calculate summary statistics for each group
  * perform group-wise transformations like scaling or standardising
  * eliminate for-loops in your code

It's already possible to do this with built-in functions (like split
and the apply functions), but plyr just makes it all a bit easier
with:

  * absolutely consistent names, arguments and outputs
  * input from and output to data.frames, matrices and lists
  * progress bars to keep track of long running operations
  * built-in error recovery, and informative error messages
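
As a rough illustration of the split, apply and combine pattern described
above, the sketch below computes a per-group mean twice: once with split
and sapply, and once with ddply.  The data frame df and its columns g and
x are made up for this example, and the code assumes a reasonably recent
plyr is installed.

```r
library(plyr)

# Hypothetical example data: a grouping column and a numeric column
df <- data.frame(g = c("a", "a", "b", "b", "b"),
                 x = c(1, 3, 2, 4, 6))

# Base R: split the values by group, apply mean to each piece, combine
pieces <- split(df$x, df$g)
base_means <- sapply(pieces, mean)

# plyr: data frame in, data frame out, one row per group
plyr_means <- ddply(df, .(g), function(piece) {
  data.frame(mean_x = mean(piece$x))
})
```

Both versions give the same numbers; the ddply result also carries the
grouping column along as a regular data frame column.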

Considerable effort has been put into making plyr fast and memory
efficient, and in most cases it is faster than the built-in functions.

You can find out more at http://had.co.nz/plyr/, including a 20-page
introductory guide, http://had.co.nz/plyr/plyr-intro.pdf.  You can ask
questions about plyr (and data manipulation in general) on the plyr
mailing list.  Sign up at http://groups.google.com/group/manipulatr.


plyr 0.1.7 (2009-04-15) ---------------------------------------------------

Ensure that rbind.fill preserves attributes.
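
For readers who have not used rbind.fill, a minimal sketch of what it
does: it row-binds data frames whose columns differ and fills the gaps
with NA.  The data frames df1 and df2 are invented for the example, and
the snippet does not try to demonstrate the attribute fix itself.

```r
library(plyr)

# Two hypothetical data frames with no columns in common
df1 <- data.frame(a = 1:2)
df2 <- data.frame(b = 3)

# Rows are stacked; columns missing from an input are filled with NA
combined <- rbind.fill(df1, df2)
```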

plyr 0.1.6 (2009-04-15) ---------------------------------------------------

Improvements:

* all ply functions handle function names more elegantly: you can
supply a vector of function names, and each name is used as a label in
the output
* failwith and each now work with function names as well as functions
(i.e. "nrow" instead of nrow)
* each now accepts a list of functions or a vector of function names
* l*ply will use list names where present
* if .inform is TRUE, error messages will tell you where in your data
the error occurred, which should make problems easier to track down
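
A small sketch of the function-name handling listed above, assuming a
reasonably recent plyr.  The variable names (by_object, by_name, from_vec,
safe_log) are chosen for the example only.

```r
library(plyr)

# each() combines several functions into one; function objects and
# function names can be used interchangeably
by_object <- each(min, max)(1:10)
by_name   <- each("min", "max")(1:10)
from_vec  <- each(c("min", "max"))(1:10)

# failwith() wraps a function so that an error returns a default value
# instead of stopping; here the function is given by name
safe_log <- failwith(NA, "log", quiet = TRUE)
ok  <- safe_log(1)    # log(1) succeeds
bad <- safe_log("a")  # log("a") would raise an error, so NA is returned
```

In each case the function names become the labels of the result, so the
output of each() is a vector with elements named min and max.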

Speed-ups:

* massive speed ups for splitting large arrays
* fixed typo that was causing a 50% speed penalty for d*ply
* rewritten rbind.fill is considerably (> 4x) faster for many data frames
* colwise about twice as fast

Bug fixes:

* daply: now works when the data frame is split by multiple variables
* aaply: now works with vectors
* ddply: first variable now varies slowest as you'd expect


plyr 0.1.5 (2009-02-23) ---------------------------------------------------

* colwise now accepts a quoted list as its second argument.  This
allows you to specify the names of columns to work on: colwise(mean,
.(lat, long))
* d_ply and a_ply now correctly pass ... to the function
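
The colwise call mentioned above can be tried on a small data frame.  This
is only a sketch: the data frame places and its values are hypothetical,
with lat and long chosen to match the column names in the example.

```r
library(plyr)

# Hypothetical data: one label column and two numeric columns
places <- data.frame(name = c("p", "q", "r"),
                     lat  = c(10, 20, 30),
                     long = c(-5, 0, 5))

# colwise() turns mean into a function that works column by column;
# the quoted list restricts it to the lat and long columns
col_means <- colwise(mean, .(lat, long))(places)
```

The result is a one-row data frame with just the lat and long columns, so
the non-numeric name column never reaches mean.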


-- 
http://had.co.nz/


