[R] lapply with data frame

Noah Silverman noah at smartmediacorp.com
Sun Feb 28 03:49:40 CET 2010


I'm a bit confused about how to use lapply with a data.frame.

For example:

lapply(data, function(x) print(x))

WHAT exactly is passed to the function?  Is it each ROW in the data 
frame, one by one, or each column, or the entire frame in one shot?
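
A minimal sketch of how this can be checked, assuming a toy data frame named df (hypothetical, not the real data). A data frame is stored as a list of columns, so lapply should call the function once per column, passing the whole column vector each time:

```r
# Toy data frame (hypothetical stand-in for the real data)
df <- data.frame(id    = 1:6,
                 group = c("A", "A", "A", "B", "B", "B"),
                 value = c(3.2, 3.0, 3.1, 5.5, 6.0, 6.2))

# Record the length of whatever lapply hands to the function.
# Each call receives one full column, so every length equals nrow(df).
col_info <- lapply(df, function(x) length(x))

names(col_info)    # one entry per column, named after the column
unlist(col_info)   # each entry is the number of rows
```

The result is a list with one element per column, which suggests that neither single rows nor the entire frame are passed.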

What I want to do is apply a function to each row in the data frame.  Is 
lapply the right way?
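
One possible way to go row by row, sketched under the assumption that the per-row function (here the made-up row_fun) accepts a one-row data frame, is to lapply over the row indices rather than over the data frame itself:

```r
# Toy data frame (hypothetical)
df <- data.frame(id = 1:3, value = c(3.2, 3.0, 3.1))

# Hypothetical per-row function: takes a one-row data frame
row_fun <- function(row) row$id * row$value

# Iterate over row numbers; df[i, ] extracts row i as a one-row data frame
res <- lapply(seq_len(nrow(df)), function(i) row_fun(df[i, ]))

unlist(res)   # one result per row
```

apply(df, 1, ...) is another option, but it converts the data frame to a matrix first, so mixed column types would be coerced to a common type.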

A second application is to normalize a column value by group.  For 
example, if I have the following table:
id    group    value    norm
1     A        3.2
2     A        3.0
3     A        3.1
4     B        5.5
5     B        6.0
6     B        6.2
etc...

The long version would be:
for (g in unique(data$group)) {
     idx <- data$group == g    # rows belonging to the current group
     data$norm[idx] <- data$value[idx] / sum(data$value[idx])
}
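
A vectorized sketch of the same per-group normalization, assuming the table above is held in a data frame called df (hypothetical name). It uses base R's ave(), which returns a vector as long as its input with each group's summary repeated for every row in that group, so no explicit loop is needed:

```r
# Toy data frame mirroring the table above (hypothetical name df)
df <- data.frame(id    = 1:6,
                 group = c("A", "A", "A", "B", "B", "B"),
                 value = c(3.2, 3.0, 3.1, 5.5, 6.0, 6.2))

# ave() gives each row the sum of its own group's values;
# dividing by that sum normalizes value within each group
df$norm <- df$value / ave(df$value, df$group, FUN = sum)

# The normalized values should add up to 1 within each group
tapply(df$norm, df$group, sum)
```

This is only one option; tapply or split plus lapply could be arranged to give the same result.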

There must be a faster way to do this with lapply.  (Ideally, I'd then 
use mclapply to run on multi-cores and really crank up the speed.)
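
A rough sketch of what the split/lapply form might look like with mclapply, assuming the parallel package that ships with current R and a toy data frame df (hypothetical). mc.cores is set to 1 here so the example runs everywhere; forking with more cores is not supported on Windows. For an operation this cheap, the overhead of forking may well outweigh any gain:

```r
library(parallel)

# Toy data frame (hypothetical)
df <- data.frame(id    = 1:6,
                 group = c("A", "A", "A", "B", "B", "B"),
                 value = c(3.2, 3.0, 3.1, 5.5, 6.0, 6.2))

# Split the values into one vector per group
groups <- split(df$value, df$group)

# Normalize each group's vector; raise mc.cores on systems that support forking
normed <- mclapply(groups, function(v) v / sum(v), mc.cores = 1L)

# Put the per-group results back in the original row order
df$norm <- unsplit(normed, df$group)
```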

Any suggestions?


