[R] lapply with data frame

David Winsemius dwinsemius at comcast.net
Sun Feb 28 04:03:52 CET 2010


On Feb 27, 2010, at 9:49 PM, Noah Silverman wrote:

> I'm a bit confused on how to use lapply with a data.frame.
>
> For example.
>
> lapply(data, function(x) print(x))
>
> WHAT exactly is passed to the function.  Is it each ROW in the data  
> frame,

No.

> one by one, or each column,

Yes. Dataframes are lists of columns.
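
A quick way to see this for yourself (a minimal sketch; the small data frame `dfrm` below is made up for illustration and is not the poster's data):

```r
# Hypothetical data frame, invented for this illustration
dfrm <- data.frame(id = 1:3, value = c(3.2, 3.0, 3.1))

# A data frame is a list with one element per column
is.list(dfrm)     # TRUE
length(dfrm)      # number of columns, not rows

# lapply() therefore hands the function one whole column at a time
col_classes <- lapply(dfrm, function(x) class(x))
col_lengths <- lapply(dfrm, length)

# The result is a list named after the columns
print(names(col_classes))
```

Each call receives a full column vector, so `col_lengths` holds the number of rows once per column.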

> or the entire frame in one shot?
>
> What I want to do is apply a function to each row in the data frame.
> Is lapply the right way?

No. Use apply(dtfrm, 1, ......)
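
Something along these lines, as a sketch only (the data frame `dtfrm` and the row functions are invented for illustration). Note that apply() converts the data frame to a matrix first, so it behaves best when all columns are of one type:

```r
# Hypothetical all-numeric data frame, invented for this illustration
dtfrm <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# MARGIN = 1 means "by row": the function receives one row as a vector
row_totals <- apply(dtfrm, 1, function(row) sum(row))

# Any function of a single row can be supplied the same way
row_ranges <- apply(dtfrm, 1, function(row) max(row) - min(row))

print(row_totals)
print(row_ranges)
```

With mixed numeric and character columns the conversion to a matrix would turn every value into a character string, so in that case it may be safer to select the numeric columns before calling apply().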

>
> A second application is to normalize a column value by group.

Which is, as you suggested, a different problem for which apply()  
would not be particularly useful because you have a group. Hence  
tapply or one of its variants, aggregate() or by() would be used:

For your example, I am guessing that:

tapply(dfrm$value, dfrm$group, sum)

... might be more economical (at least in single core practice.)
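
A fuller sketch, using the six rows from the quoted table (the name `dfrm` and the use of ave() for the per-row normalization are my assumptions about what is wanted, not something stated in the question):

```r
# The six rows from the quoted table; the data frame name is assumed
dfrm <- data.frame(
  id    = 1:6,
  group = c("A", "A", "A", "B", "B", "B"),
  value = c(3.2, 3.0, 3.1, 5.5, 6.0, 6.2)
)

# tapply() returns one sum per group
group_sums <- tapply(dfrm$value, dfrm$group, sum)
print(group_sums)

# ave() returns the group sum repeated for every row,
# so the division lines up row by row with no explicit loop
dfrm$norm <- dfrm$value / ave(dfrm$value, dfrm$group, FUN = sum)
print(dfrm)
```

tapply() alone gives the denominators; ave() is one base-R way to spread them back over the rows so that `norm` sums to 1 within each group.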


-- 
David

>  For example, if I have the following table:
> id    group    value      norm
> 1    A            3.2
> 2    A            3.0
> 3    A            3.1
> 4    B            5.5
> 5    B            6.0
> 6    B            6.2

I could not quite figure out how that might have been printed on a  
console,  since there are more variable names than columns????

> etc...

Yes. I do think there is more than you are revealing.

>
> The long version would be:

foreach is not a base function:

> foreach (group in unique(data$group)){
>    data$norm[group==group] <- data$value[group==group] / sum(data$value[group==group])
> }
>
> There must be a faster way to do this with lapply.  (Ideally, I'd  
> then use mclapply to run on multi-cores and really crank up the  
> speed.)

Learn your basics first. Libraries or packages need to be specified.
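
For the record, a sketch of what a split-and-lapply version might look like with the package named explicitly. In current R, mclapply() is provided by the parallel package; the data frame name, the helper `normalise_group`, and `mc.cores = 1L` are assumptions made for illustration (values above 1 only work on Unix-alikes):

```r
library(parallel)  # provides mclapply() in current versions of R

# The six rows from the quoted table; the data frame name is assumed
dfrm <- data.frame(
  group = c("A", "A", "A", "B", "B", "B"),
  value = c(3.2, 3.0, 3.1, 5.5, 6.0, 6.2)
)

# Hypothetical helper: normalise one group's values by that group's sum
normalise_group <- function(d) {
  d$norm <- d$value / sum(d$value)
  d
}

# split() makes one small data frame per group;
# mclapply() applies the helper to each piece
pieces     <- split(dfrm, dfrm$group)
normalised <- mclapply(pieces, normalise_group, mc.cores = 1L)

# Stack the pieces back into a single data frame
result <- do.call(rbind, normalised)
print(result)
```

For six rows the overhead of forking would outweigh any gain, so this only illustrates the shape of the call.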

>
> Any suggestions?

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
