[R] tapply with multiple arguments that are not part of the same data frame

Kavitha Venkatesan kavitha.venkatesan at gmail.com
Thu Oct 22 22:50:24 CEST 2009


I just realized that my earlier post of the question below was not sent
in plain-text mode, hence this repeat post. Apologies!
Kavitha


On Thu, Oct 22, 2009 at 4:19 PM, Kavitha Venkatesan
<kavitha.venkatesan at gmail.com> wrote:
> Hi all,
>
> I would like to invoke a function on split parts of a data frame. The
> function takes multiple arguments: some are specified columns of the data
> frame, and others are independent of the data frame. How do I do this?
>
> For example, let's say I have a data frame
> > fitness_data
> name   height  weight  country
> rob    5.8     200     usa
> nancy  5.5     140     germany
> jen    5.6     150     usa
> clark  5.10    210     germany
> matt   5.9     280     canada
> ralph  6       270     canada
> ...
> ...
>
> Now let us say I have a function,  my_func(h, w, noise, dir), which takes as
> input:
> (1) a vector of heights
> (2) a vector of weights
> (3) a user-input numeric "noise" value
> (4) a user-input string "dir" naming the directory that the function
> writes its end result to
>
> This function does some calculations on the input data and outputs a
> dataframe that is then written to a file in the "dir" directory.
>
> If I want to apply this function to data grouped by each country in the
> "fitness_data" dataframe, how would I do this? I tried looking through the
> mailing archives, but couldn't nail down the solution. I tried something
> like
>
> split(mapply( function(a,b,c,d) my_func(fitness_data$h, fitness_data$w, 2.5,
> my_directory)), fitness_data$country)
>
> but this considered fitness_data$h and fitness_data$w one row at a time
> for each country, rather than as vectors of heights and weights across all
> rows corresponding to that country.
>
> Thanks!
>
>
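
For reference, the per-country call described above might be sketched in base R with split() and lapply(). This is only a sketch under stated assumptions: the body of my_func below is a hypothetical stand-in (the post does not show the real calculation), tempdir() stands in for my_directory, and writing one uniquely named CSV file per call is an assumption about how the output is stored.

```r
# Toy data mirroring the example table (R stores 5.10 as the number 5.1)
fitness_data <- data.frame(
  name    = c("rob", "nancy", "jen", "clark", "matt", "ralph"),
  height  = c(5.8, 5.5, 5.6, 5.10, 5.9, 6),
  weight  = c(200, 140, 150, 210, 280, 270),
  country = c("usa", "germany", "usa", "germany", "canada", "canada"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for my_func(h, w, noise, dir): the real calculation
# is not given, so this one just computes a noisy weight/height ratio
my_func <- function(h, w, noise, dir) {
  out <- data.frame(height = h, weight = w, ratio = w / h + noise)
  # Assumption: each call writes its own uniquely named file into 'dir'
  write.csv(out, file = tempfile(tmpdir = dir, fileext = ".csv"),
            row.names = FALSE)
  out
}

my_directory <- tempdir()  # assumed output directory for this sketch

# split() cuts the data frame into one sub-data-frame per country
pieces <- split(fitness_data, fitness_data$country)

# lapply() calls my_func once per country, passing whole column vectors;
# noise and dir are the same fixed values for every group
results <- lapply(pieces, function(d) {
  my_func(d$height, d$weight, 2.5, my_directory)
})
```

Here results is a list with one data frame per country, named by country. The base-R function by(fitness_data, fitness_data$country, ...) takes the same kind of per-group function and could be used instead of the explicit split() and lapply() pair.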



