[R] tapply with multiple arguments that are not part of the same data frame
Kavitha Venkatesan
kavitha.venkatesan at gmail.com
Thu Oct 22 22:50:24 CEST 2009
I just realized my earlier post of my question below was not in
"Plain" Text mode, hence the repeat post...apologies!
Kavitha
On Thu, Oct 22, 2009 at 4:19 PM, Kavitha Venkatesan
<kavitha.venkatesan at gmail.com> wrote:
> Hi all,
>
> I would like to invoke a function that takes multiple arguments (some of
> which are columns of a data frame, and others that are independent of the
> data frame) on split parts of that data frame. How do I do this?
>
> For example, let's say I have a data frame
> > fitness_data
> name height weight country
> rob 5.8 200 usa
> nancy 5.5 140 germany
> jen 5.6 150 usa
> clark 5.10 210 germany
> matt 5.9 280 canada
> ralph 6 270 canada
> ...
> ...
>
> Now let us say I have a function, my_func(h, w, noise, dir), which takes as
> input:
> (1) a vector of heights
> (2) a vector of weights
> (3) a user-input numeric "noise" value
> (4) a user-input string "dir" for the directory to output the end result of
> the function to
>
> This function does some calculations on the input data and outputs a
> dataframe that is then written to a file in the "dir" directory.
>
> If I want to apply this function to data grouped by each country in the
> "fitness_data" dataframe, how would I do this? I tried looking through the
> mailing archives, but couldn't nail down the solution. I tried something
> like
>
> split(mapply( function(a,b,c,d) my_func(fitness_data$h, fitness_data$w, 2.5,
> my_directory)), fitness_data$country)
>
> but this considered fitness_data$h and fitness_data$w one row at a time
> for a country, rather than as vectors of heights and weights across all
> rows corresponding to that country.
>
> Thanks!
>
>
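One possible approach, sketched here under stated assumptions: split the data frame by country first, then call the function once per piece, passing the extra arguments as constants. The body of my_func below is a hypothetical stand-in (the post does not show the real calculation), the output file name is made up, and my_directory is assumed to be any existing, writable directory.

```r
# Example data: the six rows shown in the post.
fitness_data <- data.frame(
  name    = c("rob", "nancy", "jen", "clark", "matt", "ralph"),
  height  = c(5.8, 5.5, 5.6, 5.10, 5.9, 6),
  weight  = c(200, 140, 150, 210, 280, 270),
  country = c("usa", "germany", "usa", "germany", "canada", "canada"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for my_func(h, w, noise, dir). The real calculation
# is not given in the post; this one computes weight / height + noise,
# writes the result to a CSV file in `dir`, and returns the data frame.
my_func <- function(h, w, noise, dir) {
  out <- data.frame(height = h, weight = w, score = w / h + noise)
  out_file <- tempfile(pattern = "result_", tmpdir = dir, fileext = ".csv")
  write.csv(out, out_file, row.names = FALSE)
  out
}

my_directory <- tempdir()  # assumption: any existing, writable directory

# Option 1: split the whole data frame, then call my_func once per country.
pieces  <- split(fitness_data, fitness_data$country)
results <- lapply(pieces, function(d) {
  my_func(d$height, d$weight, 2.5, my_directory)
})

# Option 2: split each column separately and let mapply pair them up;
# arguments that do not vary by group go in MoreArgs.
results2 <- mapply(
  my_func,
  split(fitness_data$height, fitness_data$country),
  split(fitness_data$weight, fitness_data$country),
  MoreArgs = list(noise = 2.5, dir = my_directory),
  SIMPLIFY = FALSE
)
```

In both cases the result is a list with one element per country, and each call receives the full vector of heights and weights for that country rather than a single row. by(fitness_data, fitness_data$country, ...) with the same anonymous function as in Option 1 should behave similarly.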