[R] Probably a good use for apply
ROLL Josh F
JRoll at lcog.org
Fri Jun 1 01:25:05 CEST 2012
Yes, you are correct. I need to change my sample size specification to the number of elements in the vector.
So sampleWorker function should be:
sampleWorker <- function(x) return(sample(c(TRUE,FALSE),length(x), replace = TRUE, prob = c(x, 1-x)))
So this is where I get a little confused with using the apply functions. Isn't x each element of each vector? In the sample data I provide there are 4 x's, and each would be passed to the sampleWorker function by lapply.
#sample data
test_<- list(a=c(.85,.10),b=c(.99,.05))
To show what I want without using a list of vectors and instead just a single one see below:
IsWorker.Hh_ <- lapply(c(.9,.1) , sampleWorker)
#Returns:
[[1]]
[1] TRUE
[[2]]
[1] FALSE
Now I just need to run through each vector of the list I specify, in this case test_. Then I need to sum the TRUEs for each vector. So again, if we assume the test_ data would result in a single TRUE for each vector (because of the .85 and .99 probabilities), the result would be:
> IsWorker_
$a
[1] 1
$b
[1] 1
Perhaps lapply isn't the right tool? I have seen a couple of comments on the list that say the plyr package is easy to figure out but that you lose out on speed, and speed is my issue right now. I can do what I need to do using some for loops, but it's far too slow. Any guidance is appreciated. Thanks guys
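For reference, a minimal sketch of one way to get the per-vector counts, assuming each probability is an independent TRUE/FALSE draw. Note that lapply() over a list passes each whole vector to the function, not each single element. The helper name countWorkers is made up for illustration, and rbinom() is swapped in for sample() because it accepts a vector of probabilities:

```r
# Sample data from the message above
test_ <- list(a = c(.85, .10), b = c(.99, .05))

# lapply() hands each whole vector (not each single probability) to the function.
# countWorkers is a hypothetical helper: one 0/1 draw per probability, then summed.
countWorkers <- function(p) sum(rbinom(length(p), size = 1, prob = p))

set.seed(1)  # only so the sketch is reproducible
IsWorker_ <- lapply(test_, countWorkers)

# The same idea with the scalar sampleWorker() defined earlier, nested inside lapply()
sampleWorker <- function(x) sample(c(TRUE, FALSE), length(x), replace = TRUE, prob = c(x, 1 - x))
IsWorker2_ <- lapply(test_, function(v) sum(sapply(v, sampleWorker)))
```

Both versions return a list with one count per vector; the first avoids the inner loop over elements, which may help with speed on long vectors.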
Josh
-----Original Message-----
From: Sarah Goslee [mailto:sarah.goslee at gmail.com]
Sent: Thursday, May 31, 2012 1:35 PM
To: ROLL Josh F
Cc: r-help at r-project.org
Subject: Re: [R] Probably a good use for apply
Hi,
On Thu, May 31, 2012 at 1:08 PM, LCOG1 <jroll at lcog.org> wrote:
> This is great thank you. I think I am getting the hang of some of the
> apply functions. I am stuck again however. I have list test_ below
> and would like to apply the sample function using each element of each
> vector as the probability and return a TRUE or FALSE that I will
> ultimately sum the TRUES by vector.
>
> test_<- list(a=c(.85,.10),b=c(.99,.05))
> #Write a function to sample based on labor force participation rates
> #to determine presence of workers in household
> sampleWorker <- function(x)
> return(sample(c(TRUE,FALSE),x, replace = TRUE, prob = c(x, 1-x)))
Your first problem is that sampleWorker() doesn't run with a single component of test_ so it can't possibly run in an apply statement.
Please reread ?sample - the second argument is the size of the desired sample, but what you are passing is a non-integer vector of length 2.
What do you actually want this to be?
Then for prob, you're passing c(x, 1-x), but x is again a non-integer vector of length 2, so that results in a vector of length 4, which is longer than the number of options sample() is choosing from.
Do you perhaps want to pass only a single probability at a time? But even then you need to resolve the size problem.
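A small sketch of the two points above, using one component of test_ as x. The size of 1 in the last call is an assumption about what is wanted, not something stated in the original post:

```r
x <- c(.85, .10)        # one component of test_

# c(x, 1 - x) has four weights, but sample() is choosing between only two values
probs <- c(x, 1 - x)
length(probs)

# Passing one probability at a time, with an explicit integer size, is well defined
p <- x[1]
draw <- sample(c(TRUE, FALSE), size = 1, prob = c(p, 1 - p))
```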
Sarah
> IsWorker.Hh_ <- lapply(test , sampleWorker)
>
> I am doing something wrong with the setup because I am getting an
> error about specifying probabilities incorrectly.
>
> The result I am looking for is for IsWorker_ to be (assuming the .85
> and .99 probabilities 'win' from each vector and the lower values do not):
>
>> IsWorker_
> $a
> [1]TRUE
> $b
> [1]TRUE
>
> but ultimately I will need to sum the TRUEs for each vector
>
>> IsWorker_
> $a
> [1] 1
> $b
> [1] 1
>
>
> Thanks
>
> Josh
>
--
Sarah Goslee
http://www.functionaldiversity.org