[R] Randomly select elements based on criteria
Petr Savicky
savicky at cs.cas.cz
Fri Mar 23 10:56:11 CET 2012
On Thu, Mar 22, 2012 at 11:42:53AM -0700, aly wrote:
> Hi,
>
> I want to randomly pick 2 fish born the same day but I need those
> individuals to be from different families. My table includes 1787 fish
> distributed in 948 families. An example of a subset of fish born in one
> specific day would look like:
>
> >fish
>
> fam born spawn
> 25 46 43
> 25 46 56
> 26 46 50
> 43 46 43
> 131 46 43
> 133 46 64
> 136 46 43
> 136 46 42
> 136 46 50
> 136 46 85
> 137 46 64
> 142 46 85
> 144 46 56
> 144 46 64
> 144 46 78
> 144 46 85
> 145 46 64
> 146 46 64
> 147 46 64
> 148 46 78
> 149 46 43
> 149 46 98
> 149 46 85
> 150 46 64
> 150 46 78
> 150 46 85
> 151 46 43
> 152 46 78
> 153 46 43
> 156 46 43
> 157 46 91
> 158 46 42
>
> Where "fam" is the family that fish belongs to, "born" is the day it was
> born (in this case day 46), and "spawn" is the day it was spawned. I want to
> know if there is a correlation in the day of spawn between fish born the
> same day but that are unrelated (not from the same family).
> I want to randomly select two rows but they have to be from different fam.
> The first part (random selection) I got by doing:
>
> > ran <- sample(nrow (fish), size=2); ran
>
> [1] 9 12
>
> > newfish <- fish [ran,]; newfish
>
> fam born spawn
> 103 136 46 50
> 106 142 46 85
>
> In this example I got two individuals from different families (good) but I
> will repeat the process many times and there's a chance that I get two fish
> from the same family (bad):
>
> > ran<-sample (nrow(fish), size=2);ran
>
> [1] 26 25
>
> > newfish <-fish [ran,]; newfish
>
> fam born spawn
> 127 150 46 85
> 126 150 46 78
>
> I need a conditional but I have no clue on how to include it in the code.
Hi.
Try the following.
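# pick the first fish uniformly from all rows
ran1 <- sample(nrow(fish), 1)
# rows belonging to any other family
ind <- which(fish$fam != fish$fam[ran1])
# pick the second fish uniformly among those rows
ran2 <- ind[sample(length(ind), 1)]
fish[c(ran1, ran2), ]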
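This generates a pair of fish from different families without
a loop. The distribution is not exactly the same as for the
rejection method suggested earlier. Assuming that method redraws
both rows until the families differ, it makes every admissible
pair equally likely, while here a pair is slightly more likely
when its first fish comes from a larger family, since fewer rows
then remain for the second draw.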
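Since the process is to be repeated many times, the same steps can
be wrapped in a function and called with replicate(). The following
is only a sketch: the helper name pick_pair, the number of
repetitions (1000) and the seed are my assumptions, and the data
frame holds just a few rows copied from the table in the question.

```r
# A few rows from the table in the question (fam, born, spawn).
fish <- data.frame(
  fam   = c(25, 25, 26, 43, 136, 136, 150, 150),
  born  = 46,
  spawn = c(43, 56, 50, 43, 43, 42, 64, 78)
)

# Hypothetical helper: row indices of two fish from different families.
pick_pair <- function(d) {
  ran1 <- sample(nrow(d), 1)
  ind  <- which(d$fam != d$fam[ran1])
  if (length(ind) == 0) stop("all rows belong to one family")
  ran2 <- ind[sample(length(ind), 1)]
  c(ran1, ran2)
}

set.seed(1)  # assumption: fixed seed, only to make the run reproducible

# Each column of idx is one pair of row indices (a 2 x 1000 matrix).
idx <- replicate(1000, pick_pair(fish))

# Spawn days of the first and second fish of each pair.
spawn1 <- fish$spawn[idx[1, ]]
spawn2 <- fish$spawn[idx[2, ]]
cor(spawn1, spawn2)
```

The same fish appear in many of the repeated pairs, so the pairs
are not independent of each other; the correlation above is best
read as a descriptive summary rather than as input to a standard
significance test.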
Hope this helps.
Petr Savicky.