[R] Problem in selecting rows in R

Bert Gunter bgunter.4567 at gmail.com
Sun Mar 20 17:08:00 CET 2016


Your example is wrong: What happened to Sp2 on 1/29?

You have also apparently mixed up lower and upper case: "Sp1", "SP3".
This will likely cause you great grief, so try to avoid or fix this in
your work.
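
A minimal sketch of how such case mix-ups could be found and repaired in a
factor. The vector below is a toy stand-in, not dat1 itself, though the
"SiteD"/"SIteD" pair does appear among the Observed.site levels in the
quoted data. The names site, lv, low and clash are made up for illustration.

```r
# Toy factor with one mis-capitalised label (illustrative values only)
site <- factor(c("SiteA", "SiteD", "SIteD", "SiteB"))

# Levels that collide once case is ignored
lv <- levels(site)
low <- tolower(lv)
clash <- lv[duplicated(low) | duplicated(low, fromLast = TRUE)]

# Merge the stray spelling into the intended one: giving two levels the
# same label collapses them into a single level
levels(site)[levels(site) == "SIteD"] <- "SiteD"
```

Which spelling is the intended one is a judgment call you have to make from
knowledge of the data; the code only shows the mechanics.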

Anyway, there are tons of ways to do this. dplyr is particularly good
at this sort of thing, I believe, so you might want to learn it.
However, I tend to use just base R, in which it is also pretty
straightforward.
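
For reference, a rough sketch of what a dplyr version of "last row per
Species and Month" might look like. This is not worked through in this
message: it assumes the dplyr package is installed, and it uses a small toy
data frame (toy, with a made-up Site column) rather than dat1.

```r
library(dplyr)

# Toy data: rows are assumed to be in the order in which they were recorded
toy <- data.frame(
  Species = c("Sp1", "Sp1", "Sp1", "Sp2", "Sp2"),
  Month   = c("Jan", "Feb", "Feb", "Jan", "Jan"),
  Site    = c("SiteA", "SiteC", "SiteA", "SiteB", "SiteD"),
  stringsAsFactors = FALSE
)

# Keep the last row of each Species/Month group
last_rows <- toy %>%
  group_by(Species, Month) %>%
  slice(n()) %>%
  ungroup()
```

As with any "last row" approach, this relies on the rows already being in
the desired order within each group.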

1) I assume "time period" = Month. Please be clear in what you mean in
future. To get the final result neatly arranged, you could use date
functions, after first converting your Time column to a Date or POSIXct
object. You could then use a function such as month() (e.g. from the
lubridate package) to get the month, properly ordered.
However, as this is a bit complicated, I'll just use brute force to
order the Month factor manually:

dat1$Month <- with(dat1, ordered(as.character(Month),
    levels = c("Jan", "Feb", "March", "April")))
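
For the date-function route mentioned in 1), a minimal base R sketch might
look like the following. It assumes the Time strings are month/day/2-digit
year, which is a guess from values such as "2/25/15"; time_chr, d and mon are
made-up names, and the vector is a toy sample rather than the full column.

```r
# A few Time values in the style of dat1 (assumed format: m/d/yy)
time_chr <- c("1/15/15", "2/25/15", "4/19/15", "3/5/15")

# Parse to Date; as.character() would be needed first if Time is a factor
d <- as.Date(time_chr, format = "%m/%d/%y")

# Month number -> abbreviated English month name, as an ordered factor.
# Using the numeric month with month.abb avoids locale-dependent names.
mon <- factor(month.abb[as.integer(format(d, "%m"))],
              levels = month.abb, ordered = TRUE)
```

Note that a month derived this way need not agree with the Month column
supplied in dat1: for example, the row with Time "1/29/15" is labelled "Feb"
there.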

2) Then this does what you want, I think (there are more elegant ways,
certainly):

do.call(rbind, with(dat1, by(dat1, list(Species, Month),
    FUN = function(x) x[nrow(x), ])))
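
One possible alternative in base R, offered only as a sketch: duplicated()
with fromLast = TRUE marks every row that has a later row with the same key,
so negating it keeps the last row per group. The toy data frame and the
names toy, key and last_rows are made up for illustration.

```r
# Toy data: rows are assumed to be in recording order
toy <- data.frame(
  Species = c("Sp1", "Sp1", "Sp1", "Sp2", "Sp2"),
  Month   = c("Jan", "Feb", "Feb", "Jan", "Jan"),
  Site    = c("SiteA", "SiteC", "SiteA", "SiteB", "SiteD"),
  stringsAsFactors = FALSE
)

# Grouping columns only
key <- toy[, c("Species", "Month")]

# A row is kept when no later row shares its Species/Month combination
last_rows <- toy[!duplicated(key, fromLast = TRUE), ]
```

Unlike the by() version, this keeps the rows in their original order and
does not depend on the ordering of the Month factor.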


(The rows come out grouped by Month. You can use the order() function to
sort the resulting data frame by Species if you prefer.)
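
A small sketch of that reordering step, using a toy result (res is a made-up
name and the values are illustrative, not the actual output for dat1):

```r
# Toy result, grouped by Month as the by() call would return it
res <- data.frame(
  Species = c("Sp1", "Sp2", "Sp1", "Sp5"),
  Month   = ordered(c("Jan", "Jan", "Feb", "Jan"),
                    levels = c("Jan", "Feb", "March", "April")),
  stringsAsFactors = FALSE
)

# Sort by Species first, then by Month within each Species
res_sorted <- res[order(res$Species, res$Month), ]
```

Because Month is an ordered factor, order() sorts it in calendar order
rather than alphabetically.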


Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Mar 20, 2016 at 7:15 AM, Kristi Glover
<kristi.glover at hotmail.com> wrote:
> Hi R Users,
> Some individuals were recorded at multiple sites within a time period, but I want to select the row for the last site within each time period for each individual. I have spent substantial time on this, but had no luck selecting the rows. Could you give me a hint? I have a very large data set; this is just an example.
> Thanks for your help.
>
> I want to get dat2 from dat1.
>
> dat1<-structure(list(sn = 1:16, Species = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 5L, 5L, 5L, 5L, 5L), .Label = c("Sp1",
> "Sp2", "SP3", "Sp4", "Sp5"), class = "factor"), Observed.site = structure(c(1L, 3L, 1L, 2L, 1L, 2L, 4L, 3L, 1L, 2L, 2L, 3L, 5L, 4L, 1L, 3L), .Label = c("SiteA",
> "SiteB", "SiteC", "SiteD", "SIteD"), class = "factor"), Time = structure(c(1L, 5L, 6L, 6L, 6L, 2L, 3L, 11L, 8L, 10L, 1L, 1L, 4L, 4L, 7L, 9L), .Label = c("1/15/15",
> "1/17/15", "1/29/15", "2/17/15", "2/25/15", "2/27/15", "2/28/15", "3/27/15", "3/5/15", "4/19/15", "7/3/15"), class = "factor"),
>     Month = structure(c(3L, 2L, 2L, 2L, 2L, 3L, 2L, 4L, 4L, 1L, 3L, 3L, 2L, 2L, 2L, 4L), .Label = c("April", "Feb", "Jan",
>     "March"), class = "factor")), .Names = c("sn", "Species", "Observed.site", "Time", "Month"), class = "data.frame", row.names = c(NA, -16L))
> dat1
> #---
> dat2<-structure(list(sn = c(1L, 5L, 6L, 9L, 10L, 11L, 12L, 15L, 16L), Species = structure(c(1L, 1L, 2L, 3L, 3L, 4L, 5L, 5L, 5L), .Label = c("Sp1",
> "Sp2", "SP3", "Sp4", "Sp5"), class = "factor"), Observed.site = structure(c(1L, 1L, 2L, 1L, 2L, 2L, 3L, 1L, 3L), .Label = c("SiteA", "SiteB",
> "SiteC"), class = "factor"), Time = structure(c(1L, 3L, 2L, 5L, 7L, 1L, 1L, 4L, 6L), .Label = c("1/15/15", "1/17/15", "2/27/15",
> "2/28/15", "3/27/15", "3/5/15", "4/19/15"), class = "factor"), Month = structure(c(3L, 2L, 3L, 4L, 1L, 3L, 3L, 2L, 4L), .Label = c("April",
>     "Feb", "Jan", "March"), class = "factor")), .Names = c("sn", "Species", "Observed.site", "Time", "Month"), class = "data.frame", row.names = c(NA, -9L))
> dat2


