[R] sampling dataframe based upon number of record occurrences

JS Huang js.huang at protective.com
Wed Mar 4 02:13:28 CET 2015


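Here is an implementation, a function named getSample. For each SiteID/Year
combination with more than 3 records, it draws 3 of those records at random,
without replacement; combinations with 3 or fewer records are skipped. I
modified the data slightly so that it can be read as a table.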
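
For anyone who wants to reproduce this, one way to get the data into R is
sketched below. The read.table(text = ...) call is only my assumption about how
to load it; the values are the ones printed in fitting.set, with the
row numbers left out.

```r
# Rebuild the example data from inline text (assumed loading method).
# stringsAsFactors = FALSE keeps SiteID as a character column.
fitting.set <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
IDbyYear SiteID Year
42.24 A-Airport 2006
42.24 A-Airport 2006
42.24 A-Airport 2006
42.24 A-Airport 2006
42.24 A-Airport 2006
42.24 A-Airport 2006
45.32 A-Bark.Corral.East 2008
45.32 A-Bark.Corral.East 2008
45.36 A-Bark.Corral.East 2009
45.40 A-Bark.Corral.East 2010
45.40 A-Bark.Corral.East 2010
")

# Count the records in each SiteID/Year combination.
table(fitting.set$SiteID, fitting.set$Year)
```

The site names contain no spaces, which is what lets the default whitespace
separator work here.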

> fitting.set
   IDbyYear             SiteID Year
1     42.24          A-Airport 2006
2     42.24          A-Airport 2006
3     42.24          A-Airport 2006
4     42.24          A-Airport 2006
5     42.24          A-Airport 2006
6     42.24          A-Airport 2006
7     45.32 A-Bark.Corral.East 2008
8     45.32 A-Bark.Corral.East 2008
9     45.36 A-Bark.Corral.East 2009
10    45.40 A-Bark.Corral.East 2010
11    45.40 A-Bark.Corral.East 2010
> getSample
function(x)
{
  # Sample 3 rows, without replacement, from every SiteID/Year group
  # that has more than 3 records; smaller groups are skipped.
  sites <- unique(as.character(x$SiteID))
  years <- unique(x$Year)
  result <- data.frame()
  for (i in seq_along(sites))
  {
    for (j in seq_along(years))
    {
      # row numbers of the records for this site/year combination
      rows <- which(as.character(x$SiteID)==sites[i] & x$Year==years[j])
      if (length(rows) > 3)
      {
        sampled <- sample(rows,3,replace=FALSE)
        result <- rbind(result,x[sampled,])
      }
    }
  }
  # column names carry over from x, so this also works when no group qualifies
  rownames(result) <- NULL
  return(result)
}
> getSample(fitting.set)
  IDbyYear    SiteID Year
1    42.24 A-Airport 2006
2    42.24 A-Airport 2006
3    42.24 A-Airport 2006
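
The nested loops can also be avoided. What follows is only a sketch of an
alternative, not a replacement for the function above: getSampleSplit and its
argument n are names I made up for illustration, and it assumes the same three
columns. It groups the row numbers with split() and samples within each group
that has more than n rows.

```r
# Hypothetical loop-free variant (getSampleSplit and n are illustrative names).
getSampleSplit <- function(x, n = 3) {
  # Row numbers grouped by SiteID/Year; drop = TRUE omits empty combinations.
  groups <- split(seq_len(nrow(x)), list(x$SiteID, x$Year), drop = TRUE)
  # Keep only the groups with more than n rows, as in the "> 3" test.
  big <- groups[vapply(groups, length, integer(1)) > n]
  # Sample positions within each group, then map back to row numbers.
  picked <- unlist(lapply(big, function(idx) idx[sample.int(length(idx), n)]),
                   use.names = FALSE)
  result <- x[picked, , drop = FALSE]
  rownames(result) <- NULL
  result
}

# Same records as fitting.set, built compactly.
d <- data.frame(
  IDbyYear = c(rep(42.24, 6), 45.32, 45.32, 45.36, 45.40, 45.40),
  SiteID   = c(rep("A-Airport", 6), rep("A-Bark.Corral.East", 5)),
  Year     = c(rep(2006, 6), 2008, 2008, 2009, 2010, 2010),
  stringsAsFactors = FALSE
)
getSampleSplit(d)
```

With this data only A-Airport in 2006 has more than 3 records, so the call
returns three A-Airport rows.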





