[R] help with for loop: new column giving count of observation for each SITEID

Thu Nov 1 21:20:38 CET 2012

Thanks to you all for your help with this code. Basically, the purpose of this was to create a column detailing whether it is the 1st, 2nd, 3rd, etc. time that the site was sampled. I need this for creating a graph which shows this new variable on the x axis versus a variable of interest. The variable of interest can change but refers to aspects of fish habitat.

res$newindex is this new variable (perhaps not the best name, but these are indices of habitat quality and time). Ultimately, I want to create a ggplot graph with the new variable on the x axis and the variable of interest on the y axis (code also shown below). I have been able to make the graph, but not all of my series show up on it (only the first 5), and I am not sure why. So now I have another question: do any of you know why the remaining series are missing and how to fix this?

Thanks
Christy 

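# Code for new column: nth time each site was sampled

library(ggplot2)

indexg <- read.csv("indexg.csv")  # read.csv() already returns a data frame

# Sort by site, then by year within site
indexi <- indexg[order(indexg$SiteID, indexg$Yr), ]

# Number the rows 1..n within each site
res <- do.call(rbind, lapply(split(indexi, indexi$SiteID), function(x) data.frame(x, newindex = seq_len(nrow(x)))))
rownames(res) <- 1:nrow(res)
res

# Plot Bf against the sampling index, one line and one point shape per site
res$SiteID <- as.factor(res$SiteID)
q <- ggplot(data = res, aes(x = newindex, y = Bf, group = SiteID, shape = SiteID)) + geom_line() + geom_point(size = 5, colour = "black")
q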

Data:
   SiteID   Yr   Bf newindex
1    1015 2001 3.77        1
2    1015 2006 4.94        2
3    1015 2011 5.20        3
4    1035 2003 5.84        1
5    1035 2008 6.18        2
6    1039 2003 4.41        1
7    1039 2008 5.24        2
8    1047 2001   NA        1
9    1047 2003 4.76        2
10   1047 2004 4.10        3
11   1047 2006 5.85        4
12   1047 2008 4.87        5
13   1047 2009 5.59        6
14   1047 2010 4.69        7
15   1047 2011 4.94        8
16   1088 2003 3.75        1
17   1088 2008 5.86        2
18   1104 2004 7.32        1
19   1104 2009 7.21        2
20   1106 2001 6.92        1
21   1106 2002 4.25        2
22   1106 2003 4.75        3
23   1106 2004 6.67        4
24   1106 2005 4.50        5
25   1106 2006 6.62        6
26   1106 2008 6.32        7
27   1106 2009 6.30        8
28   1106 2010 6.65        9
29   1106 2011 5.51       10
30   1110 2004 6.87        1
31   1110 2009 5.53        2
32   2702 2009 1.80        1
33   2944 2010 4.36        1
34   2946 2010 2.25        1
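
A likely explanation for the missing series, offered as a sketch rather than a confirmed diagnosis: ggplot2's default shape scale provides only 6 shapes for a discrete variable, and the data above contain 11 sites, so points for the remaining sites are dropped (ggplot2 normally prints a warning about the shape palette when this happens). Sites with a single observation also get no line. One way around this is to supply one shape per site with scale_shape_manual(). In the code below the data frame is retyped from the table above; site_n, site_shapes and p are names made up for this illustration, and the choice of plotting symbols 0 to 10 is arbitrary.

```r
# Data retyped from the table above (34 rows, 11 sites)
site_n <- c(3, 2, 2, 8, 2, 2, 10, 2, 1, 1, 1)   # observations per site
res <- data.frame(
  SiteID = factor(rep(c(1015, 1035, 1039, 1047, 1088, 1104, 1106, 1110,
                        2702, 2944, 2946), site_n)),
  Bf = c(3.77, 4.94, 5.20, 5.84, 6.18, 4.41, 5.24,
         NA, 4.76, 4.10, 5.85, 4.87, 5.59, 4.69, 4.94,
         3.75, 5.86, 7.32, 7.21,
         6.92, 4.25, 4.75, 6.67, 4.50, 6.62, 6.32, 6.30, 6.65, 5.51,
         6.87, 5.53, 1.80, 4.36, 2.25)
)

# Rows are already sorted by year within site, so counting rows per site
# reproduces the newindex column
res$newindex <- ave(seq_along(res$SiteID), res$SiteID, FUN = seq_along)

# One plotting symbol per site: symbols 0..10, named by site level
n_sites <- nlevels(res$SiteID)
site_shapes <- setNames(seq_len(n_sites) - 1L, levels(res$SiteID))

# Only build the plot if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(res, aes(x = newindex, y = Bf, group = SiteID, shape = SiteID)) +
    geom_line() +
    geom_point(size = 5, colour = "black") +
    scale_shape_manual(values = site_shapes)
  # print(p) draws the plot
}
```

With this many sites, mapping SiteID to colour or using facet_wrap(~ SiteID) may be easier to read than 11 different shapes. The row with Bf = NA (site 1047, 2001) is still dropped from the plot with a warning.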

-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com] 
Sent: Tuesday, October 30, 2012 1:57 PM
To: Meredith, Christy S -FS
Cc: R help; William Dunlap
Subject: Re: [R] help with for loop: new column giving count of observation for each SITEID

Hi,

You can also use this:
res<-do.call(rbind,lapply(split(d,d$site),function(x) data.frame(x,newindex=1:nrow(x))))
rownames(res)<-1:nrow(res)
res
#  RchID site year index newindex
#1     1    A 2002     1        1
#2     2    A 2004     2        2
#3     3    A 2005     3        3
#4     4    B 2003     1        1
#5     5    B 2006     2        2
#6     6    B 2008     3        3
#7     7    C 2002     1        1
#8     8    C 2003     2        2
#9     9    C 2004     3        3
A.K.

----- Original Message -----
From: William Dunlap <wdunlap at tibco.com>
To: "Meredith, Christy S -FS" <csmeredith at fs.fed.us>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Tuesday, October 30, 2012 3:43 PM
Subject: Re: [R] help with for loop: new column giving count of observation for each SITEID

Your data were, in R-readable format (from dput()):
  d <- data.frame(
       RchID = 1:9,
       site = factor(c("A", "A", "A", "B", "B", "B", "C",
          "C", "C"), levels = c("A", "B", "C")),
       year = c(2002L, 2004L, 2005L, 2003L, 2006L, 2008L,
          2002L, 2003L, 2004L),
       index = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L))
and I am assuming that 'index' is the desired result.  You can use withinGroupIndex (defined in my earlier message, quoted below) to make a new column identical to 'index'.  There are a variety of ways to add that column to an existing data.frame, one of which is within():
  > within(d, newIndex <- withinGroupIndex(site))
    RchID site year index newIndex
  1     1    A 2002     1        1
  2     2    A 2004     2        2
  3     3    A 2005     3        3
  4     4    B 2003     1        1
  5     5    B 2006     2        2
  6     6    B 2008     3        3
  7     7    C 2002     1        1
  8     8    C 2003     2        2
  9     9    C 2004     3        3
Or is 'index' not the desired result?
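
One caveat: ave() with seq_along numbers rows in the order they appear, so the result matches "nth year sampled" only if the rows are already sorted by year within site. The following is a minimal self-contained sketch of two ways to handle unsorted rows: sort first and then count, or rank the years within each site in place. The names shuffled and sorted, and the particular row permutation, are invented for this illustration.

```r
withinGroupIndex <- function(group, ...) {
  ave(integer(length(group)), group, ..., FUN = seq_along)
}

# The example data from this thread
d <- data.frame(
  RchID = 1:9,
  site  = factor(rep(c("A", "B", "C"), each = 3)),
  year  = c(2002L, 2004L, 2005L, 2003L, 2006L, 2008L, 2002L, 2003L, 2004L),
  index = rep(1:3, 3)
)

# Same rows in an arbitrary order, to mimic unsorted input
shuffled <- d[c(6, 2, 8, 1, 4, 7, 3, 9, 5), ]

# Option 1: sort by site and year, then count rows within each site
sorted <- shuffled[order(shuffled$site, shuffled$year), ]
sorted$newIndex <- withinGroupIndex(sorted$site)

# Option 2: leave the row order alone and rank the years within each site
shuffled$newIndex <- ave(shuffled$year, shuffled$site,
                         FUN = function(y) rank(y, ties.method = "first"))
```

In both cases newIndex agrees with the 'index' column, and RchID is retained. With ties.method = "first", repeated years within a site would be numbered in row order rather than sharing a rank.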

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: Meredith, Christy S -FS [mailto:csmeredith at fs.fed.us]
> Sent: Tuesday, October 30, 2012 12:20 PM
> To: William Dunlap
> Subject: RE: [R] help with for loop: new column giving count of 
> observation for each SITEID
> 
> Not quite,
> I need it like this: a new number for each ordered year in the
> sequence within each site, regardless of what the years are, and to
> retain the RchID column.
> 
> RchID    site    year    index
> 1    A    2002    1
> 2    A    2004    2
> 3    A    2005    3
> 4    B    2003    1
> 5    B    2006    2
> 6    B    2008    3
> 7    C    2002    1
> 8    C    2003    2
> 9    C    2004    3
> 
> 
> Thanks so much for your help!
> 
> 
> -----Original Message-----
> From: William Dunlap [mailto:wdunlap at tibco.com]
> Sent: Tuesday, October 30, 2012 1:07 PM
> To: Meredith, Christy S -FS; r-help at R-project.org
> Subject: RE: [R] help with for loop: new column giving count of 
> observation for each SITEID
> 
> Is this what you want?
>   > withinGroupIndex <- function(group, ...)
>   +     ave(integer(length(group)), group, ..., FUN=seq_along)
>   > site <- c("A","A","C","D","C","A","B")
>   > data.frame(site, index=withinGroupIndex(site))
>     site index
>   1    A     1
>   2    A     2
>   3    C     1
>   4    D     1
>   5    C     2
>   6    A     3
>   7    B     1
> 
> You can add more arguments if the groups depend on more than one value:
>   > year <- rep(c(1985, 2012), c(4,3))
>   > data.frame(site, year, index=withinGroupIndex(site, year))
>     site year index
>   1    A 1985     1
>   2    A 1985     2
>   3    C 1985     1
>   4    D 1985     1
>   5    C 2012     1
>   6    A 2012     1
>   7    B 2012     1
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org
> > [mailto:r-help-bounces at r-project.org] On Behalf Of Meredith, Christy 
> > S -FS
> > Sent: Tuesday, October 30, 2012 11:17 AM
> > To: r-help at R-project.org
> > Subject: [R] help with for loop: new column giving count of 
> > observation for each SITEID
> >
> >
> > Hello,
> > I think this is easy, but I can't seem to find a good way to do this 
> > in the R help. I have a list of sites, with multiple years of data 
> > for each site id. I want to create a new column that gives a number 
> > describing whether it is the 1st year ("1" ) the data was collected 
> > for the site, the second year ("2"), etc. I have different years for 
> > each siteid, but I don't care which year it was collected, just the 
> > order that it is in for that siteid.  This is what I have so far, 
> > but it doesn't do the analysis separately for each SiteID.
> >
> > indexi<-indexg[order(indexg$SiteID,indexg$Yr),]
> >
> > obs=0
> > indexi=na.omit(indexi)
> > for(i in 1:length(indexi$SiteID)){
> > obs=obs+1
> > indexi$obs[i]=obs
> > }
> >
> >
> > Thanks for any help you can give.
> >
> > Christy Meredith
> > USDA Forest Service
> > Rocky Mountain Research Station
> > PIBO Monitoring
> > Data Analyst
> > Voice: 435-755-3573
> > Fax: 435-755-3563
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
