[R] help with for loop: new column giving count of observation for each SITEID
Meredith, Christy S -FS
csmeredith at fs.fed.us
Thu Nov 1 21:20:38 CET 2012
Thanks to you all for your help with this code. Basically, the purpose was to create a column recording whether it is the 1st, 2nd, 3rd, etc. time that the site was sampled. I need this for a graph that shows this new variable on the x axis versus a variable of interest. The variable of interest can change, but it always refers to some aspect of fish habitat.
The newindex column (res$newindex in the code below) holds this new variable (perhaps not the best names, but they are indices of habitat quality and time). Ultimately, I want to create a ggplot graph with the new value on the x axis and the variable of interest on the y axis (code also shown below). I have been able to make the graph, but not all of my series are showing on it (only the first 5). So now I have another question: do any of you know why all the series are not showing on my graph, and how to fix this?
Thanks
Christy
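# Code for new column: nth time each site was sampled
library(ggplot2)
indexg <- read.csv("indexg.csv")  # read.csv() already returns a data.frame
# Sort by site, then year, so the within-site count follows sampling order
indexi <- indexg[order(indexg$SiteID, indexg$Yr), ]
# Split by site, number the rows of each piece 1..n, then recombine
res <- do.call(rbind, lapply(split(indexi, indexi$SiteID),
    function(x) data.frame(x, newindex = seq_len(nrow(x)))))
rownames(res) <- seq_len(nrow(res))
res
res$SiteID <- as.factor(res$SiteID)
# One line and one point shape per site
q <- ggplot(data = res, aes(x = newindex, y = Bf, group = SiteID, shape = SiteID)) +
    geom_line() + geom_point(size = 5, colour = "black")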
Data:
SiteID Yr Bf newindex
1 1015 2001 3.77 1
2 1015 2006 4.94 2
3 1015 2011 5.20 3
4 1035 2003 5.84 1
5 1035 2008 6.18 2
6 1039 2003 4.41 1
7 1039 2008 5.24 2
8 1047 2001 NA 1
9 1047 2003 4.76 2
10 1047 2004 4.10 3
11 1047 2006 5.85 4
12 1047 2008 4.87 5
13 1047 2009 5.59 6
14 1047 2010 4.69 7
15 1047 2011 4.94 8
16 1088 2003 3.75 1
17 1088 2008 5.86 2
18 1104 2004 7.32 1
19 1104 2009 7.21 2
20 1106 2001 6.92 1
21 1106 2002 4.25 2
22 1106 2003 4.75 3
23 1106 2004 6.67 4
24 1106 2005 4.50 5
25 1106 2006 6.62 6
26 1106 2008 6.32 7
27 1106 2009 6.30 8
28 1106 2010 6.65 9
29 1106 2011 5.51 10
30 1110 2004 6.87 1
31 1110 2009 5.53 2
32 2702 2009 1.80 1
33 2944 2010 4.36 1
34 2946 2010 2.25 1
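
A possible explanation for the missing series, offered as an assumption since nothing in this thread confirms it: ggplot2's default shape scale supplies only six distinct shapes, while the data above has 11 SiteID levels, so points for the remaining sites are not drawn. The sketch below uses a subset of the rows above and assigns shapes explicitly with scale_shape_manual(). The shape codes 1 to 11 are an arbitrary choice for illustration.

```r
library(ggplot2)

# Subset of the posted rows: at most two visits per site, all 11 sites kept
res <- read.table(header = TRUE, text = "
SiteID Yr Bf newindex
1015 2001 3.77 1
1015 2006 4.94 2
1035 2003 5.84 1
1035 2008 6.18 2
1039 2003 4.41 1
1039 2008 5.24 2
1047 2003 4.76 2
1047 2004 4.10 3
1088 2003 3.75 1
1088 2008 5.86 2
1104 2004 7.32 1
1104 2009 7.21 2
1106 2001 6.92 1
1106 2002 4.25 2
1110 2004 6.87 1
1110 2009 5.53 2
2702 2009 1.80 1
2944 2010 4.36 1
2946 2010 2.25 1
")
res$SiteID <- factor(res$SiteID)

# Number of sites, i.e. number of shapes the plot needs
n_sites <- nlevels(res$SiteID)

# Supply one shape code per site instead of relying on the default palette
p <- ggplot(res, aes(x = newindex, y = Bf, group = SiteID, shape = SiteID)) +
    geom_line() +
    geom_point(size = 5, colour = "black") +
    scale_shape_manual(values = seq_len(n_sites))
# print(p) draws the plot
```

With this many sites, mapping colour to SiteID or using facet_wrap(~ SiteID) may be easier to read than eleven point shapes.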
-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com]
Sent: Tuesday, October 30, 2012 1:57 PM
To: Meredith, Christy S -FS
Cc: R help; William Dunlap
Subject: Re: [R] help with for loop: new column giving count of observation for each SITEID
Hi,
You can also use this:
res<-do.call(rbind,lapply(split(d,d$site),function(x) data.frame(x,newindex=1:nrow(x))))
rownames(res)<-1:nrow(res)
res
# RchID site year index newindex
#1 1 A 2002 1 1
#2 2 A 2004 2 2
#3 3 A 2005 3 3
#4 4 B 2003 1 1
#5 5 B 2006 2 2
#6 6 B 2008 3 3
#7 7 C 2002 1 1
#8 8 C 2003 2 2
#9 9 C 2004 3 3
A.K.
----- Original Message -----
From: William Dunlap <wdunlap at tibco.com>
To: "Meredith, Christy S -FS" <csmeredith at fs.fed.us>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Tuesday, October 30, 2012 3:43 PM
Subject: Re: [R] help with for loop: new column giving count of observation for each SITEID
Your data was, in R-readable format (from dput()):
  d <- data.frame(
      RchID = 1:9,
      site = factor(c("A", "A", "A", "B", "B", "B", "C",
          "C", "C"), levels = c("A", "B", "C")),
      year = c(2002L, 2004L, 2005L, 2003L, 2006L, 2008L,
          2002L, 2003L, 2004L),
      index = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L))
and I am assuming that 'index' is the desired result. You can use withinGroupIndex to make a new column identical to 'index'. There are a variety of ways to add that column to an existing data.frame, one of which is within():
> within(d, newIndex <- withinGroupIndex(site))
RchID site year index newIndex
1 1 A 2002 1 1
2 2 A 2004 2 2
3 3 A 2005 3 3
4 4 B 2003 1 1
5 5 B 2006 2 2
6 6 B 2008 3 3
7 7 C 2002 1 1
8 8 C 2003 2 2
9 9 C 2004 3 3
Or is 'index' not the desired result?
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
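
A note on row order: withinGroupIndex() numbers rows in the order they appear, so it matches sampling order only if the rows are already sorted by year within each site. As a minimal sketch, assuming each site has no repeated years, the years can instead be ranked within each site, which gives the same numbering without sorting first. The small unsorted data frame here is made up for illustration.

```r
# Hypothetical unsorted data: rows are not in year order within site
d <- data.frame(
    RchID = 1:6,
    site  = c("A", "B", "A", "B", "A", "B"),
    year  = c(2005L, 2008L, 2002L, 2003L, 2004L, 2006L)
)

# Rank the years within each site; ties.method = "first" keeps integer ranks
d$index <- ave(d$year, d$site,
    FUN = function(y) rank(y, ties.method = "first"))

d
```

Each row keeps its RchID and original position; only the new index column is added.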
> -----Original Message-----
> From: Meredith, Christy S -FS [mailto:csmeredith at fs.fed.us]
> Sent: Tuesday, October 30, 2012 12:20 PM
> To: William Dunlap
> Subject: RE: [R] help with for loop: new column giving count of
> observation for each SITEID
>
> Not quite,
> I need it like this, a new number for each ordered year in the sequence
> within each site, regardless of what the years are, and to retain the RchID column.
>
> RchID site year index
> 1 A 2002 1
> 2 A 2004 2
> 3 A 2005 3
> 4 B 2003 1
> 5 B 2006 2
> 6 B 2008 3
> 7 C 2002 1
> 8 C 2003 2
> 9 C 2004 3
>
>
> Thanks so much for your help!
>
>
> -----Original Message-----
> From: William Dunlap [mailto:wdunlap at tibco.com]
> Sent: Tuesday, October 30, 2012 1:07 PM
> To: Meredith, Christy S -FS; r-help at R-project.org
> Subject: RE: [R] help with for loop: new column giving count of
> observation for each SITEID
>
> Is this what you want?
> > withinGroupIndex <- function(group, ...)
> +     ave(integer(length(group)), group, ..., FUN=seq_along)
> > site <- c("A","A","C","D","C","A","B")
> > data.frame(site, index=withinGroupIndex(site))
> site index
> 1 A 1
> 2 A 2
> 3 C 1
> 4 D 1
> 5 C 2
> 6 A 3
> 7 B 1
>
> You can add more arguments if the groups depend on more than one value:
> > year <- rep(c(1985, 2012), c(4,3))
> > data.frame(site, year, index=withinGroupIndex(site, year))
> site year index
> 1 A 1985 1
> 2 A 1985 2
> 3 C 1985 1
> 4 D 1985 1
> 5 C 2012 1
> 6 A 2012 1
> 7 B 2012 1
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
> > -----Original Message-----
> > From: r-help-bounces at r-project.org
> > [mailto:r-help-bounces at r-project.org] On Behalf Of Meredith, Christy
> > S -FS
> > Sent: Tuesday, October 30, 2012 11:17 AM
> > To: r-help at R-project.org
> > Subject: [R] help with for loop: new column giving count of
> > observation for each SITEID
> >
> >
> > Hello,
> > I think this is easy, but I can't seem to find a good way to do this
> > in the R help. I have a list of sites, with multiple years of data
> > for each site id. I want to create a new column that gives a number
> > describing whether it is the 1st year ("1" ) the data was collected
> > for the site, the second year ("2"), etc. I have different years for
> > each siteid, but I don't care which year it was collected, just the
> > order that it is in for that siteid. This is what I have so far, but it
> > doesn't do the analysis separately for each SiteID.
> >
> > indexi<-indexg[order(indexg$SiteID,indexg$Yr),]
> >
> > obs=0
> > indexi=na.omit(indexi)
> > for(i in 1:length(indexi$SiteID)){
> > obs=obs+1
> > indexi$obs[i]=obs
> > }
> >
> >
> > Thanks for any help you can give.
> >
> > Christy Meredith
> > USDA Forest Service
> > Rocky Mountain Research Station
> > PIBO Monitoring
> > Data Analyst
> > Voice: 435-755-3573
> > Fax: 435-755-3563
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
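
For reference, the loop in the original question counts straight through all rows because the counter is never reset. A minimal sketch of one way to repair it, assuming the data are already sorted by SiteID and Yr, is to reset the counter whenever the SiteID changes. The small data frame below is made up from a few of the posted site IDs and years.

```r
# Illustrative rows only; column names follow the original question
indexi <- data.frame(
    SiteID = c(1015, 1015, 1015, 1035, 1035, 2702),
    Yr     = c(2001, 2006, 2011, 2003, 2008, 2009)
)
indexi <- indexi[order(indexi$SiteID, indexi$Yr), ]

indexi$obs <- integer(nrow(indexi))
obs <- 0L
for (i in seq_len(nrow(indexi))) {
    # Start counting again at the first row of each new site
    if (i > 1 && indexi$SiteID[i] != indexi$SiteID[i - 1]) obs <- 0L
    obs <- obs + 1L
    indexi$obs[i] <- obs
}

indexi
```

The ave() and split() approaches shown earlier in the thread give the same result without an explicit loop.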