[R] help with for loop: new column giving count of observation for each SITEID
Bert Gunter
gunter.berton at gene.com
Tue Oct 30 23:39:03 CET 2012
Eek!
Just a bit simpler would be (à la Dr. Dunlap), where d is the data frame:
d <- within(d,index <- ave(year,site, FUN = order))
(This assumes exactly one data collection for each year that appears, though.)
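One further caveat, offered as an illustration rather than as part of the original suggestion: order() returns the permutation that sorts its input, so it matches the within-site position only when the rows are already sorted by year within each site (as they are in the posted data). A minimal sketch, assuming rows may arrive unsorted, compares it with rank(). The data frame d2 and the columns idx_order and idx_rank are made-up names for this example.

```r
# Hypothetical data: rows within each site are deliberately NOT in year order.
d2 <- data.frame(
  site = c("A", "A", "A", "B", "B"),
  year = c(2005L, 2002L, 2004L, 2006L, 2003L)
)

# order() gives the row positions that would sort each group,
# which is not the same as each row's position in the sorted group.
d2$idx_order <- ave(d2$year, d2$site, FUN = order)

# rank() gives each row's position among its site's sorted years.
# ties.method = "first" keeps the result an integer if a year repeats.
d2$idx_rank <- ave(d2$year, d2$site,
                   FUN = function(y) rank(y, ties.method = "first"))

print(d2)
```

For site A the years 2005, 2002, 2004 get idx_order 2, 3, 1 but idx_rank 3, 1, 2; only the latter says "2005 is the third year sampled". On data already sorted by year within site the two agree.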
Cheers,
Bert
On Tue, Oct 30, 2012 at 12:56 PM, arun <smartpink111 at yahoo.com> wrote:
> Hi,
>
> You can also use this:
> res<-do.call(rbind,lapply(split(d,d$site),function(x) data.frame(x,newindex=1:nrow(x))))
> rownames(res)<-1:nrow(res)
> res
> #  RchID site year index newindex
> #1     1    A 2002     1        1
> #2     2    A 2004     2        2
> #3     3    A 2005     3        3
> #4     4    B 2003     1        1
> #5     5    B 2006     2        2
> #6     6    B 2008     3        3
> #7     7    C 2002     1        1
> #8     8    C 2003     2        2
> #9     9    C 2004     3        3
> A.K.
>
>
>
> ----- Original Message -----
> From: William Dunlap <wdunlap at tibco.com>
> To: "Meredith, Christy S -FS" <csmeredith at fs.fed.us>
> Cc: "r-help at r-project.org" <r-help at r-project.org>
> Sent: Tuesday, October 30, 2012 3:43 PM
> Subject: Re: [R] help with for loop: new column giving count of observation for each SITEID
>
> Your data was, in R-readable format (from dput())
> d <- data.frame(
>     RchID = 1:9,
>     site = factor(c("A", "A", "A", "B", "B", "B", "C",
>         "C", "C"), levels = c("A", "B", "C")),
>     year = c(2002L, 2004L, 2005L, 2003L, 2006L, 2008L,
>         2002L, 2003L, 2004L),
>     index = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L))
> and I am assuming that 'index' is the desired result. You can use
> withinGroupIndex to make a new column identical to 'index'. There
> are a variety of ways to add that column to an existing data.frame,
> one of which is within():
> > within(d, newIndex <- withinGroupIndex(site))
>   RchID site year index newIndex
> 1     1    A 2002     1        1
> 2     2    A 2004     2        2
> 3     3    A 2005     3        3
> 4     4    B 2003     1        1
> 5     5    B 2006     2        2
> 6     6    B 2008     3        3
> 7     7    C 2002     1        1
> 8     8    C 2003     2        2
> 9     9    C 2004     3        3
> Or is 'index' not the desired result?
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: Meredith, Christy S -FS [mailto:csmeredith at fs.fed.us]
>> Sent: Tuesday, October 30, 2012 12:20 PM
>> To: William Dunlap
>> Subject: RE: [R] help with for loop: new column giving count of observation for each
>> SITEID
>>
>> Not quite,
>> I need it like this, a new number for each ordered year in the sequence within each site,
>> regardless of what the years are, and to retain the RchID column.
>>
>> RchID site year index
>>     1    A 2002     1
>>     2    A 2004     2
>>     3    A 2005     3
>>     4    B 2003     1
>>     5    B 2006     2
>>     6    B 2008     3
>>     7    C 2002     1
>>     8    C 2003     2
>>     9    C 2004     3
>>
>>
>> Thanks so much for your help!
>>
>>
>> -----Original Message-----
>> From: William Dunlap [mailto:wdunlap at tibco.com]
>> Sent: Tuesday, October 30, 2012 1:07 PM
>> To: Meredith, Christy S -FS; r-help at R-project.org
>> Subject: RE: [R] help with for loop: new column giving count of observation for each
>> SITEID
>>
>> Is this what you want?
>> > withinGroupIndex <- function(group, ...)
>> +     ave(integer(length(group)), group, ..., FUN=seq_along)
>> > site <- c("A","A","C","D","C","A","B")
>> > data.frame(site, index=withinGroupIndex(site))
>>   site index
>> 1    A     1
>> 2    A     2
>> 3    C     1
>> 4    D     1
>> 5    C     2
>> 6    A     3
>> 7    B     1
>>
>> You can add more arguments if the groups depend on more than one value:
>> > year <- rep(c(1985, 2012), c(4,3))
>> > data.frame(site, year, index=withinGroupIndex(site, year))
>>   site year index
>> 1    A 1985     1
>> 2    A 1985     2
>> 3    C 1985     1
>> 4    D 1985     1
>> 5    C 2012     1
>> 6    A 2012     1
>> 7    B 2012     1
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>>
>> > -----Original Message-----
>> > From: r-help-bounces at r-project.org
>> > [mailto:r-help-bounces at r-project.org] On Behalf Of Meredith, Christy S
>> > -FS
>> > Sent: Tuesday, October 30, 2012 11:17 AM
>> > To: r-help at R-project.org
>> > Subject: [R] help with for loop: new column giving count of
>> > observation for each SITEID
>> >
>> >
>> > Hello,
>> > I think this is easy, but I can't seem to find a good way to do this
>> > in the R help. I have a list of sites, with multiple years of data for
>> > each site id. I want to create a new column that gives a number
>> > describing whether it is the 1st year ("1" ) the data was collected
>> > for the site, the second year ("2"), etc. I have different years for
>> > each siteid, but I don't care which year it was collected, just the
>> > order that it is in for that siteid. This is what I have so far, but
>> > it doesn't do the analysis separately for each SiteID.
>> >
>> > indexi<-indexg[order(indexg$SiteID,indexg$Yr),]
>> >
>> > obs=0
>> > indexi=na.omit(indexi)
>> > for(i in 1:length(indexi$SiteID)){
>> > obs=obs+1
>> > indexi$obs[i]=obs
>> > }
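For reference, the loop above numbers rows 1 through n across the whole data frame because obs is never reset. A minimal sketch of a loop-based fix, assuming a data frame with the SiteID and Yr columns named in the post (the sample values below are invented for illustration), restarts the counter whenever the site changes:

```r
# Invented sample data using the column names from the post.
indexg <- data.frame(
  SiteID = c("B", "A", "A", "B", "A"),
  Yr     = c(2006L, 2004L, 2002L, 2003L, 2005L)
)

# Sort by site, then year, so each site's rows are contiguous and in year order.
indexi <- na.omit(indexg[order(indexg$SiteID, indexg$Yr), ])

indexi$obs <- NA_integer_
obs <- 0L
for (i in seq_len(nrow(indexi))) {
  # Restart the counter on the first row or whenever the site changes.
  if (i == 1L || indexi$SiteID[i] != indexi$SiteID[i - 1L]) {
    obs <- 0L
  }
  obs <- obs + 1L
  indexi$obs[i] <- obs
}

print(indexi)
```

The ave(..., FUN = seq_along) approach given in the replies produces the same numbering on sorted data without an explicit loop.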
>> >
>> >
>> > Thanks for any help you can give.
>> >
>> > Christy Meredith
>> > USDA Forest Service
>> > Rocky Mountain Research Station
>> > PIBO Monitoring
>> > Data Analyst
>> > Voice: 435-755-3573
>> > Fax: 435-755-3563
>> >
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>
--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm