[R] Formatting data for bootstrapping for confidence intervals
arun
smartpink111 at yahoo.com
Fri Oct 12 00:31:24 CEST 2012
Hi,
Try this:
dat1<-read.table(text="
Area NAME DATE X Xn Y
1 X 1/10/10 1 1 0
1 Y 1/11/10 0 0 1
1 X 1/12/10 1 0 0
1 X 1/12/10 1 0 0
1 X 1/12/10 1 0 0
2 X 2/12/10 1 1 0
2 X 2/12/10 1 0 0
2 Y 2/12/10 0 0 1
2 X 2/13/10 1 0 0
2 X 2/13/10 1 0 0
2 X 2/13/10 1 0 0
2 X 2/14/10 1 0 0
2 X 2/14/10 1 0 0
2 X 2/14/10 1 1 0
2 X 2/14/10 1 0 0
3 X 7/27/11 1 0 0
3 X 7/27/11 1 1 0
3 X 7/27/11 1 0 0
3 X 7/28/11 1 0 0
3 X 7/28/11 1 1 0
3 X 7/28/11 1 0 0
3 X 7/28/11 1 0 0
3 Y 7/28/11 0 0 1
3 X 7/28/11 1 0 0
3 X 7/28/11 1 1 0
3 Y 7/28/11 0 0 1
3 X 7/28/11 1 0 0
3 X 7/29/11 1 0 0
3 X 7/29/11 1 0 0
3 X 7/29/11 1 1 0
",sep="",header=TRUE,stringsAsFactors=FALSE)
# You can use aggregate() from base R, ddply() from the plyr package, or the data.table package
library(data.table)
# Convert the data frame to a data.table
dat2<-data.table(dat1)
# Sum the event columns within each Area/DATE group
dat2[,list(X=sum(X),Xn=sum(Xn),Y=sum(Y)),by=list(Area,DATE)]
# Area DATE X Xn Y
#1: 1 1/10/10 1 1 0
#2: 1 1/11/10 0 0 1
#3: 1 1/12/10 3 0 0
#4: 2 2/12/10 2 1 1
#5: 2 2/13/10 3 0 0
#6: 2 2/14/10 4 1 0
#7: 3 7/27/11 3 1 0
#8: 3 7/28/11 7 2 2
#9: 3 7/29/11 3 1 0
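The comment above also mentions aggregate(), which is not shown in this post. A minimal base R sketch, assuming the same column names and using only the first rows of the example data so that it runs on its own, could look like this:

```r
# Subset of the example data (Area 1 and the first day of Area 2)
dat1 <- read.table(text = "
Area NAME DATE X Xn Y
1 X 1/10/10 1 1 0
1 Y 1/11/10 0 0 1
1 X 1/12/10 1 0 0
1 X 1/12/10 1 0 0
1 X 1/12/10 1 0 0
2 X 2/12/10 1 1 0
2 X 2/12/10 1 0 0
2 Y 2/12/10 0 0 1
", header = TRUE, stringsAsFactors = FALSE)

# Sum X, Xn and Y within each Area/DATE combination
res <- aggregate(cbind(X, Xn, Y) ~ Area + DATE, data = dat1, FUN = sum)

# Reorder by Area, then DATE, for readability.
# DATE is a character column here, so this is a string sort, not a calendar sort.
res <- res[order(res$Area, res$DATE), ]
res
```

If the dates need to sort chronologically, converting DATE with as.Date(DATE, format = "%m/%d/%y") before aggregating is one option.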
library(plyr)
# Same summary with plyr: apply sum() column-wise to X, Xn and Y within each Area/DATE group
ddply(dat1,.(Area,DATE),colwise(sum,c("X","Xn","Y")))
# Area DATE X Xn Y
#1 1 1/10/10 1 1 0
#2 1 1/11/10 0 0 1
#3 1 1/12/10 3 0 0
#4 2 2/12/10 2 1 1
#5 2 2/13/10 3 0 0
#6 2 2/14/10 4 1 0
#7 3 7/27/11 3 1 0
#8 3 7/28/11 7 2 2
#9 3 7/29/11 3 1 0
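Once the data are summarized per day, the bootstrap over days within each Area might be sketched as follows. This is only an illustration under assumptions: the original question does not say which statistic is of interest, so the mean daily count of X is used as a stand-in, and the names daily, boot_ci, n_boot and ci_by_area are made up for this example. It uses a plain percentile interval in base R.

```r
# Daily summary, copied from the output shown above
daily <- read.table(text = "
Area DATE X Xn Y
1 1/10/10 1 1 0
1 1/11/10 0 0 1
1 1/12/10 3 0 0
2 2/12/10 2 1 1
2 2/13/10 3 0 0
2 2/14/10 4 1 0
3 7/27/11 3 1 0
3 7/28/11 7 2 2
3 7/29/11 3 1 0
", header = TRUE, stringsAsFactors = FALSE)

set.seed(1)     # make the resampling reproducible
n_boot <- 2000  # number of bootstrap resamples (arbitrary choice)

# Percentile bootstrap interval for the mean daily count of X in one Area.
# d holds one row per day; days (rows) are resampled with replacement.
boot_ci <- function(d, n_boot, level = 0.95) {
  stats <- replicate(n_boot, {
    idx <- sample(nrow(d), replace = TRUE)
    mean(d$X[idx])
  })
  alpha <- (1 - level) / 2
  c(estimate = mean(d$X), quantile(stats, c(alpha, 1 - alpha)))
}

# Apply the bootstrap separately to each Area; one row per Area in the result
ci_by_area <- t(sapply(split(daily, daily$Area), boot_ci, n_boot = n_boot))
ci_by_area
```

With only three days per Area, as in this toy example, the intervals will be very rough; the sketch is meant to show the resampling structure, not to give usable estimates.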
A.K.
----- Original Message -----
From: Paul Wennekes <paul.wennekes at evobio.eu>
To: r-help at r-project.org
Cc:
Sent: Thursday, October 11, 2012 11:55 AM
Subject: [R] Formatting data for bootstrapping for confidence intervals
Hi all,
New to R, so this may be obvious to some.
I've been trying to figure this out for a while. I have a dataset "events"
that looks something like this:
Area NAME DATE X Xn Y
1 X 1/10/10 1 1 0
1 Y 1/11/10 0 0 1
1 X 1/12/10 1 0 0
1 X 1/12/10 1 0 0
1 X 1/12/10 1 0 0
2 X 2/12/10 1 1 0
2 X 2/12/10 1 0 0
2 Y 2/12/10 0 0 1
2 X 2/13/10 1 0 0
2 X 2/13/10 1 0 0
2 X 2/13/10 1 0 0
2 X 2/14/10 1 0 0
2 X 2/14/10 1 0 0
2 X 2/14/10 1 1 0
2 X 2/14/10 1 0 0
3 X 7/27/11 1 0 0
3 X 7/27/11 1 1 0
3 X 7/27/11 1 0 0
3 X 7/28/11 1 0 0
3 X 7/28/11 1 1 0
3 X 7/28/11 1 0 0
3 X 7/28/11 1 0 0
3 Y 7/28/11 0 0 1
3 X 7/28/11 1 0 0
3 X 7/28/11 1 1 0
3 Y 7/28/11 0 0 1
3 X 7/28/11 1 0 0
3 X 7/29/11 1 0 0
3 X 7/29/11 1 0 0
3 X 7/29/11 1 1 0
X and Y are events. Every row represents a single event, with a 1
indicating which event happened at that time. Xn indicates X happening at
night. I want to bootstrap these events over days, but I think I need to
summarize them first, i.e., get something that looks like this:
Area DATE X Xn Y
1 1/10/10 1 1 0
1 1/11/10 0 0 1
1 1/12/10 3 0 0
2 2/12/10 2 1 1
etc.
and then for each Area, bootstrap the data over the days. Any ideas? I've
tried using the 'reshape' package but I don't know how to sum over parts of
the columns as defined by the DATE values...
Many thanks ahead!