[R] Loess over split data
Luc Villandre
villandl at dms.umontreal.ca
Thu Apr 23 21:56:17 CEST 2009
Dear R users,
I am having trouble devising an efficient way to run a loess() function
on all columns of a data.frame (with the x factor remaining the same for
all columns) and I was hoping that someone here could help me fix my
code so that I won't have to resort to using a for loop.
(You'll find the upcoming lines of code in a single block near the end
of the message.)
Here's a small sample of my data :
my.data=data.frame(birth.vec=c(274, 259, 288, 284, 276, 274, 284, 288,
273, 278), final.weight= c(2609.328, 2955.701, 4159.837, 4591.258,
5144.559, 2877.159, 4384.498, 3348.454, 3323.391, 2918.055)) ;
"birth.vec" is in days and "final.weight" is in grams. Now, what I'm
trying to do is to output smoothed curves for the 10th, 50th and 90th
percentiles over completed weeks. In other words, I transform
"birth.vec" into a factor vector of completed weeks with
completed.weeks = as.factor(floor(my.data$birth.vec/7)) ;
I use these factors to get weight quantiles by completed weeks with
quan.list = tapply(my.data$final.weight,INDEX =
completed.weeks,FUN=quantile,probs=c(0.1,0.5,0.9)) ;
I then collapse the list into a data.frame with
quan.frame = data.frame(do.call(rbind,quan.list)) ;
Now comes the time to apply the loess() function to each percentile
curve (i.e. to each column in quan.frame). Note that the x values for
all percentile columns (i.e. X10., X50., X90.)
x.values = as.numeric(rownames(quan.frame)) ;
Once again, using tapply() (and transposing beforehand):
t.quan.frame = t(quan.frame) ;
## The following command doesn't work
smooth.result =
tapply(t.quan.frame,INDEX=as.factor(rep(1:nrow(t.quan.frame),each=ncol(t.quan.frame))),FUN=loess)
;
The "formula" argument of loess() is of course missing, since I have no
idea how I could specify it in a way that the function could understand it.
Now, for your convenience, here's the code patched together:
########
my.data=data.frame(birth.vec=c(274, 259, 288, 284, 276, 274, 284, 288,
273, 278), final.weight= c(2609.328, 2955.701, 4159.837, 4591.258,
5144.559, 2877.159, 4384.498, 3348.454, 3323.391, 2918.055)) ;
completed.weeks = as.factor(floor(my.data$birth.vec/7)) ;
quan.list = tapply(my.data$final.weight,INDEX =
completed.weeks,FUN=quantile,probs=c(0.1,0.5,0.9)) ;
quan.frame = data.frame(do.call(rbind,quan.list)) ;
x.values = as.numeric(rownames(quan.frame)) ;
t.quan.frame = t(quan.frame) ;
## The following command doesn't work
smooth.result =
tapply(t.quan.frame,INDEX=as.factor(rep(1:nrow(t.quan.frame),each=ncol(t.quan.frame))),FUN=loess)
;
#######
So, is there a way to modify this code so that I'll get smoothed
percentile curves? Any suggestion as to how it could be improved is
also welcome.
Although the forum archive is fairly comprehensive, phrasing search
queries in a way that will yield the desired result is sometimes hard,
hence this request for help that to some of you will most likely seem
redundant.
I thank you sincerely for your time,
Luc
More information about the R-help
mailing list