[R] Using the 'by' function within a 'for' loop

Judith Flores juryef at yahoo.com
Mon Apr 21 19:40:42 CEST 2008


Dear R experts,

    I am trying to optimize my script, because right
now it requires a lot of memory. The goal is to
generate four plots on one page. Each plot shows
the means and sem's calculated for a given variable
on different days. To obtain the means and sem's
I apply the 'by' function. So far I have done it
like this:

Read the data
Generate a summary of the mean and sem of a variable
at every Day.
Plot the mean and sem of that variable.

Repeat the same process for the other 3 variables.

  I tried to optimize the code by using a for loop;
the code is below.

  

#Reading the data
dato<-read.csv('mydata.csv')
names(dato)<-c("id","day","tx","var1","var2","var3","var4")
dato<-dato[,1:7]

#Specify variables to be plotted
variable<-c('var1','var2','var3','var4')

#Define parameters of the plotting window: margins,
#number of plots in the panel
windows(height=9, width=9, rescale='fixed')
par(mfrow=c(2,2),xpd=T, bty='l',
omi=c(0.8,0.25,1.2,0.15), mai=c(1.1,0.8,0.3,0.3))


for (k in variable) {
    
    dat<-dato[!is.na(k),]



    summ<-by(dat,dat[,c("tx","day")], function(x) {
        mn<-mean(x$k)
        std<-sd(x$k)
        n<-length(x$k)
        se<-std/sqrt(n)
        lowb<-mn-se
        upb<-mn+se
       
        data.frame(tx=x$tx[1],day=x$day[1],mn=mn,std=std,lowb=lowb,upb=upb,se=se)
        })
    summ<-do.call("rbind",summ)
  
    


    #Defining x and y axis ranges
    xmax<-unique(max(summ$day,na.rm=TRUE))
    xmin<-unique(min(summ$day,na.rm=TRUE))
    
    yaxmin<-unique(min(summ$lowb))
    yaxmax<-unique(max(summ$upb))


    plot(1,1,type='n',xlab='Day',xlim=c(xmin,xmax),ylim=c(yaxmin,yaxmax),
         ylab=k,las=1,cex.lab=1,
         xaxp=c(xmin,xmax,diff(range(c(xmin,xmax)))))
    points(summ$day,summ$mn)

}        
        



    Where variable is a vector that specifies all the
variables I want to plot.

But I am getting the following error:

“Error in var(as.vector(x), na.rm = na.rm) : 'x' is
empty
In addition: Warning message:
In mean.default(x$k) : argument is not numeric or
logical: returning NA”
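
As far as I can tell, the problem can be narrowed down without the loop or the plots. The sketch below uses a small made-up data frame (called toy here, not my real data) and suggests that x$k looks for a column literally named "k" instead of the column whose name is stored in k, whereas x[[k]] uses the stored name. It also suggests that is.na(k) tests the string itself rather than the column:

```r
# Toy data frame, made up for illustration (not the attached csv)
toy <- data.frame(day = c(1, 1, 2, 2), var1 = c(10, 12, 20, 22))
k <- "var1"

# `$` does not evaluate k: it looks for a column literally named "k",
# which does not exist, so the result is NULL
is.null(toy$k)

# `[[` looks up the column whose name is stored in k
mean(toy[[k]])

# is.na(k) checks the character string "var1", not the column values,
# so this keeps every row regardless of missing values in var1
nrow(toy[!is.na(k), ])
```

If that reading is right, mean() and sd() inside the 'by' call would be receiving NULL, which would match the messages above.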

   Could someone please show me how to structure my
code to achieve my final goal, which is to simplify
it?

I am attaching a csv file in case you want to run my
code.

Thank you very much in advance for your time and help,

Judith


