[R-sig-ME] discrete random effect query

Tue Mar 2 18:01:47 CET 2010

Dear all,

I have a piece of equipment which produces several time series of data 
simultaneously, for the sake of an example, two lines with the same 
slope but (randomly) different intercepts

    set.seed(17)
    time<-seq(0,24,length.out=241)
    dat1<-time + 5 + rnorm(length(time), sd=0.4)
    dat2<-time + 6 + rnorm(length(time), sd=0.4)

However, for some physical reason that I don't care about and can't seem 
to cure, a substantial (and variable between experiments) minority of 
readings at particular time points (again, which is variable between 
experiments), in all time series are displaced to vary around a slightly 
different line (say 2 away) apparently at random:

    affected<-sample(c(0,1),replace=TRUE, size=length(time), 
prob=c(0.8,0.2))
    dat1<-affected*2+dat1
    dat2<-affected*2+dat2

This seems to be a random effect of reading number and I'd like to model 
it as such. My issue is that this is a discrete effect, clearly not 
drawn from a normal distribution and I don't know if it's possible to 
fit such a random effect, let alone how to do it (particularly when 
there's also the random effect of the different time series to worry about).

What I'm currently doing amounts to fitting a model ignoring the effect, 
eye-balling and putting a line through the residuals to invent a new 
fixed effect (probablyAffected) and then including that in the model:

datA<-data.frame(rep(time,2),c(dat1,dat2),factor(c(rep("dat1",241),rep("dat2",241))))
    names(datA)<-c("time","dat","series")
    library(nlme)  
    lme1<-lme(fixed= dat~time, random= ~1 | series, data=datA)
    plot(resid(lme1)[1:241],resid(lme1)[242:482])
    abline(1.4,-1)
    probablyAffected<-rep((resid(lme1)[1:241]+resid(lme1)[242:482])>1.4, 2)
    lme2<-lme(fixed= dat~time+probablyAffected, random= ~1 | series, 
data=datA)

This more or less works, though in reality, there are more time series 
and more complex curves than this example, so where to draw the line 
becomes more like guesswork (and/or a clustering problem) and the 
influence of this effect may be more complex than simply adding a value 
to the intercept, which also makes things less clear.

I feel I may be missing something obvious, but does anyone have 
suggestions for a better way to go about this?

Any suggestions much appreciated,

Chris

-- 
------------------------------------------------------------------------
Dr Christopher Knight                             Michael Smith Building
Wellcome Trust RCD Fellow                       Faculty of Life Sciences
Tel:  +44 (0)161 2755378                    The University of Manchester
room B.2012                                                  Oxford Road
tinyurl.com/knightLab/                                Manchester M13 9PT
· . ,,><(((°>                                                         UK