[R] comparing predicted sequence A'(t) to observed sequence A(t)

Sat Feb 12 16:41:22 CET 2005

Hi,

Thanks for your quick response !

By predicted A(t), I meant B(t) + C(t) - D(t). In other words, how well 
does B(t) + C(t) - D(t) approximate A(t) ?

And all the counts are non-negative.

Regards, Suresh
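
A rough sketch of one way to look at how well B(t) + C(t) - D(t) tracks A(t), in R. The data here are simulated and purely hypothetical, as are the names A.pred, v, z and X2. It assumes the four series are independent Poisson counts, in which case the variance of A(t) - A.pred(t) is the sum of the four means, so the variability of the predicted curve is included in the scaling. It does not correct for serial correlation; the acf of the standardized differences only shows how much of it is left.

```r
# Hypothetical example data (not from this thread): Poisson counts per time bin
set.seed(1)
n <- 100
B <- rpois(n, 20)
C <- rpois(n, 15)
D <- rpois(n, 10)
A <- rpois(n, 25)            # generated so that E[A] = E[B] + E[C] - E[D]

A.pred <- B + C - D          # predicted series; may go negative even if counts cannot

# For independent Poisson series, Var(A - A.pred) is the sum of the four means.
# Estimate it by the sum of the observed counts, floored at 1 to avoid dividing by 0.
v <- pmax(A + B + C + D, 1)
z <- (A - A.pred) / sqrt(v)  # standardized differences

X2 <- sum(z^2)               # chi-squared-like discrepancy over the n bins
r  <- acf(z, plot = FALSE)   # serial correlation remaining in z

print(c(X2 = X2, n = n))
print(round(r$acf[2:6], 2))  # autocorrelations at lags 1 to 5
```

If the bins were independent, X2 would be roughly comparable to n; with correlated bins that comparison is only a rough guide.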

Spencer Graves wrote:
>      What do you mean by the following:
>      A(t) = B(t) + C(t) - D(t)?
>      Since you speak of regressing predicted against actual A(t), I 
> gather this is not what you mean.
>      Another question:  Do you have numbers <= 0 for either predicted or 
> actual A(t)?  If yes but only a very few, I might replace the 0's by 0.5 
> and any negatives by 0.25, take their logarithms, then try acf, pacf, 
> ar, arima(..., xreg=A.pred), etc.
>      There are doubtless better methods.  However, if I had to have an 
> answer today, I think I'd try this, then discuss implications and 
> limitations.  If I needed a more sophisticated answer and I had a few 
> weeks or months to work on it, I might develop some way to simulate a 
> process that seemed to describe what I thought generated these numbers 
> and compare simulated results with actual, under a variety of 
> hypotheses, obtaining various kinds of p-values, etc.
>      hope this helps.      spencer graves
> 
> Suresh Krishna wrote:
> 
>>
>> Hi,
>>
>> I have a question that I have not been successful in finding a 
>> definitive answer to; and I was hoping someone here could give me some 
>> pointers to the right place in the literature.
>>
>> A. We have 4 sets of data, A(t), B(t), C(t), and D(t). Each of these 
>> consists of a series of counts obtained in sequential time-intervals: 
>> so  for example, A(t) would be something like:
>>
>> Count A(t):  25,    28,    26,   34   ......
>> Time (ms):  0-10, 10-20, 20-30, 30-40 .......
>>
>> Each count in the series A(t) is obtained by summing the total number 
>> of observed counts over multiple (say 50), independent repetitions of 
>> that time-series. These counts are generally known to be Poisson 
>> distributed, and the 4 processes A(t), B(t), C(t) and D(t) are 
>> independent of each other.
>>
>> B. It appears on visual observation that the following relationship 
>> holds; and such a relationship would also be expected on mechanistic 
>> considerations.
>>
>> A(t) = B(t) + C(t) - D(t)
>>
>> We now want to test this hypothesis statistically.
>>
>> Because successive counts in the sequence are likely to be correlated, 
>> isn't it true that none of these methods are valid ? Perhaps for other 
>> reasons as well ?
>>
>> a) Doing a chi-squared test to see if the predicted curve for A(t) 
>> deviates significantly from the observed A(t); this also seems to not 
>> take the variability of the predicted curve into account.
>>
>> b) Doing a regression of the predicted values of A(t) against the 
>> actual values of A(t) and checking for deviations of slope from 1 and 
>> intercept from 0 ? Here, in addition to lack of independence, the fact 
>> that X-values are not fixed (i.e. are variable) and the fact that X 
>> and Y are Poisson distributed counts should also be taken into 
>> account, right ?
>>
>> I would be very grateful if someone could point me to methods to 
>> handle this kind of situation, or where to look for them. Is there 
>> something in the time-series literature, for instance ?
>>
>> Thanks !!
>>
>> Suresh
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide! 
>> http://www.R-project.org/posting-guide.html
> 
> 
>
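
The quick approach suggested in the reply above (replace 0's by 0.5 and negatives by 0.25, take logarithms, then try acf, pacf and arima with xreg) might look roughly like this in R. This is only a sketch under assumptions: the data are simulated and hypothetical, floor_counts is a made-up helper name, and the AR(1) order is an arbitrary starting point rather than a recommendation.

```r
# Hypothetical example data (not from this thread)
set.seed(1)
n <- 200
rate_b <- 20 + 5 * sin(seq_len(n) / 10)   # slowly varying rate shared by A and B
B <- rpois(n, rate_b)
C <- rpois(n, 15)
D <- rpois(n, 10)
A <- rpois(n, rate_b + 15 - 10)
A.pred <- B + C - D

# Replace 0's by 0.5 and negatives by 0.25 so that logarithms are defined
floor_counts <- function(x) {
  x <- as.numeric(x)
  x[x == 0] <- 0.5
  x[x < 0]  <- 0.25
  x
}

logA <- log(floor_counts(A))
logP <- log(floor_counts(A.pred))

# Autocorrelation structure of the log-scale discrepancy
print(acf(logA - logP, plot = FALSE))
print(pacf(logA - logP, plot = FALSE))

# Regression of log A on log A.pred with AR(1) errors (order chosen arbitrarily)
fit <- arima(logA, order = c(1, 0, 0), xreg = logP)
print(fit)
```

In the fitted model, a coefficient on logP near 1 together with an intercept near 0 would be consistent with the predicted series tracking the observed one on the log scale.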