[R] Variance of multiple non-contiguous time periods?
cjohndavies at gmail.com
Tue Nov 4 14:50:25 CET 2014
On 04/11/14 09:11, Jim Lemon wrote:
> On Mon, 3 Nov 2014 12:45:03 PM CJ Davies wrote:
>> On 30/10/14 21:33, Jim Lemon wrote:
>> If I understand, you mean to calculate deviations for each individual
>> 'chunk' of each transition & then aggregate the results? This is what
>> I'd been thinking about, but is there a sensible manner within R to
>> achieve this, or is it something for which it would be easier to
>> preprocess the data in an external tool? Is there some way to subset
>> data such that I can work over just contiguous 'chunks'?
> Exactly. If there is some combination of existing variables that can be
> combined to make a set of unique values for each "chunk", you can
> calculate the deviations within each "chunk", then average the squared
> deviations for each type of "chunk", weighting by the duration of the
> "chunks" so that you don't bias the pooled variance toward the longer
I am stumped for a way of automating this process, though. Each line of
log data looks like this:
2406 55.4 (-11.2, 1.0, -0.9) (-4.1, 1.0, 0.0) 7.077912 0.9203392 (0.0,
0.7, -0.1, 0.7) 8.129684 89.41537 -8.212769 (0.0, 0.7, -0.1, 0.7)
8.129684 89.41537 351.7872 1 0 0 False 0.15 3 37.76761 True False 0
The last field identifies which transition is currently active.
Separating these data into 'chunks', however, would involve comparing
each line of data with the preceding line to determine whether it
belongs to the same contiguous 'chunk'. Would this be better done with
external preprocessing in a language I am more familiar with? I haven't
the foggiest how I would approach it within R.
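
For reference, the row-against-previous-row comparison can be done in base R
without any external preprocessing. The sketch below is only an illustration
under stated assumptions: the data frame log_df, its columns transition and
value, and the toy numbers are all made up here, standing in for the parsed
log (transition for the last field on each line, value for whichever
measurement the variance is wanted for). It flags every row where the
transition differs from the row before, turns those flags into chunk ids with
cumsum(), and then pools the squared deviations of each chunk around its own
mean. The pooling shown is the ordinary within-group pooled variance, which
is one reading of the weighting discussed above and may need adjusting.

```r
# Toy data standing in for the parsed log (hypothetical column names).
log_df <- data.frame(
  transition = c(0, 0, 0, 1, 1, 0, 0, 2, 2, 2, 1, 1),
  value      = c(1, 2, 3, 10, 12, 4, 6, 20, 21, 22, 14, 16)
)

# A new chunk starts wherever the transition differs from the previous row.
# The first row always starts a chunk.
changed <- c(TRUE, log_df$transition[-1] != log_df$transition[-nrow(log_df)])

# Running count of chunk starts = one integer id per contiguous run.
log_df$chunk <- cumsum(changed)

# Per-chunk summary: transition type, row count, and sum of squared
# deviations from the chunk's own mean.
chunks <- data.frame(
  chunk      = unique(log_df$chunk),
  transition = as.vector(tapply(log_df$transition, log_df$chunk,
                                function(x) x[1])),
  n          = as.vector(tapply(log_df$value, log_df$chunk, length)),
  ss         = as.vector(tapply(log_df$value, log_df$chunk,
                                function(x) sum((x - mean(x))^2)))
)

# Pooled within-chunk variance per transition type:
# total squared deviations / (total rows - number of chunks).
# Note: this is undefined (division by zero) if a transition type
# has only single-row chunks.
pooled_var <- sapply(split(chunks, chunks$transition),
                     function(d) sum(d$ss) / (sum(d$n) - nrow(d)))

print(chunks)
print(pooled_var)
```

rle(log_df$transition) gives the same run boundaries as lengths and values,
if the run lengths themselves are more convenient than per-row ids.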